TY - CONF T1 - Ad hoc Cloud Computing T2 - IEEE Cloud Y1 - 2015 A1 - Gary McGilvary A1 - Barker, Adam A1 - Malcolm Atkinson KW - ad hoc KW - cloud computing KW - reliability KW - virtualization KW - volunteer computing AB - This paper presents the first complete, integrated and end-to-end solution for ad hoc cloud computing environments. Ad hoc clouds harvest resources from existing sporadically available, non-exclusive (i.e. primarily used for some other purpose) and unreliable infrastructures. In this paper we discuss the problems ad hoc cloud computing solves and outline our architecture which is based on BOINC. JF - IEEE Cloud UR - http://arxiv.org/abs/1505.08097 ER - TY - CHAP T1 - Evolutionary Computation and Constraint Satisfaction Y1 - 2015 A1 - van Hemert, J. ED - Kacprzyk, J. ED - Pedrycz, W. KW - constraint satisfaction KW - evolutionary computation AB - In this chapter we will focus on the combination of evolutionary computation techniques and constraint satisfaction problems. Constraint Programming (CP) is another approach to deal with constraint satisfaction problems. In fact, it is an important prelude to the work covered here as it advocates itself as an alternative approach to programming (Apt). The first step is to formulate a problem as a CSP such that techniques from CP, EC, combinations of the two (cf. Hybrid) or other approaches can be deployed to solve the problem. The formulation of a problem has an impact on its complexity in terms of effort required to either find a solution or prove no solution exists. It is therefore vital to spend time on getting this right. The main differences between CP and EC are as follows. CP defines search as iterative steps over a search tree where nodes are partial solutions to the problem where not all variables are assigned values. The search then maintains a partial solution that satisfies all constraints on the variables that have been assigned values. 
Instead, in EC most often solvers sample a space of candidate solutions where variables are all assigned values. None of these candidate solutions will satisfy all constraints in the problem until a solution is found. Another major difference is that many constraint solvers from CP are sound whereas EC solvers are not. A solver is sound if it always finds a solution if it exists. PB - Springer ER - TY - BOOK T1 - Ad hoc Cloud Computing (PhD Thesis) Y1 - 2014 A1 - Gary McGilvary AB - Commercial and private cloud providers offer virtualized resources via a set of co-located and dedicated hosts that are exclusively reserved for the purpose of offering a cloud service. While both cloud models appeal to the mass market, there are many cases where outsourcing to a remote platform or procuring an in-house infrastructure may not be ideal or even possible. To offer an attractive alternative, we introduce and develop an ad hoc cloud computing platform to transform spare resource capacity from an infrastructure owner's locally available, but non-exclusive and unreliable infrastructure, into an overlay cloud platform. The foundation of the ad hoc cloud relies on transferring and instantiating lightweight virtual machines on-demand upon near-optimal hosts while virtual machine checkpoints are distributed in a P2P fashion to other members of the ad hoc cloud. Virtual machines found to be non-operational are restored elsewhere ensuring the continuity of cloud jobs. In this thesis we investigate the feasibility, reliability and performance of ad hoc cloud computing infrastructures. We first show that the combination of both volunteer computing and virtualization is the backbone of the ad hoc cloud. We outline the process of virtualizing the volunteer system BOINC to create V-BOINC. 
V-BOINC distributes virtual machines to volunteer hosts, allowing volunteer applications to be executed in a sandbox environment and so addressing many of the shortcomings of BOINC; this also provides the basis for an ad hoc cloud computing platform to be developed. We detail the challenges of transforming V-BOINC into an ad hoc cloud and outline the transformational process and integrated extensions. These include a BOINC job submission system, cloud job and virtual machine restoration schedulers and a periodic P2P checkpoint distribution component. Furthermore, as current monitoring tools are unable to cope with the dynamic nature of ad hoc clouds, a dynamic infrastructure monitoring and management tool called the Cloudlet Control Monitoring System is developed and presented. We evaluate each of our individual contributions as well as the reliability, performance and overheads associated with an ad hoc cloud deployed on a realistically simulated unreliable infrastructure. We conclude that the ad hoc cloud is not only a feasible concept but also a viable computational alternative that offers high levels of reliability and can at least offer reasonable performance, which at times may exceed the performance of a commercial cloud infrastructure. PB - The University of Edinburgh CY - Edinburgh ER - TY - CONF T1 - Applying selectively parallel IO compression to parallel storage systems T2 - Euro-Par Y1 - 2014 A1 - Rosa Filgueira A1 - Malcolm Atkinson A1 - Yusuke Tanimura A1 - Isao Kojima JF - Euro-Par ER - TY - CONF T1 - FAST: Flexible Automated Synchronization Transfer tool T2 - Proceedings of the Sixth International Workshop on Data-Intensive Distributed Computing Date Y1 - 2014 A1 - Rosa Filgueira A1 - Iraklis Klampanos A1 - Yusuke Tanimura A1 - Malcolm Atkinson 
JF - Proceedings of the Sixth International Workshop on Data-Intensive Distributed Computing Date PB - ACM CY - New York, NY, USA ER - TY - JOUR T1 - Precise montaging and metric quantification of retinal surface area from ultra-widefield fundus photography and fluorescein angiography JF - Ophthalmic Surg Lasers Imaging Retina Y1 - 2014 A1 - Croft, D.E. A1 - van Hemert, J. A1 - Wykoff, C.C. A1 - Clifton, D. A1 - Verhoek, M. A1 - Fleming, A. A1 - Brown, D.M. KW - medical KW - retinal imaging AB - BACKGROUND AND OBJECTIVE: Accurate quantification of retinal surface area from ultra-widefield (UWF) images is challenging due to warping produced when the retina is projected onto a two-dimensional plane for analysis. By accounting for this, the authors sought to precisely montage and accurately quantify retinal surface area in square millimeters. PATIENTS AND METHODS: Montages were created using Optos 200Tx (Optos, Dunfermline, U.K.) images taken at different gaze angles. A transformation projected the images to their correct location on a three-dimensional model. Area was quantified with spherical trigonometry. Warping, precision, and accuracy were assessed. RESULTS: Uncorrected, posterior pixels represented up to 79% greater surface area than peripheral pixels. Assessing precision, a standard region was quantified across 10 montages of the same eye (RSD: 0.7%; mean: 408.97 mm(2); range: 405.34-413.87 mm(2)). Assessing accuracy, 50 patients' disc areas were quantified (mean: 2.21 mm(2); SE: 0.06 mm(2)), and the results fell within the normative range. CONCLUSION: By accounting for warping inherent in UWF images, precise montaging and accurate quantification of retinal surface area in square millimeters were achieved. [Ophthalmic Surg Lasers Imaging Retina. 2014;45:312-317.]. VL - 45 ER - TY - JOUR T1 - Quantification of Ultra-Widefield Retinal Images JF - Retina Today Y1 - 2014 A1 - D.E. Croft A1 - C.C. Wykoff A1 - D.M. Brown A1 - van Hemert, J. A1 - M. 
Verhoek KW - medical KW - retinal imaging AB - Advances in imaging periodically lead to dramatic changes in the diagnosis, management, and study of retinal disease. For example, the innovation and widespread application of fluorescein angiography and optical coherence tomography (OCT) have had tremendous impact on the management of retinal disorders [1,2]. Recently, ultra-widefield (UWF) imaging has opened a new window into the retina, allowing the capture of greater than 80% of the fundus with a single shot [3]. With montaging, much of the remaining retinal surface area can be captured [4,5]. However, to maximize the potential of these new modalities, accurate quantification of the pathology they capture is critical. UR - http://www.bmctoday.net/retinatoday/pdfs/0514RT_imaging_Croft.pdf ER - TY - Generic T1 - Varpy: A python library for volcanology and rock physics data analysis. EGU2014-3699 Y1 - 2014 A1 - Rosa Filgueira A1 - Malcolm Atkinson A1 - Andrew Bell A1 - Branwen Snelling ER - TY - JOUR T1 - Automatic extraction of retinal features from colour retinal images for glaucoma diagnosis: A review JF - Computerized Medical Imaging and Graphics Y1 - 2013 A1 - Haleem, M.S. A1 - Han, L. A1 - van Hemert, J. A1 - Li, B. KW - retinal imaging AB - Glaucoma is a group of eye diseases that have common traits such as high eye pressure, damage to the Optic Nerve Head and gradual vision loss. It affects peripheral vision and eventually leads to blindness if left untreated. The current common methods of pre-diagnosis of Glaucoma include measurement of Intra-Ocular Pressure (IOP) using a Tonometer, Pachymetry and Gonioscopy, which are performed manually by clinicians. These tests are usually followed by Optic Nerve Head (ONH) Appearance examination for the confirmed diagnosis of Glaucoma. The diagnoses require regular monitoring, which is costly and time consuming. The accuracy and reliability of diagnosis is limited by the domain knowledge of different ophthalmologists. 
Therefore, automatic diagnosis of Glaucoma attracts a lot of attention. This paper surveys the state of the art in automatic extraction of anatomical features from retinal images to assist early diagnosis of Glaucoma. We have conducted a critical evaluation of the existing automatic extraction methods based on features including Optic Cup to Disc Ratio (CDR), Retinal Nerve Fibre Layer (RNFL), Peripapillary Atrophy (PPA), Neuroretinal Rim Notching, Vasculature Shift, etc., which adds value to efficient feature extraction for Glaucoma diagnosis. VL - 37 SN - 0895-6111 UR - http://linkinghub.elsevier.com/retrieve/pii/S0895611113001468?showall=true ER - TY - CONF T1 - Automatic Extraction of the Optic Disc Boundary for Detecting Retinal Diseases T2 - 14th IASTED International Conference on Computer Graphics and Imaging (CGIM) Y1 - 2013 A1 - M.S. Haleem A1 - L. Han A1 - B. Li A1 - A. Nisbet A1 - van Hemert, J. A1 - M. Verhoek ED - L. Linsen ED - M. Kampel KW - retinal imaging AB - In this paper, we propose an algorithm based on an active shape model for the extraction of the Optic Disc boundary. The determination of the Optic Disc boundary is fundamental to the automation of retinal eye disease diagnosis because the Optic Disc Center is typically used as a reference point to locate other retinal structures, and any structural change in the Optic Disc, whether textural or geometrical, can be used to determine the occurrence of retinal diseases such as Glaucoma. The algorithm is based on determining a model for the Optic Disc boundary by learning patterns of variability from a training set of annotated Optic Discs. The model can be deformed so as to reflect the boundary of the Optic Disc in any feasible shape. The algorithm provides some initial steps towards automation of the diagnostic process for retinal eye disease in order that more patients can be screened with consistent diagnoses. The overall accuracy of the algorithm was 92% on a set of 110 images. 
JF - 14th IASTED International Conference on Computer Graphics and Imaging (CGIM) PB - ACTA Press ER - TY - CONF T1 - C2MS: Dynamic Monitoring and Management of Cloud Infrastructures T2 - IEEE CloudCom Y1 - 2013 A1 - Gary McGilvary A1 - Josep Rius A1 - Íñigo Goiri A1 - Francesc Solsona A1 - Barker, Adam A1 - Atkinson, Malcolm P. AB - Server clustering is a common design principle employed by many organisations that require high availability, scalability and easier management of their infrastructure. Servers are typically clustered according to the service they provide, whether it be the application(s) installed, the role of the server or server accessibility, for example. In order to optimize performance, manage load and maintain availability, servers may migrate from one cluster group to another, making it difficult for server monitoring tools to continuously monitor these dynamically changing groups. Server monitoring tools are usually statically configured, and any change of group membership requires manual reconfiguration; an unreasonable task to undertake on large-scale cloud infrastructures. In this paper we present the Cloudlet Control and Management System (C2MS); a system for monitoring and controlling dynamic groups of physical or virtual servers within cloud infrastructures. The C2MS extends Ganglia - an open source scalable system performance monitoring tool - by allowing system administrators to define, monitor and modify server groups without the need for server reconfiguration. In turn administrators can easily monitor group and individual server metrics on large-scale dynamic cloud infrastructures where roles of servers may change frequently. Furthermore, we complement group monitoring with a control element allowing administrator-specified actions to be performed over servers within service groups as well as introduce further customized monitoring metrics. This paper outlines the design, implementation and evaluation of the C2MS. 
JF - IEEE CloudCom CY - Bristol, UK ER - TY - JOUR T1 - The Cloud Paradigm Applied to e-Health JF - BMC Med. Inf. & Decision Making Y1 - 2013 A1 - Jordi Vilaplana A1 - Francesc Solsona A1 - Francesc Abella A1 - Rosa Filgueira A1 - Josep Rius Torrento VL - 13 ER - TY - BOOK T1 - The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business T2 - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) Y1 - 2013 A1 - Atkinson, Malcolm P. A1 - Baxter, Robert M. A1 - Peter Brezany A1 - Oscar Corcho A1 - Michelle Galea A1 - Parsons, Mark A1 - Snelling, David A1 - van Hemert, Jano KW - Big Data KW - Data Intensive KW - data mining KW - Data Streaming KW - Databases KW - Dispel KW - Distributed Computing KW - Knowledge Discovery KW - Workflows AB - With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasising data-intensive thinking and interdisciplinary collaboration, The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. 
Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book: * Outlines the concepts and rationale for implementing data-intensive computing in organisations * Covers from the ground up problem-solving strategies for data analysis in a data-rich world * Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL * Features in-depth case studies in customer relations, environmental hazards, seismology, and more * Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering * Includes sample program snippets throughout the text as well as additional materials on a companion website The DATA Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing. JF - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) PB - John Wiley & Sons Inc. SN - 978-1-118-39864-7 ER - TY - CHAP T1 - Data-Intensive Analysis T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho A1 - van Hemert, Jano ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - data mining KW - Data-Analysis Experts KW - Data-Intensive Analysis KW - Knowledge Discovery AB - Part II: "Data-intensive Knowledge Discovery", focuses on the needs of data-analysis experts. It illustrates the problem-solving strategies appropriate for a data-rich world, without delving into the details of underlying technologies. 
It should engage and inform data-analysis specialists, such as statisticians, data miners, image analysts, bio-informaticians or chemo-informaticians, and generate ideas pertinent to their application areas. Chapter 5: "Data-intensive Analysis", introduces a set of common problems that data-analysis experts often encounter, by means of a set of scenarios of increasing levels of complexity. The scenarios typify knowledge discovery challenges and the presented solutions provide practical methods; a starting point for readers addressing their own data challenges. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Data-Intensive Components and Usage Patterns T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Analysis KW - data mining KW - Data-Intensive Components KW - Registry KW - Workflow Libraries KW - Workflow Sharing AB - Chapter 7: "Data-intensive components and usage patterns", provides a systematic review of the components that are commonly used in knowledge discovery tasks as well as common patterns of component composition. That is, it introduces the processing elements from which knowledge discovery solutions are built and common composition patterns for delivering trustworthy information. It reflects on how these components and patterns are evolving in a data-intensive context. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - The Data-Intensive Survival Guide T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Analysis Experts KW - Data-Intensive Architecture KW - Data-intensive Computing KW - Data-Intensive Engineers KW - Datascopes KW - Dispel KW - Domain Experts KW - Intellectual Ramps KW - Knowledge Discovery KW - Workflows AB - Chapter 3: "The data-intensive survival guide", presents an overview of all of the elements of the proposed data-intensive strategy. Sufficient detail is presented for readers to understand the principles and practice that we recommend. It should also provide a good preparation for readers who choose to sample later chapters. It introduces three professional viewpoints: domain experts, data-analysis experts, and data-intensive engineers. Success depends on a balanced approach that develops the capacity of all three groups. A data-intensive architecture provides a flexible framework for that balanced approach. This enables the three groups to build and exploit data-intensive processes that incrementally step from data to results. A language is introduced to describe these incremental data processes from all three points of view. The chapter introduces ‘datascopes’ as the productized data-handling environments and ‘intellectual ramps’ as the ‘on ramps’ for the highways from data to knowledge. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - Data-Intensive Thinking with DISPEL T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Machines KW - Data-Intensive Thinking, Data-intensive Computing KW - Dispel KW - Distributed Computing KW - Knowledge Discovery AB - Chapter 4: "Data-intensive thinking with DISPEL", engages the reader with technical issues and solutions, by working through a sequence of examples, building up from a sketch of a solution to a large-scale data challenge. It uses the DISPEL language extensively, introducing its concepts and constructs. It shows how DISPEL may help designers, data-analysts, and engineers develop solutions to the requirements emerging in any data-intensive application domain. The reader is taken through simple steps initially; these then build to the conceptually complex steps that are necessary to cope with the realities of real data providers, real data, real distributed systems, and long-running processes. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc. ER - TY - CHAP T1 - Definition of the DISPEL Language T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Paul Martin A1 - Yaikhom, Gagarine ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Streaming KW - Data-intensive Computing KW - Dispel AB - Chapter 10: "Definition of the DISPEL language", describes the novel aspects of the DISPEL language: its constructs, capabilities, and anticipated programming style. 
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business T3 - Parallel and Distributed Computing, series editor Albert Y. Zomaya PB - John Wiley & Sons Inc. ER - TY - CONF T1 - The demand for consistent web-based workflow editors T2 - Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science Y1 - 2013 A1 - Gesing, Sandra A1 - Atkinson, Malcolm A1 - Klampanos, Iraklis A1 - Galea, Michelle A1 - Berthold, Michael R. A1 - Barbera, Roberto A1 - Scardaci, Diego A1 - Terstyanszky, Gabor A1 - Kiss, Tamas A1 - Kacsuk, Peter KW - web-based workflow editors KW - workflow composition KW - workflow interoperability KW - workflow languages and concepts JF - Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science PB - ACM CY - New York, NY, USA SN - 978-1-4503-2502-8 UR - http://doi.acm.org/10.1145/2534248.2534260 ER - TY - CHAP T1 - The Digital-Data Challenge T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson A1 - Parsons, Mark ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data KW - Data-intensive Computing, Knowledge Discovery KW - Digital Data KW - Digital-Data Revolution AB - Part I: Strategies for success in the digital-data revolution, provides an executive summary of the whole book to convince strategists, politicians, managers, and educators that our future data-intensive society requires new thinking, new behavior, new culture, and new distribution of investment and effort. This part will introduce the major concepts so that readers are equipped to discuss and steer their organization’s response to the opportunities and obligations brought by the growing wealth of data. 
It will help readers understand the changing context brought about by advances in digital devices, digital communication, and ubiquitous computing. Chapter 1: The digital-data challenge, will help readers to understand the challenges ahead in making good use of the data and introduce ideas that will lead to helpful strategies. A global digital-data revolution is catalyzing change in the ways in which we live, work, relax, govern, and organize. This is a significant change in society, as important as the invention of printing or the industrial revolution, but more challenging because it is happening globally at Internet speed. Becoming agile in adapting to this new world is essential. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - The Digital-Data Revolution T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data KW - Information KW - Knowledge KW - Knowledge Discovery KW - Social Impact of Digital Data KW - Wisdom, Data-intensive Computing AB - Chapter 2: "The digital-data revolution", reviews the relationships between data, information, knowledge, and wisdom. It analyses and quantifies the changes in technology and society that are delivering the data bonanza, and then reviews the consequential changes via representative examples in biology, Earth sciences, social sciences, leisure activity, and business. It exposes quantitative details and shows the complexity and diversity of the growing wealth of data, introducing some of its potential benefits and examples of the impediments to successfully realizing those benefits. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - DISPEL Development T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Adrian Mouat A1 - Snelling, David ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Diagnostics KW - Dispel KW - IDE KW - Libraries KW - Processing Elements AB - Chapter 11: "DISPEL development", describes the tools and libraries that a DISPEL developer might expect to use. The tools include those needed during process definition, those required to organize enactment, and diagnostic aids for developers of applications and platforms. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc. ER - TY - CHAP T1 - DISPEL Enactment T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Chee Sun Liew A1 - Krause, Amrey A1 - Snelling, David ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Streaming KW - Data-Intensive Engineering KW - Dispel KW - Workflow Enactment AB - Chapter 12: "DISPEL enactment", describes the four stages of DISPEL enactment. It is targeted at the data-intensive engineers who implement enactment services. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc. ER - TY - JOUR T1 - Embedded systems for global e-Social Science: Moving computation rather than data JF - Future Generation Computer Systems Y1 - 2013 A1 - Ashley D. Lloyd A1 - Terence M. Sloan A1 - Antonioletti, Mario A1 - Gary McGilvary AB - There is a wealth of digital data currently being gathered by commercial and private concerns that could supplement academic research. 
To unlock this data it is important to gain the trust of the companies that hold the data as well as showing them how they may benefit from this research. Part of this trust is gained through established reputation and the other through the technology used to safeguard the data. This paper discusses how different technology frameworks have been applied to safeguard the data and facilitate collaborative work between commercial concerns and academic institutions. The paper focuses on the distinctive requirements of e-Social Science: access to large-scale data on behaviour in society in environments that impose confidentiality constraints on access. These constraints arise from both privacy concerns and the commercial sensitivities of that data. In particular, the paper draws on the experiences of building an intercontinental Grid–INWA–from its first operation connecting Australia and Scotland to its subsequent extension to China across the Trans-Eurasia Information Network–the first large-scale research and education network for the Asia-Pacific region. This allowed commercial data to be analysed by experts that were geographically distributed across the globe. It also provided an entry point for a major Chinese commercial organization to approve use of a Grid solution in a new collaboration provided the centre of gravity of the data is retained within the jurisdiction of the data owner. We describe why, despite this approval, an embedded solution was eventually adopted. We find that ‘data sovereignty’ dominates any decision on whether and how to participate in e-Social Science collaborations and how this might impact on a Cloud based solution to this type of collaboration. 
VL - 29 UR - http://www.sciencedirect.com/science/article/pii/S0167739X12002336 IS - 5 ER - TY - JOUR T1 - Exploiting Parallel R in the Cloud with SPRINT JF - Methods of Information in Medicine Y1 - 2013 A1 - Piotrowski, Michal A1 - Gary McGilvary A1 - Sloan, Terence A1 - Mewissen, Muriel A1 - Ashley Lloyd A1 - Forster, Thorsten A1 - Mitchell, Lawrence A1 - Ghazal, Peter A1 - Hill, Jon AB - Background: Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives: Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and whether resource underutilization can improve application performance. Methods: The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various sizes on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results: It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. The end-user’s location affects costs due to factors such as local taxation. 
Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provide an interesting alternative and open new possibilities for smaller organisations with limited funds. VL - 52 IS - 1 ER - TY - CHAP T1 - Foreword T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Tony Hey ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data KW - Data-intensive Computing, Knowledge Discovery JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - RPRT T1 - The Implementation of OpenStack Cinder and Integration with NetApp and Ceph Y1 - 2013 A1 - Gary McGilvary A1 - Thomas Oulevey AB - With the ever increasing amount of data produced from Large Hadron Collider (LHC) experiments, new ways are sought to help analyze and store this data as well as help researchers perform their own experiments. To help offer solutions to such problems, CERN has employed the use of cloud computing and in particular OpenStack; an open source and scalable platform for building public and private clouds. The OpenStack project contains many components such as Cinder, used to create block storage that can be attached to virtual machines and in turn help increase performance. However, instead of creating volumes locally with OpenStack, other remote storage clusters exist offering block-based storage with features not present in the current OpenStack implementation; two popular solutions are NetApp and Ceph. Two features Ceph offers are the ability to stripe data stored within volumes over the distributed cluster and the ability to locally cache this data, both with the aim of improving performance. 
When in use with OpenStack, Ceph performs default data striping where the number and size of stripes are fixed and cannot be changed depending on the volume to be created. Similarly, Ceph does not perform data caching when integrated with OpenStack. In this project we outline and document the integration of NetApp and Ceph with OpenStack as well as benchmark the performance of the NetApp and Ceph clusters already present at CERN. To allow Ceph data striping, we modify OpenStack to take the number and size of stripes supplied by the user to create volumes whose data is then striped according to the values they specify. Similarly, we also modify OpenStack to enable Ceph caching and allow users to select the caching policy they require per-volume. In this report, we describe how these features are implemented. JF - CERN Openlab PB - CERN ER - TY - JOUR T1 - Lesion Area Detection Using Source Image Correlation Coefficient for CT Perfusion Imaging JF - IEEE Journal of Biomedical and Health Informatics Y1 - 2013 A1 - Fan Zhu A1 - Rodríguez, David A1 - Carpenter, Trevor K. A1 - Atkinson, Malcolm P. A1 - Wardlaw, Joanna M. KW - CT, Pattern Recognition, Perfusion Source Images, Segmentation AB - Computer tomography (CT) perfusion imaging is widely used to calculate brain hemodynamic quantities such as Cerebral Blood Flow (CBF), Cerebral Blood Volume (CBV) and Mean Transit Time (MTT) that aid the diagnosis of acute stroke. Since perfusion source images contain more information than hemodynamic maps, good utilisation of the source images can lead to better understanding than the hemodynamic maps alone. Correlation-coefficient tests are used in our approach to measure the similarity between healthy tissue time-concentration curves and unknown curves. This information is then used to differentiate penumbra and dead tissues from healthy tissues. The goal of the segmentation is to fully utilize information in the perfusion source images. 
Our method directly identifies suspected abnormal areas from perfusion source images and then delivers a suggested segmentation of healthy, penumbra and dead tissue. This approach is designed to handle CT perfusion images, but it can also be used to detect lesion areas in MR perfusion images. VL - 17 IS - 5 ER - TY - CONF T1 - MPI collective I/O based on advanced reservations to obtain performance guarantees from shared storage systems T2 - CLUSTER Y1 - 2013 A1 - Yusuke Tanimura A1 - Rosa Filgueira A1 - Isao Kojima A1 - Malcolm P. Atkinson JF - CLUSTER ER - TY - CHAP T1 - Platforms for Data-Intensive Analysis T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Snelling, David ED - Malcolm Atkinson ED - Baxter, Robert M. ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Engineering KW - Data-Intensive Systems KW - Dispel KW - Distributed Systems AB - Part III: "Data-intensive engineering", is targeted at technical experts who will develop complex applications, new components, or data-intensive platforms. The techniques introduced may be applied very widely; for example, to any data-intensive distributed application, such as index generation, image processing, sequence comparison, text analysis, and sensor-stream monitoring. The challenges, methods, and implementation requirements are illustrated by making extensive use of DISPEL. Chapter 9: "Platforms for data-intensive analysis", gives a reprise of data-intensive architectures, examines the business case for investing in them, and introduces the stages of data-intensive workflow enactment. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - Preface T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data, Data-intensive Computing, Knowledge Discovery AB - Who should read the book and why. The structure and conventions used. Suggested reading paths for different categories of reader. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Problem Solving in Data-Intensive Knowledge Discovery T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho A1 - van Hemert, Jano ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Analysis Experts KW - Data-Intensive Analysis KW - Design Patterns for Knowledge Discovery KW - Knowledge Discovery AB - Chapter 6: "Problem solving in data-intensive knowledge discovery", on the basis of the previous scenarios, this chapter provides an overview of effective strategies in knowledge discovery, highlighting common problem-solving methods that apply in conventional contexts, and focusing on the similarities and differences of these methods. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CONF T1 - Provenance for seismological processing pipelines in a distributed streaming workflow T2 - EDBT/ICDT Workshops Y1 - 2013 A1 - Alessandro Spinuso A1 - James Cheney A1 - Malcolm Atkinson JF - EDBT/ICDT Workshops ER - TY - CONF T1 - SANComSim: A Scalable, Adaptive and Non-intrusive Framework to Optimize Performance in Computational Science Applications T2 - ICCS Y1 - 2013 A1 - Alberto Nuñez A1 - Rosa Filgueira A1 - Mercedes G. Merayo JF - ICCS ER - TY - CHAP T1 - Sharing and Reuse in Knowledge Discovery T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Analysis KW - Knowledge Discovery KW - Ontologies KW - Semantic Web KW - Sharing AB - Chapter 8: "Sharing and re-use in knowledge discovery", introduces more advanced knowledge discovery problems, and shows how improved component and pattern descriptions facilitate re-use. This supports the assembly of libraries of high level components well-adapted to classes of knowledge discovery methods or application domains. The descriptions are made more powerful by introducing notations from the semantic Web. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CONF T1 - Towards Addressing CPU-Intensive Seismological Applications in Europe T2 - International Supercomputing Conference Y1 - 2013 A1 - Michele Carpené A1 - I.A. 
Klampanos A1 - Siew Hoon Leong A1 - Emanuele Casarotti A1 - Peter Danecek A1 - Graziella Ferini A1 - Andre Gemünd A1 - Amrey Krause A1 - Lion Krischer A1 - Federica Magnoni A1 - Marek Simon A1 - Alessandro Spinuso A1 - Luca Trani A1 - Malcolm Atkinson A1 - Giovanni Erbacci A1 - Anton Frank A1 - Heiner Igel A1 - Andreas Rietbrock A1 - Horst Schwichtenberg A1 - Jean-Pierre Vilotte AB - Advanced application environments for seismic analysis help geoscientists to execute complex simulations to predict the behaviour of a geophysical system and potential surface observations. At the same time data collected from seismic stations must be processed comparing recorded signals with predictions. The EU-funded project VERCE (http://verce.eu/) aims to enable specific seismological use-cases and, on the basis of requirements elicited from the seismology community, provide a service-oriented infrastructure to deal with such challenges. In this paper we present VERCE’s architecture, in particular relating to forward and inverse modelling of Earth models and how the, largely file-based, HPC model can be combined with data streaming operations to enhance the scalability of experiments. We posit that the integration of services and HPC resources in an open, collaborative environment is an essential medium for the advancement of sciences of critical importance, such as seismology. JF - International Supercomputing Conference CY - Leipzig, Germany ER - TY - CONF T1 - Towards automatic detection of abnormal retinal capillaries in ultra-widefield-of-view retinal angiographic exams T2 - Conf Proc IEEE Eng Med Biol Soc Y1 - 2013 A1 - Zutis, K. A1 - Trucco, E. A1 - Hubschman, J. P. A1 - Reed, D. A1 - Shah, S. A1 - van Hemert, J. KW - retinal imaging AB - Retinal capillary abnormalities include small, leaky, severely tortuous blood vessels that are associated with a variety of retinal pathologies. 
We present a prototype image-processing system for detecting abnormal retinal capillary regions in ultra-widefield-of-view (UWFOV) fluorescein angiography exams of the human retina. The algorithm takes as input an UWFOV FA frame and returns the candidate regions identified. An SVM classifier is trained on regions traced by expert ophthalmologists. Tests with a variety of feature sets indicate that edge features and allied properties differentiate best between normal and abnormal retinal capillary regions. Experiments with an initial set of images from patients showing branch retinal vein occlusion (BRVO) indicate promising area under the ROC curve of 0.950 and a weighted Cohen's Kappa value of 0.822. JF - Conf Proc IEEE Eng Med Biol Soc ER - TY - CONF T1 - User-friendly workflows in quantum chemistry T2 - IWSG 2013 Y1 - 2013 A1 - Herres-Pawlis, Sonja A1 - Balaskó, Ákos A1 - Birkenheuer, Georg A1 - Brinkmann, André A1 - Gesing, Sandra A1 - Grunzke, Richard A1 - Hoffmann, Alexander A1 - Kacsuk, Peter A1 - Krüger, Jens A1 - Packschies, Lars A1 - Terstyansky, Gabor A1 - Weingarten, Noam JF - IWSG 2013 PB - CEUR Workshop Proceedings CY - Zurich, Switzerland UR - http://ceur-ws.org/Vol-993/paper14.pdf ER - TY - CONF T1 - V-BOINC: The Virtualization of BOINC T2 - In Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2013). Y1 - 2013 A1 - Gary McGilvary A1 - Barker, Adam A1 - Ashley Lloyd A1 - Malcolm Atkinson AB - The Berkeley Open Infrastructure for Network Computing (BOINC) is an open source client-server middleware system created to allow projects with large computational requirements, usually set in the scientific domain, to utilize a technically unlimited number of volunteer machines distributed over large physical distances. 
However, various problems exist deploying applications over these heterogeneous machines using BOINC: applications must be ported to each machine architecture type, the project server must be trusted to supply authentic applications, applications that do not regularly checkpoint may lose execution progress upon volunteer machine termination and applications that have dependencies may find it difficult to run under BOINC. To solve such problems we introduce virtual BOINC, or V-BOINC, where virtual machines are used to run computations on volunteer machines. Application developers can then compile their applications on a single architecture, checkpointing issues are solved through virtualization APIs and many security concerns are addressed via the virtual machine's sandbox environment. In this paper we focus on outlining a unique approach on how virtualization can be introduced into BOINC and demonstrate that V-BOINC offers acceptable computational performance when compared to regular BOINC. Finally we show that applications with dependencies can easily run under V-BOINC in turn increasing the computational potential volunteer computing offers to the general public and project developers. V-BOINC can be downloaded at http://garymcgilvary.co.uk/vboinc.html JF - In Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2013). 
CY - Delft, The Netherlands ER - TY - CONF T1 - The W3C PROV family of specifications for modelling provenance metadata T2 - EDBT Y1 - 2013 A1 - Paolo Missier A1 - Khalid Belhajjame A1 - James Cheney JF - EDBT ER - TY - BOOK T1 - Web-based Science Gateways for Structural Bioinformatics Y1 - 2013 A1 - Gesing, Sandra PB - University of Tübingen UR - http://nbn-resolving.de/urn:nbn:de:bsz:21-opus-67822 ER - TY - CONF T1 - Abstract: Reservation-Based I/O Performance Guarantee for MPI-IO Applications Using Shared Storage Systems T2 - SC Companion Y1 - 2012 A1 - Yusuke Tanimura A1 - Rosa Filgueira A1 - Isao Kojima A1 - Malcolm P. Atkinson JF - SC Companion ER - TY - CONF T1 - An adaptive, scalable, and portable technique for speeding up MPI-based applications T2 - International European Conference on Parallel and Distributed Computing, Europar-2012 Y1 - 2012 A1 - Rosa Filgueira A1 - Alberto Nuñez A1 - Javier Fernandez A1 - Malcolm Atkinson JF - International European Conference on Parallel and Distributed Computing, Europar-2012 ER - TY - JOUR T1 - Computed Tomography Perfusion Imaging Denoising Using Gaussian Process Regression JF - Physics in Medicine and Biology Y1 - 2012 A1 - Fan Zhu A1 - Carpenter, Trevor A1 - Rodríguez, David A1 - Malcolm Atkinson A1 - Wardlaw, Joanna AB - Objective: Brain perfusion weighted images acquired using dynamic contrast studies have an important clinical role in acute stroke diagnosis and treatment decisions. However, Computed Tomography (CT) images suffer from low contrast-to-noise ratios (CNR) as a consequence of the limitation of the exposure to radiation of the patient. As a consequence, the development of methods for improving the CNR is valuable. Methods: The majority of existing approaches for denoising CT images are optimized for 3D (spatial) information, including spatial decimation (spatially weighted mean filters) and techniques based on wavelet and curvelet transforms. 
However, perfusion imaging data is 4D as it also contains temporal information. Our approach uses Gaussian process regression (GPR), which takes advantage of the temporal information, to reduce the noise level. Results: Over the entire image, GPR gains a 99% CNR improvement over the raw images and also improves the quality of haemodynamic maps allowing a better identification of edges and detailed information. At the level of individual voxels, GPR provides a stable baseline, helps us to identify key parameters from tissue time-concentration curves and reduces the oscillations in the curve. Conclusion: GPR is superior to the comparable techniques used in this study. ER - TY - JOUR T1 - Consistency and repair for XML write-access control policies JF - VLDB J. Y1 - 2012 A1 - Loreto Bravo A1 - James Cheney A1 - Irini Fundulaki A1 - Ricardo Segovia VL - 21 ER - TY - CONF T1 - A Core Calculus for Provenance T2 - POST Y1 - 2012 A1 - Umut A. Acar A1 - Amal Ahmed A1 - James Cheney A1 - Roly Perera JF - POST ER - TY - CONF T1 - A Data Driven Science Gateway for Computational Workflows T2 - UNICORE Summit 2012 Y1 - 2012 A1 - Grunzke, Richard A1 - Birkenheuer, G. A1 - Blunk, D. A1 - Breuers, S. A1 - Brinkmann, A. A1 - Gesing, Sandra A1 - Herres-Pawlis, S A1 - Kohlbacher, O. A1 - Krüger, J. A1 - Kruse, M. A1 - Müller-Pfefferkorn, R. A1 - Schäfer, P. A1 - Schuller, B. A1 - Steinke, T. A1 - Zink, A. JF - UNICORE Summit 2012 ER - TY - CONF T1 - A databank, rather than statistical, model of normal ageing brain structure to indicate pathology T2 - OHBM 2012 Y1 - 2012 A1 - Dickie, David Alexander A1 - Dominic Job A1 - Rodríguez, David A1 - Shenkin, Susan A1 - Wardlaw, Joanna JF - OHBM 2012 UR - http://ww4.aievolution.com/hbm1201/index.cfm?do=abs.viewAbs&abs=5102 ER - TY - JOUR T1 - Data-Intensive Architecture for Scientific Knowledge Discovery JF - Distributed and Parallel Databases Y1 - 2012 A1 - Atkinson, Malcolm P. 
A1 - Chee Sun Liew A1 - Michelle Galea A1 - Paul Martin A1 - Krause, Amrey A1 - Adrian Mouat A1 - Oscar Corcho A1 - Snelling, David KW - Knowledge discovery, workflow management system AB - This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, imaging processing and seismology. VL - 30 UR - http://dx.doi.org/10.1007/s10619-012-7105-3 IS - 5 ER - TY - CONF T1 - Dimensioning Scientific Computing Systems to Improve Performance of Map-Reduce based Applications T2 - Procedia Computer Science, Proceedings of the International Conference on Computational Science, ICCS 2012 Y1 - 2012 A1 - Gabriel G. Castañè A1 - Alberto Nuñez A1 - Rosa Filgueira A1 - Jesus Carretero JF - Procedia Computer Science, Proceedings of the International Conference on Computational Science, ICCS 2012 ER - TY - JOUR T1 - Dimensioning Scientific Computing Systems to Improve Performance of Map-Reduce based Applications JF - Procedia CS Y1 - 2012 A1 - Gabriel G. 
Castañè A1 - Alberto Nuñez A1 - Rosa Filgueira A1 - Jesus Carretero VL - 9 ER - TY - RPRT T1 - Dispel Tutorial Y1 - 2012 A1 - Paul Martin KW - Dispel AB - Dispel is a strongly-typed imperative language for generating executable workflows for data-intensive distributed applications, particularly (but not exclusively) for use in computational sciences such as bioinformatics, astronomy and seismology — it has been designed to be a portable lingua franca by which researchers can interact with complex distributed research infrastructures without detailed knowledge of the underlying computational middleware, all in order to more easily conduct experiments in data integration, simulation and data-intensive modelling. This document is a tutorial for Dispel. ER - TY - JOUR T1 - EnzML: multi-label prediction of enzyme classes using InterPro signatures. JF - BMC Bioinformatics Y1 - 2012 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - BACKGROUND: Manual annotation of enzymatic functions cannot keep up with automatic genome sequencing. In this work we explore the capacity of InterPro sequence signatures to automatically predict enzymatic function. RESULTS: We present EnzML, a multi-label classification method that can efficiently account also for proteins with multiple enzymatic functions: 50,000 in UniProt. EnzML was evaluated using a standard set of 300,747 proteins for which the manually curated Swiss-Prot and KEGG databases have agreeing Enzyme Commission (EC) annotations. EnzML achieved more than 98% subset accuracy (exact match of all correct Enzyme Commission classes of a protein) for the entire dataset and between 87 and 97% subset accuracy in reannotating eight entire proteomes: human, mouse, rat, mouse-ear cress, fruit fly, the S. pombe yeast, the E. coli bacterium and the M. jannaschii archaebacterium. 
To understand the role played by the dataset size, we compared the cross-evaluation results of smaller datasets, either constructed at random or from specific taxonomic domains such as archaea, bacteria, fungi, invertebrates, plants and vertebrates. The results were confirmed even when the redundancy in the dataset was reduced using UniRef100, UniRef90 or UniRef50 clusters. CONCLUSIONS: InterPro signatures are a compact and powerful attribute space for the prediction of enzymatic function. This representation makes multi-label machine learning feasible in reasonable time (30 minutes to train on 300,747 instances with 10,852 attributes and 2,201 class values) using the Mulan Binary Relevance Nearest Neighbours algorithm implementation (BR-kNN). VL - 13 ER - TY - CONF T1 - Generic User Management for Science Gateways via Virtual Organizations T2 - EGI Technical Forum 2012 Y1 - 2012 A1 - Schlemmer, Tobias A1 - Grunzke, Richard A1 - Gesing, Sandra A1 - Krüger, Jens A1 - Birkenheuer, Georg A1 - Müller-Pfefferkorn, Ralph A1 - Kohlbacher, Oliver JF - EGI Technical Forum 2012 ER - TY - Generic T1 - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences Y1 - 2012 ED - Gesing, Sandra ED - Glatard, Tristan ED - Krüger, Jens ED - Delgado Olabarriaga, Silvia ED - Solomonides, Tony ED - Silverstein, J. ED - Montagnat, J. ED - Gaignard, A. 
ED - Krefting, Dagmar PB - IOS Press VL - 175 ER - TY - CONF T1 - The MoSGrid Community - From National to International Scale T2 - EGI Community Forum 2012 Y1 - 2012 A1 - Gesing, Sandra A1 - Herres-Pawlis, Sonja A1 - Birkenheuer, Georg A1 - Brinkmann, André A1 - Grunzke, Richard A1 - Kacsuk, Peter A1 - Kohlbacher, Oliver A1 - Kozlovszky, Miklos A1 - Krüger, Jens A1 - Müller-Pfefferkorn, Ralph A1 - Schäfer, Patrick A1 - Steinke, Thomas JF - EGI Community Forum 2012 ER - TY - CONF T1 - MoSGrid: Progress of Workflow driven Chemical Simulations T2 - Grid Workflow Workshop 2011 Y1 - 2012 A1 - Birkenheuer, Georg A1 - Blunk, Dirk A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Fels, Gregor A1 - Gesing, Sandra A1 - Grunzke, Richard A1 - Herres-Pawlis, Sonja A1 - Kohlbacher, Oliver A1 - Krüger, Jens A1 - Packschies, Lars A1 - Schäfer, Patrick A1 - Schuller, B. A1 - Schuster, Johannes A1 - Steinke, Thomas A1 - Szikszay Fabri, Anna A1 - Wewior, Martin A1 - Müller-Pfefferkorn, Ralph A1 - Kohlbacher, Oliver JF - Grid Workflow Workshop 2011 PB - CEUR Workshop Proceedings ER - TY - CHAP T1 - Multi-agent Negotiation of Virtual Machine Migration Using the Lightweight Coordination Calculus T2 - Agent and Multi-Agent Systems. Technologies and Applications Y1 - 2012 A1 - Anderson, Paul A1 - Shahriar Bijani A1 - Vichos, Alexandros ED - Jezic, Gordan ED - Kusek, Mario ED - Nguyen, Ngoc-Thanh ED - Howlett, Robert ED - Jain, Lakhmi JF - Agent and Multi-Agent Systems. Technologies and Applications T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 7327 SN - 978-3-642-30946-5 UR - http://dx.doi.org/10.1007/978-3-642-30947-2_16 ER - TY - JOUR T1 - OMERO: flexible, model-driven data management for experimental biology JF - NATURE METHODS Y1 - 2012 A1 - Chris Allan A1 - Jean-Marie Burel A1 - Josh Moore A1 - Colin Blackburn A1 - Melissa Linkert A1 - Scott Loynton A1 - Donald MacDonald A1 - et al. 
AB - Data-intensive research depends on tools that manage multidimensional, heterogeneous datasets. We built OME Remote Objects (OMERO), a software platform that enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light-microscopy, high-content-screening, electron-microscopy and even non-image-genotype data. OMERO is open-source software, available at http://openmicroscopy.org/. PB - Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved. VL - 9 SN - 1548-7091 UR - http://dx.doi.org/10.1038/nmeth.1896 IS - 3 ER - TY - BOOK T1 - Optimisation of the enactment of fine-grained distributed data-intensive workflows Y1 - 2012 A1 - Chee Sun Liew AB - The emergence of data-intensive science as the fourth science paradigm has posed a data deluge challenge for enacting scientific workflows. The scientific community is facing an imminent flood of data from the next generation of experiments and simulations, besides dealing with the heterogeneity and complexity of data, applications and execution environments. New scientific workflows involve execution on distributed and heterogeneous computing resources across organisational and geographical boundaries, processing gigabytes of live data streams and petabytes of archived and simulation data, in various formats and from multiple sources. Managing the enactment of such workflows not only requires larger storage space and faster machines, but the capability to support scalability and diversity of the users, applications, data, computing resources and the enactment technologies. We argue that the enactment process can be made efficient using optimisation techniques in an appropriate architecture. 
This architecture should support the creation of diversified applications and their enactment on diversified execution environments, with a standard interface, i.e. a workflow language. The workflow language should be both human readable and suitable for communication between the enactment environments. The data-streaming model central to this architecture provides a scalable approach to large-scale data exploitation. Data-flow between computational elements in the scientific workflow is implemented as streams. To cope with the exploratory nature of scientific workflows, the architecture should support fast workflow prototyping, and the re-use of workflows and workflow components. Above all, the enactment process should be easily repeated and automated. In this thesis, we present a candidate data-intensive architecture that includes an intermediate workflow language, named DISPEL. We create a new fine-grained measurement framework to capture performance-related data during enactments, and design a performance database to organise them systematically. We propose a new enactment strategy to demonstrate that optimisation of data-streaming workflows can be automated by exploiting performance data gathered during previous enactments. PB - The University of Edinburgh CY - Edinburgh ER - TY - JOUR T1 - Parallel perfusion imaging processing using GPGPU JF - Computer Methods and Programs in Biomedicine Y1 - 2012 A1 - Fan Zhu A1 - Rodríguez, David A1 - Carpenter, Trevor A1 - Malcolm Atkinson A1 - Wardlaw, Joanna KW - Deconvolution KW - GPGPU KW - Local AIF KW - Parallelization KW - Perfusion Imaging AB - Background and purpose The objective of brain perfusion quantification is to generate parametric maps of relevant hemodynamic quantities such as cerebral blood flow (CBF), cerebral blood volume (CBV) and mean transit time (MTT) that can be used in diagnosis of acute stroke. 
These calculations involve deconvolution operations that can be very computationally expensive when using local Arterial Input Functions (AIF). As time is vitally important in the case of acute stroke, reducing the analysis time will reduce the number of brain cells damaged and increase the potential for recovery. Methods GPUs originated as graphics generation dedicated co-processors, but modern GPUs have evolved into more general processors capable of executing scientific computations. They provide a highly parallel computing environment due to their large number of computing cores and constitute an affordable high performance computing method. In this paper, we will present the implementation of a deconvolution algorithm for brain perfusion quantification on GPGPU (General Purpose Graphics Processor Units) using the CUDA programming model. We present the serial and parallel implementations of such algorithms and the evaluation of the performance gains using GPUs. Results Our method has gained a 5.56 and 3.75 speedup for CT and MR images respectively. Conclusions It seems that using GPGPU is a desirable approach in perfusion imaging analysis, which does not harm the quality of cerebral hemodynamic maps but delivers results faster than the traditional computation. UR - http://www.sciencedirect.com/science/article/pii/S0169260712001587 ER - TY - BOOK T1 - (PhD Thesis) Brain Perfusion Imaging - Performance and Accuracy Y1 - 2012 A1 - Fan Zhu AB - Title: Brain Perfusion Imaging - Performance and Accuracy Abstract: Brain perfusion weighted images acquired using dynamic contrast studies have an important clinical role in acute stroke diagnosis and treatment decisions. The purpose of my PhD research is to develop novel methodologies for improving the efficiency and quality of brain perfusion-imaging analysis so that clinical decisions can be made more accurately and in shorter time. This thesis consists of three parts: 1. 
My research investigates the possibilities that parallel computing brings to make perfusion-imaging analysis faster in order to deliver results that are used in stroke diagnosis earlier. Brain perfusion analysis using local Arterial Input Functions (AIF) technique takes a long time to execute due to its heavy computational load. As time is vitally important in the case of acute stroke, reducing analysis time and therefore diagnosis time can reduce the number of brain cells damaged and improve the chances for patient recovery. We present the implementation of a deconvolution algorithm for brain perfusion quantification on GPGPU (General Purpose computing on Graphics Processing Units) using the CUDA programming model. Our method aims to accelerate the process without any quality loss. 2. Specific features of perfusion source images are also used to reduce noise impact, which consequently improves the accuracy of hemodynamic maps. The majority of existing approaches for denoising CT images are optimized for 3D (spatial) information, including spatial decimation (spatially weighted mean filters) and techniques based on wavelet and curvelet transforms. However, perfusion imaging data is 4D as it also contains temporal information. Our approach using Gaussian process regression (GPR) makes use of the temporal information in the perfusion source images to reduce the noise level. Over the entire image, our noise reduction method based on Gaussian process regression gains a 99% contrast-to-noise ratio improvement over the raw image and also improves the quality of hemodynamic maps, allowing a better identification of edges and detailed information. At the level of individual voxels, GPR provides a stable baseline, helps identify key parameters from tissue time-concentration curves and reduces the oscillations in the curves. Furthermore, the results show that GPR is superior to the alternative techniques compared in this study. 3. 
My research also explores automatic segmentation of perfusion images into potentially healthy areas and lesion areas which can be used as additional information that assists in clinical diagnosis. Since perfusion source images contain more information than hemodynamic maps, good utilisation of source images leads to better understanding than the hemodynamic maps alone. Correlation coefficient tests are used to measure the similarities between the expected tissue time-concentration curves (from reference tissue) and the measured time-concentration curves (from target tissue). This information is then used to distinguish tissues at risk and dead tissues from healthy tissues. A correlation coefficient based signal analysis method that directly spots suspected lesion areas from perfusion source images is presented. Our method delivers a clear automatic segmentation of healthy tissue, tissue at risk and dead tissue. From our segmentation maps, it is easier to identify lesion boundaries than using traditional hemodynamic maps. ER - TY - JOUR T1 - Principles of Provenance (Dagstuhl Seminar 12091) JF - Dagstuhl Reports Y1 - 2012 A1 - James Cheney A1 - Anthony Finkelstein A1 - Bertram Ludäscher A1 - Stijn Vansummeren VL - 2 ER - TY - JOUR T1 - Requirements for Provenance on the Web JF - IJDC Y1 - 2012 A1 - Paul T. Groth A1 - Yolanda Gil A1 - James Cheney A1 - Simon Miles VL - 7 ER - TY - Generic T1 - A Review of Attacks and Security Approaches in Open Multi-agent Systems Y1 - 2012 A1 - Shahriar Bijani A1 - David Robertson AB - Open multi-agent systems (MASs) have growing popularity in the Multi-agent Systems community and are predicted to have many applications in future, as large scale distributed systems become more widespread. A major practical limitation to open MASs is security because the openness of such systems negates many traditional security solutions. In this paper we introduce and classify main attacks on open MASs. 
We then survey and analyse various security techniques in the literature and categorise them under prevention and detection approaches. Finally, we suggest which security technique is an appropriate countermeasure for which classes of attack. ER - TY - CONF T1 - A Science Gateway Getting Ready for Serving the International Molecular Simulation Community T2 - Proceedings of Science Y1 - 2012 A1 - Gesing, Sandra A1 - Herres-Pawlis, Sonja A1 - Birkenheuer, Georg A1 - Brinkmann, André A1 - Grunzke, Richard A1 - Kacsuk, Peter A1 - Kohlbacher, Oliver A1 - Kozlovszky, Miklos A1 - Krüger, Jens A1 - Müller-Pfefferkorn, Ralph A1 - Schäfer, Patrick A1 - Steinke, Thomas JF - Proceedings of Science ER - TY - JOUR T1 - Searching in peer-to-peer networks JF - Computer Science Review (Elsevier) Y1 - 2012 A1 - I.A. Klampanos A1 - J.M. Jose UR - http://www.sciencedirect.com/science/article/pii/S1574013712000238 ER - TY - JOUR T1 - SIMCAN: A flexible, scalable and expandable simulation platform for modelling and simulating distributed architectures and applications JF - Simulation Modelling Practice and Theory Y1 - 2012 A1 - Alberto Nuñez A1 - Javier Fernández A1 - Rosa Filgueira A1 - Félix García Carballeira A1 - Jesús Carretero VL - 20 ER - TY - JOUR T1 - A Single Sign-On Infrastructure for Science Gateways on a Use Case for Structural Bioinformatics JF - Journal of Grid Computing Y1 - 2012 A1 - Gesing, Sandra A1 - Grunzke, Richard A1 - Krüger, Jens A1 - Birkenheuer, Georg A1 - Wewior, Martin A1 - Schäfer, Patrick A1 - Schuller, Bernd A1 - Schuster, Johannes A1 - Herres-Pawlis, Sonja A1 - Breuers, Sebastian A1 - Balaskó, Ákos A1 - Kozlovszky, Miklos A1 - Fabri, AnnaSzikszay A1 - Packschies, Lars A1 - Kacsuk, Peter A1 - Blunk, Dirk A1 - Steinke, Thomas A1 - Brinkmann, André A1 - Fels, Gregor A1 - Müller-Pfefferkorn, Ralph A1 - Jäkel, René A1 - Kohlbacher, Oliver KW - DCIs KW - Science gateway KW - security KW - Single sign-on KW - Structural bioinformatics VL - 10 UR - 
http://dx.doi.org/10.1007/s10723-012-9247-y ER - TY - CONF T1 - The Use of Reputation as Noise-resistant Selection Bias in a Co-evolutionary Multi-agent System T2 - Genetic and Evolutionary Computation Conference Y1 - 2012 A1 - Nikolaos Chatzinikolaou A1 - Dave Robertson JF - Genetic and Evolutionary Computation Conference CY - Philadelphia ER - TY - CONF T1 - Workflow-enhanced conformational analysis of guanidine zinc complexes via a science gateway T2 - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences Y1 - 2012 A1 - Herres-Pawlis, Sonja A1 - Birkenheuer, Georg A1 - Brinkmann, André A1 - Gesing, Sandra A1 - Grunzke, Richard A1 - Jäkel, René A1 - Kohlbacher, Oliver A1 - Krüger, Jens A1 - Dos Santos Vieira, Ines JF - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences PB - IOS Press ER - TY - CONF T1 - Automatic Agent Protocol Generation from Argumentation T2 - 13th European Agent Systems Summer Schoo Y1 - 2011 A1 - Ashwag Omar Maghraby JF - 13th European Agent Systems Summer Schoo ER - TY - JOUR T1 - Automatically Identifying and Annotating Mouse Embryo Gene Expression Patterns JF - Bioinformatics Y1 - 2011 A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Richard Baldock KW - classification KW - e-Science AB - Motivation: Deciphering the regulatory and developmental mechanisms for multicellular organisms requires detailed knowledge of gene interactions and gene expressions. The availability of large datasets with both spatial and ontological annotation of the spatio-temporal patterns of gene-expression in mouse embryo provides a powerful resource to discover the biological function of embryo organisation. Ontological annotation of gene expressions consists of labelling images with terms from the anatomy ontology for mouse development. If the spatial genes of an anatomical component are expressed in an image, the image is then tagged with a term of that anatomical component. 
The current annotation is done manually by domain experts, which is both time consuming and costly. In addition, the level of detail is variable and inevitably, errors arise from the tedious nature of the task. In this paper, we present a new method to automatically identify and annotate gene expression patterns in the mouse embryo with anatomical terms. Results: The method takes images from in situ hybridisation studies and the ontology for the developing mouse embryo; it then combines machine learning and image processing techniques to produce classifiers that automatically identify and annotate gene expression patterns in these images. We evaluate our method on image data from the EURExpress-II study where we use it to automatically classify nine anatomical terms: humerus, handplate, fibula, tibia, femur, ribs, petrous part, scapula and head mesenchyme. The accuracy of our method lies between 70–80% with few exceptions. Conclusions: We show that other known methods have lower classification performance than ours. We have investigated the images misclassified by our method and found several cases where the original annotation was not correct. This shows our method is robust against this kind of noise. Availability: The annotation result and the experimental dataset in the paper can be freely accessed at http://www2.docm.mmu.ac.uk/STAFF/L.Han/geneannotation/ Contact: l.han@mmu.ac.uk, j.vanhemert@ed.ac.uk and Richard.Baldock@hgu.mrc.ac.uk VL - 27 UR - http://bioinformatics.oxfordjournals.org/content/early/2011/02/25/bioinformatics.btr105.abstract ER - TY - JOUR T1 - Discovering the suitability of optimisation algorithms by learning from evolved instances JF - Annals of Mathematics and Artificial Intelligence Y1 - 2011 A1 - K. Smith-Miles A1 - {van Hemert}, J. I. 
KW - problem evolving VL - Online Fir UR - http://www.springerlink.com/content/6x83q3201gg71554/ ER - TY - CHAP T1 - DISPEL Reference Manual T2 - Advanced Data Mining and Integration Research for Europe (ADMIRE) Y1 - 2011 A1 - Paul Martin A1 - Yaikhom, Gagarine KW - DISPEL, ADMIRE AB - Reference manual for the Data Intensive Systems Process Engineering Language (DISPEL). JF - Advanced Data Mining and Integration Research for Europe (ADMIRE) UR - www.admire-project.eu ER - TY - RPRT T1 - EDIM1 Progress Report Y1 - 2011 A1 - Paul Martin A1 - Malcolm Atkinson A1 - Parsons, Mark A1 - Adam Carter A1 - Gareth Francis AB - The Edinburgh Data-Intensive Machine (EDIM1) is the product of a joint collaboration between the data-intensive group at the School of Informatics and EPCC. EDIM1 is an experimental system, offering an alternative architecture for data-intensive computation and providing a platform for evaluating tools for data-intensive research; a 120 node cluster of ‘data-bricks’ with high storage yet modest computational capacity. This document gives some background into the context in which EDIM1 was designed and constructed, as well as providing an overview of its use so far and future plans. ER - TY - JOUR T1 - An evaluation of ontology matching in geo-service applications JF - Geoinformatica Y1 - 2011 A1 - Lorenzino Vaccari A1 - Pavel Shvaiko A1 - Juan Pane A1 - Paolo Besana A1 - Maurizio Marchese ER - TY - JOUR T1 - Generating web-based user interfaces for computational science JF - Concurrency and Computation: Practice and Experience Y1 - 2011 A1 - van Hemert, J. A1 - Koetsier, J. A1 - Torterolo, L. A1 - Porro, I. A1 - Melato, M. A1 - Barbera, R. AB - Scientific gateways in the form of web portals are becoming the popular approach to share knowledge and resources around a topic in a community of researchers. Unfortunately, the development of web portals is expensive and requires specialist skills. 
Commercial and more generic web portals have a much larger user base and can afford this kind of development. Here we present two solutions that address this problem in the area of portals for scientific computing; both take the same approach. The whole process of designing, delivering and maintaining a portal can be made more cost-effective by generating a portal from a description rather than programming in the traditional sense. We show four successful use cases to show how this process works and the results it can deliver. PB - Wiley VL - 23 ER - TY - JOUR T1 - A Generic Parallel Processing Model for Facilitating Data Mining and Integration JF - Parallel Computing Y1 - 2011 A1 - Liangxiu Han A1 - Chee Sun Liew A1 - van Hemert, Jano A1 - Malcolm Atkinson KW - Data Mining and Data Integration (DMI) KW - Life Sciences KW - OGSA-DAI KW - Parallelism KW - Pipeline Streaming KW - workflow AB - To facilitate Data Mining and Integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements PEs. The composition mechanism links PEs via data streams, which may be in memory, buffered via disks or inter-computer data-flows. This makes it possible to build arbitrary DAGs with pipelining and both data and task parallelisms, which provides room for performance enhancement. We have applied this approach to a real DMI case in the Life Sciences and implemented a prototype. To demonstrate feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that a linear speedup has been achieved with the increase of the number of distributed computing nodes in this case study. 
PB - Elsevier VL - 37 IS - 3 ER - TY - CONF T1 - Granular Security for a Science Gateway in Structural Bioinformatics T2 - Proceedings of the International Workshop on Science Gateways for Life Sciences (IWSG-Life 2011) Y1 - 2011 A1 - Gesing, Sandra A1 - Grunzke, Richard A1 - Balaskó, Ákos A1 - Birkenheuer, Georg A1 - Blunk, Dirk A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Fels, Gregor A1 - Herres-Pawlis, Sonja A1 - Kacsuk, Peter A1 - Kozlovszky, Miklos A1 - Krüger, Jens A1 - Packschies, Lars A1 - Schäfer, Patrick A1 - Schuller, Bernd A1 - Schuster, Johannes A1 - Steinke, Thomas A1 - Szikszay Fabri, Anna A1 - Wewior, Martin A1 - Müller-Pfefferkorn, Ralph A1 - Kohlbacher, Oliver JF - Proceedings of the International Workshop on Science Gateways for Life Sciences (IWSG-Life 2011) PB - CEUR Workshop Proceedings ER - TY - Generic T1 - Intrusion Detection in Open Peer-to-Peer Multi-agent Systems T2 - 5th International Conference on Autonomous Infrastructure, Management and Security (AIMS 2011) Y1 - 2011 A1 - Shahriar Bijani A1 - David Robertson AB - One way to build large-scale autonomous systems is to develop open peer-to-peer architectures in which peers are not pre-engineered to work together and in which peers themselves determine the social norms that govern collective behaviour. A major practical limitation to such systems is security because the very openness of such systems negates most traditional security solutions. We propose a programme of research that addresses this problem by devising ways of attack detection and damage limitation that take advantage of social norms described by electronic institutions. We have analysed security issues of open peer-to-peer multi-agent systems and focused on probing attacks against confidentiality. We have proposed a framework and adapted an inference system, which shows the possibility of private information disclosure by an adversary. 
We shall suggest effective countermeasures in such systems and propose attack response techniques to limit possible damages. JF - 5th International Conference on Autonomous Infrastructure, Management and Security (AIMS 2011) T3 - Managing the dynamics of networks and services PB - Springer-Verlag Berlin SN - 978-3-642-21483-7 ER - TY - JOUR T1 - Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models JF - Simulation Modelling Practice and Theory Y1 - 2011 A1 - David A. Bacigalupo A1 - van Hemert, Jano I. A1 - Xiaoyu Chen A1 - Asif Usmani A1 - Adam P. Chester A1 - Ligang He A1 - Donna N. Dillenberger A1 - Gary B. Wills A1 - Lester Gilbert A1 - Stephen A. Jarvis KW - e-Science AB - The automatic allocation of enterprise workload to resources can be enhanced by being able to make what–if response time predictions whilst different allocations are being considered. We experimentally investigate an historical and a layered queuing performance model and show how they can provide a good level of support for a dynamic-urgent cloud environment. Using this we define, implement and experimentally investigate the effectiveness of a prediction-based cloud workload and resource management algorithm. Based on these experimental analyses we: (i) comparatively evaluate the layered queuing and historical techniques; (ii) evaluate the effectiveness of the management algorithm in different operating scenarios; and (iii) provide guidance on using prediction-based workload and resource management. 
VL - 19 ER - TY - CONF T1 - Optimum Platform Selection and Configuration for Computational Jobs T2 - All Hands Meeting 2011 Y1 - 2011 A1 - Gary McGilvary A1 - Malcolm Atkinson A1 - Barker, Adam A1 - Ashley Lloyd AB - The performance and cost of many scientific applications which execute on a variety of High Performance Computing (HPC), local cluster environments and cloud services could be enhanced, and costs reduced if the platform was carefully selected on a per-application basis and the application itself was optimally configured for a given platform. With a wide variety of computing platforms on offer, each possessing different properties, all too frequently platform decisions are made on an ad-hoc basis with limited ‘black-box’ information. The limitless number of possible application configurations also makes it difficult for an individual who wants to achieve cost-effective results with the maximum performance available. Such individuals may include biomedical researchers analysing microarray data, software developers running aviation simulations or bankers performing risk assessments. However in either case, it is likely that many may not have the required knowledge to select the optimum platform and setup for their application; to do so would require extensive knowledge of their applications and various platforms. In this paper we describe a framework that aims to resolve such issues by (i) reducing the detail required in the decision making process by placing this information within a selection framework, thereby (ii) maximising an application’s performance gain and/or reducing costs. We present a set of preliminary results where we compare the performance of running the Simple Parallel R INTerface (SPRINT) over a variety of platforms. SPRINT is a framework providing parallel functions of the statistical package R, allowing post genomic data to be easily analysed on HPC resources [1]. 
We run SPRINT on Amazon’s Elastic Compute Cloud (EC2) to compare the performance with the results obtained from HECToR, the UK’s National Supercomputing Service, and the Edinburgh Compute and Data Facilities (ECDF) cluster. JF - All Hands Meeting 2011 CY - York ER - TY - CONF T1 - A Parallel Deconvolution Algorithm in Perfusion Imaging T2 - Healthcare Informatics, Imaging, and Systems Biology (HISB) Y1 - 2011 A1 - Zhu, Fan. A1 - Rodríguez, David A1 - Carpenter, Trevor A1 - Malcolm Atkinson A1 - Wardlaw, Joanna KW - Deconvolution KW - GPGPU KW - Parallelization KW - Perfusion Imaging AB - In this paper, we will present the implementation of a deconvolution algorithm for brain perfusion quantification on GPGPU (General Purpose Graphics Processor Units) using the CUDA programming model. GPUs originated as graphics generation dedicated co-processors, but the modern GPUs have evolved to become a more general processor capable of executing scientific computations. It provides a highly parallel computing environment due to its huge number of computing cores and constitutes an affordable high performance computing method. The objective of brain perfusion quantification is to generate parametric maps of relevant haemodynamic quantities such as Cerebral Blood Flow (CBF), Cerebral Blood Volume (CBV) and Mean Transit Time (MTT) that can be used in diagnosis of conditions such as stroke or brain tumors. These calculations involve deconvolution operations that in the case of using local Arterial Input Functions (AIF) can be very expensive computationally. We present the serial and parallel implementations of such algorithm and the evaluation of the performance gains using GPUs. 
JF - Healthcare Informatics, Imaging, and Systems Biology (HISB) CY - San Jose, California SN - 978-1-4577-0325-6 UR - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6061411&tag=1 ER - TY - JOUR T1 - Performance database: capturing data for optimizing distributed streaming workflows JF - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences Y1 - 2011 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. A1 - Radoslaw Ostrowski A1 - Murray Cole A1 - van Hemert, Jano I. A1 - Liangxiu Han KW - measurement framework KW - performance data KW - streaming workflows AB - The performance database (PDB) stores performance-related data gathered during workflow enactment. We argue that by carefully understanding and manipulating this data, we can improve efficiency when enacting workflows. This paper describes the rationale behind the PDB, and proposes a systematic way to implement it. The prototype is built as part of the Advanced Data Mining and Integration Research for Europe project. We use workflows from real-world experiments to demonstrate the usage of PDB. VL - 369 IS - 1949 ER - TY - Generic T1 - Probing Attacks on Multi-agent Systems using Electronic Institutions T2 - Declarative Agent Languages and Technologies Workshop (DALT), AAMAS 2011 Y1 - 2011 A1 - Shahriar Bijani A1 - David Robertson A1 - David Aspinall JF - Declarative Agent Languages and Technologies Workshop (DALT), AAMAS 2011 ER - TY - CONF T1 - RapidBrain: Developing a Portal for Brain Research Imaging T2 - All Hands Meeting 2011, York Y1 - 2011 A1 - Kenton D'Mellow A1 - Rodríguez, David A1 - Carpenter, Trevor A1 - Jos Koetsier A1 - Dominic Job A1 - van Hemert, Jano A1 - Wardlaw, Joanna A1 - Fan Zhu AB - Brain imaging researchers execute complex multistep workflows in their computational analysis. Those workflows often include applications that have very different user interfaces and sometimes use different data formats. 
A good example is the brain perfusion quantification workflow used at the BRIC (Brain Research Imaging Centre) in Edinburgh. Rapid provides an easy method for creating portlets for computational jobs, and at the same time it is extensible. We have exploited this extensibility with additions that stretch the functionality beyond the original limits. These changes can be used by other projects to create their own portals, but it should be noted that the development of such portals involves a greater effort than that required in the regular use of Rapid for creating portlets. In our case it has been used to provide a user-friendly interface for perfusion analysis that covers from volume JF - All Hands Meeting 2011, York CY - York ER - TY - CONF T1 - A Science Gateway for Molecular Simulations T2 - EGI User Forum 2011 Y1 - 2011 A1 - Gesing, Sandra A1 - Kacsuk, Peter A1 - Kozlovszky, Miklos A1 - Birkenheuer, Georg A1 - Blunk, Dirk A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Fels, Gregor A1 - Grunzke, Richard A1 - Herres-Pawlis, Sonja A1 - Krüger, Jens A1 - Packschies, Lars A1 - Müller-Pfefferkorn, Ralph A1 - Schäfer, Patrick A1 - Steinke, Thomas A1 - Szikszay Fabri, Anna A1 - Warzecha, Klaus A1 - Wewior, Martin A1 - Kohlbacher, Oliver JF - EGI User Forum 2011 SN - 978 90 816927 1 7 ER - TY - JOUR T1 - Special Issue: Portals for life sciences---Providing intuitive access to bioinformatic tools JF - Concurrency and Computation: Practice and Experience Y1 - 2011 A1 - Gesing, Sandra A1 - van Hemert, J. A1 - Kacsuk, P. A1 - Kohlbacher, O. AB - The topic ‘Portals for life sciences’ includes various research fields, on the one hand many different topics out of life sciences, e.g. mass spectrometry, on the other hand portal technologies and different aspects of computer science, such as usability of user interfaces and security of systems. 
The main aspect about portals is to simplify the user’s interaction with computational resources that are concerted to a supported application domain. PB - Wiley VL - 23 IS - 23 ER - TY - JOUR T1 - A user-friendly web portal for T-Coffee on supercomputers JF - BMC Bioinformatics Y1 - 2011 A1 - J. Rius A1 - F. Cores A1 - F. Solsona A1 - van Hemert, J. I. A1 - Koetsier, J. A1 - C. Notredame KW - e-Science KW - portal KW - rapid AB - Background Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of the large-scale sequence alignments. It can be run on distributed memory clusters allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results In this paper we show how PTC can be easily deployed and controlled on a super computer architecture using a web portal developed using Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. Conclusions The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution. 
VL - 12 UR - http://www.biomedcentral.com/1471-2105/12/150 ER - TY - JOUR T1 - Validation and mismatch repair of workflows through typed data streams JF - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences Y1 - 2011 A1 - Yaikhom, Gagarine A1 - Malcolm Atkinson A1 - van Hemert, Jano A1 - Oscar Corcho A1 - Krause, Amy AB - The type system of a language guarantees that all of the operations on a set of data comply with the rules and conditions set by the language. While language typing is a fundamental requirement for any programming language, the typing of data that flow between processing elements within a workflow is currently being treated as optional. In this paper, we introduce a three-level type system for typing workflow data streams. These types are parts of the Data Intensive System Process Engineering Language programming language, which empowers users with the ability to validate the connections inside a workflow composition, and apply appropriate data type conversions when necessary. Furthermore, this system enables the enactment engine in carrying out type-directed workflow optimizations. VL - 369 IS - 1949 ER - TY - CONF T1 - Accelerating Data-Intensive Applications: a Cloud Computing Approach Image Pattern Recognition Tasks T2 - The Fourth International Conference on Advanced Engineering Computing and Applications in Sciences Y1 - 2010 A1 - Han, L A1 - Saengngam, T. A1 - van Hemert, J. JF - The Fourth International Conference on Advanced Engineering Computing and Applications in Sciences ER - TY - JOUR T1 - Adaptive CoMPI: Enhancing MPI based applications performance and scalability by using adaptive compression. JF - International Journal of High Performance Computing and Applications, 2010. Sage Y1 - 2010 A1 - Rosa Filgueira A1 - David E. 
Singh A1 - Alejandro Calderón A1 - Félix García Carballeira A1 - Jesús Carretero AB - This paper presents an optimization of MPI communication, called Adaptive-CoMPI, based on runtime compression of MPI messages exchanged by applications. The technique developed can be used for any application, because its implementation is transparent for the user, and integrates different compression algorithms for both MPI collective and point-to-point primitives. Furthermore, compression is turned on and off and the most appropriate compression algorithms are selected at runtime, depending on the characteristics of each message, the network behavior, and compression algorithm behavior, following a runtime adaptive strategy. Our system can be optimized for a specific application, through a guided strategy, to reduce the runtime strategy overhead. Adaptive-CoMPI has been validated using several MPI benchmarks and real HPC applications. Results show that, in most cases, by using adaptive compression, communication time is reduced, enhancing application performance and scalability. IS - 25 (3) ER - TY - JOUR T1 - Comparing Clinical Decision Support Systems for Recruitment in Clinical Trials JF - Journal of Medical Informatics Y1 - 2010 A1 - Marc Cuggia A1 - Paolo Besana A1 - David Glasspool. ER - TY - JOUR T1 - Correcting for intra-experiment variation in Illumina BeadChip data is necessary to generate robust gene-expression profiles JF - BMC Genomics Y1 - 2010 A1 - R. R. Kitchen A1 - V. S. Sabine A1 - A. H. Sims A1 - E. J. Macaskill A1 - L. Renshaw A1 - J. S. Thomas A1 - van Hemert, J. I. A1 - J. M. Dixon A1 - J. M. S. Bartlett AB - Background Microarray technology is a popular means of producing whole genome transcriptional profiles, however high cost and scarcity of mRNA has led many studies to be conducted based on the analysis of single samples. 
We exploit the design of the Illumina platform, specifically multiple arrays on each chip, to evaluate intra-experiment technical variation using repeated hybridisations of universal human reference RNA (UHRR) and duplicate hybridisations of primary breast tumour samples from a clinical study. Results A clear batch-specific bias was detected in the measured expressions of both the UHRR and clinical samples. This bias was found to persist following standard microarray normalisation techniques. However, when mean-centering or empirical Bayes batch-correction methods (ComBat) were applied to the data, inter-batch variation in the UHRR and clinical samples were greatly reduced. Correlation between replicate UHRR samples improved by two orders of magnitude following batch-correction using ComBat (ranging from 0.9833-0.9991 to 0.9997-0.9999) and increased the consistency of the gene-lists from the duplicate clinical samples, from 11.6% in quantile normalised data to 66.4% in batch-corrected data. The use of UHRR as an inter-batch calibrator provided a small additional benefit when used in conjunction with ComBat, further increasing the agreement between the two gene-lists, up to 74.1%. Conclusion In the interests of practicalities and cost, these results suggest that single samples can generate reliable data, but only after careful compensation for technical bias in the experiment. We recommend that investigators appreciate the propensity for such variation in the design stages of a microarray experiment and that the use of suitable correction methods become routine during the statistical analysis of the data. 
VL - 11 UR - http://www.biomedcentral.com/1471-2164/11/134 IS - 134 ER - TY - RPRT T1 - Data-Intensive Research Workshop (15-19 March 2010) Report Y1 - 2010 A1 - Malcolm Atkinson A1 - Roure, David De A1 - van Hemert, Jano A1 - Shantenu Jha A1 - Ruth McNally A1 - Robert Mann A1 - Stratis Viglas A1 - Chris Williams KW - Data-intensive Computing KW - Data-Intensive Machines KW - Machine Learning KW - Scientific Databases AB - We met at the National e-Science Institute in Edinburgh on 15-19 March 2010 to develop our understanding of DIR. Approximately 100 participants (see Appendix A) worked together to develop their own understanding, and we are offering this report as the first step in communicating that to a wider community. We present this in terms of our developing/emerging understanding of "What is DIR?" and "Why it is important?". We then review the status of the field, report what the workshop achieved and what remains as open questions. JF - National e-Science Centre PB - Data-Intensive Research Group, School of Informatics, University of Edinburgh CY - Edinburgh ER - TY - JOUR T1 - Dynamic-CoMPI: Dynamic optimization techniques for MPI parallel applications. JF - The Journal of Supercomputing. Y1 - 2010 A1 - Rosa Filgueira A1 - Jesús Carretero A1 - David E. Singh A1 - Alejandro Calderón A1 - Alberto Nunez KW - Adaptive systems KW - Clusters architectures KW - Collective I/O KW - Compression algorithms KW - Heuristics KW - MPI library - Parallel techniques AB - This work presents an optimization of MPI communications, called Dynamic-CoMPI, which uses two techniques in order to reduce the impact of communications and non-contiguous I/O requests in parallel applications. These techniques are independent of the application and complementary to each other. The first technique is an optimization of the Two-Phase collective I/O technique from ROMIO, called Locality aware strategy for Two-Phase I/O (LA-Two-Phase I/O). 
In order to increase the locality of the file accesses, LA-Two-Phase I/O employs the Linear Assignment Problem (LAP) for finding an optimal I/O data communication schedule. The main purpose of this technique is the reduction of the number of communications involved in the I/O collective operation. The second technique, called Adaptive-CoMPI, is based on run-time compression of MPI messages exchanged by applications. Both techniques can be applied on every application, because both of them are transparent for the users. Dynamic-CoMPI has been validated by using several MPI benchmarks and real HPC applications. The results show that, for many of the considered scenarios, important reductions in the execution time are achieved by reducing the size and the number of the messages. Additional benefits of our approach are the reduction of the total communication time and the network contention, thus enhancing, not only performance, but also scalability. PB - Springer ER - TY - CHAP T1 - Exploiting P2P and Grid Computing Technologies for Resource Sharing to support High Performance Distributed System T2 - Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications Y1 - 2010 A1 - Liangxiu Han ED - Nick Antonopoulos ED - Georgios Exarchakos ED - Maozhen Li ED - Antonio Liottac JF - Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications PB - IGI Global publishing VL - 1 ER - TY - Generic T1 - Federated Enactment of Workflow Patterns T2 - Lecture Notes in Computer Science Y1 - 2010 A1 - Yaikhom, Gagarine A1 - Liew, Chee A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Malcolm Atkinson A1 - Krause, Amy ED - D’Ambra, Pasqua ED - Guarracino, Mario ED - Talia, Domenico AB - In this paper we address two research questions concerning workflows: 1) how do we abstract and catalogue recurring workflow patterns?; and 2) how do we facilitate optimisation of the mapping from 
workflow patterns to actual resources at runtime? Our aim here is to explore techniques that are applicable to large-scale workflow compositions, where the resources could change dynamically during the lifetime of an application. We achieve this by introducing a registry-based mechanism where pattern abstractions are catalogued and stored. In conjunction with an enactment engine, which communicates with this registry, concrete computational implementations and resources are assigned to these patterns, conditional to the execution parameters. Using a data mining application from the life sciences, we demonstrate this new approach. JF - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6271 UR - http://dx.doi.org/10.1007/978-3-642-15277-1_31 N1 - 10.1007/978-3-642-15277-1_31 ER - TY - CONF T1 - Grid-Workflows in Molecular Science T2 - Software Engineering 2010, Grid Workflow Workshop Y1 - 2010 A1 - Birkenheuer, Georg A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Blunk, Dirk A1 - Fels, Gregor A1 - Gesing, Sandra A1 - Herres-Pawlis, Sonja A1 - Kohlbacher, Oliver A1 - Krüger, Jens A1 - Packschies, Lars JF - Software Engineering 2010, Grid Workflow Workshop PB - GI-Edition - Lecture Notes in Informatics (LNI) ER - TY - JOUR T1 - Integrating distributed data sources with OGSA--DAI DQP and Views JF - Philosophical Transactions A Y1 - 2010 A1 - Dobrzelecki, B. A1 - Krause, A. A1 - Hume, A. C. A1 - Grant, A. A1 - Antonioletti, M. A1 - Alemu, T. Y. A1 - Atkinson, M. A1 - Jackson, M. A1 - Theocharopoulos, E. AB - OGSA-DAI (Open Grid Services Architecture Data Access and Integration) is a framework for building distributed data access and integration systems. Until recently, it lacked the built-in functionality that would allow easy creation of federations of distributed data sources. The latest release of the OGSA-DAI framework introduced the OGSA-DAI DQP (Distributed Query Processing) resource. 
The new resource encapsulates a distributed query processor, that is able to orchestrate distributed data sources when answering declarative user queries. The query processor has many extensibility points, making it easy to customize. We have also introduced a new OGSA-DAI Views resource that provides a flexible method for defining views over relational data. The interoperability of the two new resources, together with the flexibility of the OGSA-DAI framework, allows the building of highly customized data integration solutions. VL - 368 ER - TY - CHAP T1 - Molecular Orbital Calculations of Inorganic Compounds T2 - Inorganic Experiments Y1 - 2010 A1 - C. A. Morrison A1 - N. Robertson A1 - Turner, A. A1 - van Hemert, J. A1 - Koetsier, J. ED - J. Derek Woollins JF - Inorganic Experiments PB - Wiley-VCH SN - 978-3527292530 ER - TY - CONF T1 - The MoSGrid Gaussian Portlet – Technologies for the Implementation of Portlets for Molecular Simulations T2 - Proceedings of the International Workshop on Science Gateways (IWSG10) Y1 - 2010 A1 - Wewior, Martin A1 - Packschies, Lars A1 - Blunk, Dirk A1 - Wickeroth, D. A1 - Warzecha, Klaus A1 - Herres-Pawlis, Sonja A1 - Gesing, Sandra A1 - Breuers, Sebastian A1 - Krüger, Jens A1 - Birkenheuer, Georg A1 - Lang, Ulrich ED - Barbera, Roberto ED - Andronico, Giuseppe ED - La Rocca, Giuseppe JF - Proceedings of the International Workshop on Science Gateways (IWSG10) PB - Consorzio COMETA ER - TY - JOUR T1 - An open source toolkit for medical imaging de-identification JF - European Radiology Y1 - 2010 A1 - Rodríguez, David A1 - Carpenter, Trevor K. A1 - van Hemert, Jano I. A1 - Wardlaw, Joanna M. KW - Anonymisation KW - Data Protection Act (DPA) KW - De-identification KW - Digital Imaging and Communications in Medicine (DICOM) KW - Privacy policies KW - Pseudonymisation KW - Toolkit AB - Objective Medical imaging acquired for clinical purposes can have several legitimate secondary uses in research projects and teaching libraries. 
No commonly accepted solution for anonymising these images exists because the amount of personal data that should be preserved varies case by case. Our objective is to provide a flexible mechanism for anonymising Digital Imaging and Communications in Medicine (DICOM) data that meets the requirements for deployment in multicentre trials. Methods We reviewed our current de-identification practices and defined the relevant use cases to extract the requirements for the de-identification process. We then used these requirements in the design and implementation of the toolkit. Finally, we tested the toolkit taking as a reference those requirements, including a multicentre deployment. Results The toolkit successfully anonymised DICOM data from various sources. Furthermore, it was shown that it could forward anonymous data to remote destinations, remove burned-in annotations, and add tracking information to the header. The toolkit also implements the DICOM standard confidentiality mechanism. Conclusion A DICOM de-identification toolkit that facilitates the enforcement of privacy policies was developed. It is highly extensible, provides the necessary flexibility to account for different de-identification requirements and has a low adoption barrier for new users. VL - 20 UR - http://www.springerlink.com/content/j20844338623m167/ IS - 8 ER - TY - JOUR T1 - Quality control for quantitative PCR based on amplification compatibility test JF - Methods Y1 - 2010 A1 - Tichopad, Ales A1 - Tzachi Bar A1 - Ladislav Pecen A1 - Robert R. Kitchen A1 - Kubista, Mikael A1 - Michael W. Pfaffl AB - Quantitative qPCR is a routinely used method for the accurate quantification of nucleic acids. Yet it may generate erroneous results if the amplification process is obscured by inhibition or generation of aberrant side-products such as primer dimers. 
Several methods have been established to control for pre-processing performance that rely on the introduction of a co-amplified reference sequence, however there is currently no method to allow for reliable control of the amplification process without directly modifying the sample mix. Herein we present a statistical approach based on multivariate analysis of the amplification response data generated in real-time. The amplification trajectory in its most resolved and dynamic phase is fitted with a suitable model. Two parameters of this model, related to amplification efficiency, are then used for calculation of the Z-score statistics. Each studied sample is compared to a predefined reference set of reactions, typically calibration reactions. A probabilistic decision for each individual Z-score is then used to identify the majority of inhibited reactions in our experiments. We compare this approach to univariate methods using only the sample specific amplification efficiency as reporter of the compatibility. We demonstrate improved identification performance using the multivariate approach compared to the univariate approach. Finally we stress that the performance of the amplification compatibility test as a quality control procedure depends on the quality of the reference set. PB - Elsevier VL - 50 UR - http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WN5-4Y88DBN-3&_user=10&_coverDate=04%2F30%2F2010&_alid=1247745718&_rdoc=1&_fmt=high&_orig=search&_cdi=6953&_sort=r&_docanchor=&view=c&_ct=2&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5 IS - 4 ER - TY - CONF T1 - Resource management of enterprise cloud systems using layered queuing and historical performance models T2 - IEEE International Symposium on Parallel Distributed Processing Y1 - 2010 A1 - Bacigalupo, D. A. A1 - van Hemert, J. A1 - Usmani, A. A1 - Dillenberger, D. N. A1 - Wills, G. B. A1 - Jarvis, S. A. 
KW - e-Science AB - The automatic allocation of enterprise workload to resources can be enhanced by being able to make ‘what-if’ response time predictions, whilst different allocations are being considered. It is important to quantitatively compare the effectiveness of different prediction techniques for use in cloud infrastructures. To help make the comparison of relevance to a wide range of possible cloud environments it is useful to consider the following. 1.) Urgent cloud customers such as the emergency services that can demand cloud resources at short notice (e.g. for our FireGrid emergency response software). 2.) Dynamic enterprise systems, that must rapidly adapt to frequent changes in workload, system configuration and/or available cloud servers. 3.) The use of the predictions in a coordinated manner by both the cloud infrastructure and cloud customer management systems. 4.) A broad range of criteria for evaluating each technique. However, there have been no previous comparisons meeting these requirements. This paper, meeting the above requirements, quantitatively compares the layered queuing and (“HYDRA”) historical techniques - including our initial thoughts on how they could be combined. Supporting results and experiments include the following: i.) defining, investigating and hence providing guidelines on the use of a historical and layered queuing model; ii.) using these guidelines showing that both techniques can make low overhead and typically over 70% accurate predictions, for new server architectures for which only a small number of benchmarks have been run; and iii.) defining and investigating tuning a prediction-based cloud workload and resource management algorithm. JF - IEEE International Symposium on Parallel Distributed Processing ER - TY - JOUR T1 - Statistical aspects of quantitative real-time PCR experiment design JF - Methods Y1 - 2010 A1 - Robert R. 
Kitchen A1 - Kubista, Mikael A1 - Tichopad, Ales KW - Error propagation KW - Experiment design KW - Gene expression KW - Nested analysis of variance KW - powerNest KW - Prospective power analysis KW - qPCR KW - Real-time PCR KW - Sampling plan KW - Statistical power AB - Experiments using quantitative real-time PCR to test hypotheses are limited by technical and biological variability; we seek to minimise sources of confounding variability through optimum use of biological and technical replicates. The quality of an experiment design is commonly assessed by calculating its prospective power. Such calculations rely on knowledge of the expected variances of the measurements of each group of samples and the magnitude of the treatment effect; the estimation of which is often uninformed and unreliable. Here we introduce a method that exploits a small pilot study to estimate the biological and technical variances in order to improve the design of a subsequent large experiment. We measure the variance contributions at several ‘levels’ of the experiment design and provide a means of using this information to predict both the total variance and the prospective power of the assay. A validation of the method is provided through a variance analysis of representative genes in several bovine tissue-types. We also discuss the effect of normalisation to a reference gene in terms of the measured variance components of the gene of interest. Finally, we describe a software implementation of these methods, powerNest, that gives the user the opportunity to input data from a pilot study and interactively modify the design of the assay. The software automatically calculates expected variances, statistical power, and optimal design of the larger experiment. powerNest enables the researcher to minimise the total confounding variance and maximise prospective power for a specified maximum cost for the large study. 
PB - Elsevier VL - 50 UR - http://www.sciencedirect.com/science?_ob=GatewayURL&_method=citationSearch&_uoikey=B6WN5-4Y88DBN-1&_origin=SDEMFRHTML&_version=1&md5=7bb0b5b797d6e1f7c5c2df478fc88e5a IS - 4 ER - TY - CONF T1 - TOPP goes Rapid T2 - Cluster Computing and the Grid, IEEE International Symposium on Y1 - 2010 A1 - Gesing, Sandra A1 - van Hemert, Jano A1 - Jos Koetsier A1 - Bertsch, Andreas A1 - Kohlbacher, Oliver AB - Proteomics, the study of all the proteins contained in a particular sample, e.g., a cell, is a key technology in current biomedical research. The complexity and volume of proteomics data sets produced by mass spectrometric methods clearly suggests the use of grid-based high-performance computing for analysis. TOPP and OpenMS are open-source packages for proteomics data analysis; however, they do not provide support for Grid computing. In this work we present a portal interface for high-throughput data analysis with TOPP. The portal is based on Rapid, a tool for efficiently generating standardized portlets for a wide range of applications. The web-based interface allows the creation and editing of user-defined pipelines and their execution and monitoring on a Grid infrastructure. The portal also supports several file transfer protocols for data staging. It thus provides a simple and complete solution to high-throughput proteomics data analysis for inexperienced users through a convenient portal interface. JF - Cluster Computing and the Grid, IEEE International Symposium on PB - IEEE Computer Society CY - Los Alamitos, CA, USA SN - 978-0-7695-4039-9 ER - TY - CONF T1 - Towards Optimising Distributed Data Streaming Graphs using Parallel Streams T2 - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing Y1 - 2010 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. 
A1 - van Hemert, Jano A1 - Liangxiu Han KW - Data-intensive Computing KW - Distributed Computing KW - Optimisation KW - Parallel Stream KW - Scientific Workflows AB - Modern scientific collaborations have opened up the opportunity of solving complex problems that involve multi-disciplinary expertise and large-scale computational experiments. These experiments usually involve large amounts of data that are located in distributed data repositories running various software systems, and managed by different organisations. A common strategy to make the experiments more manageable is executing the processing steps as a workflow. In this paper, we look into the implementation of fine-grained data-flow between computational elements in a scientific workflow as streams. We model the distributed computation as a directed acyclic graph where the nodes represent the processing elements that incrementally implement specific subtasks. The processing elements are connected in a pipelined streaming manner, which allows task executions to overlap. We further optimise the execution by splitting pipelines across processes and by introducing extra parallel streams. We identify performance metrics and design a measurement tool to evaluate each enactment. We conducted experiments to evaluate our optimisation strategies with a real world problem in the Life Sciences (EURExpress-II). The paper presents our distributed data-handling model, the optimisation and instrumentation strategies and the evaluation experiments. We demonstrate linear speed up and argue that this use of data-streaming to enable both overlapped pipeline and parallelised enactment is a generally applicable optimisation strategy. 
JF - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing PB - ACM CY - Chicago, Illinois UR - http://www.cct.lsu.edu/~kosar/didc10/index.php ER - TY - CONF T1 - Understanding TSP Difficulty by Learning from Evolved Instances T2 - Lecture Notes in Computer Science Y1 - 2010 A1 - Smith-Miles, Kate A1 - van Hemert, Jano A1 - Lim, Xin ED - Blum, Christian ED - Battiti, Roberto AB - Whether the goal is performance prediction, or insights into the relationships between algorithm performance and instance characteristics, a comprehensive set of meta-data from which relationships can be learned is needed. This paper provides a methodology to determine if the meta-data is sufficient, and demonstrates the critical role played by instance generation methods. Instances of the Travelling Salesman Problem (TSP) are evolved using an evolutionary algorithm to produce distinct classes of instances that are intentionally easy or hard for certain algorithms. A comprehensive set of features is used to characterise instances of the TSP, and the impact of these features on difficulty for each algorithm is analysed. Finally, performance predictions are achieved with high accuracy on unseen instances for predicting search effort as well as identifying the algorithm likely to perform best. 
JF - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6073 UR - http://dx.doi.org/10.1007/978-3-642-13800-3_29 N1 - 10.1007/978-3-642-13800-3_29 ER - TY - CONF T1 - Workflow Interoperability in a Grid Portal for Molecular Simulations T2 - Proceedings of the International Workshop on Science Gateways (IWSG10) Y1 - 2010 A1 - Gesing, Sandra A1 - Marton, Istvan A1 - Birkenheuer, Georg A1 - Schuller, Bernd A1 - Grunzke, Richard A1 - Krüger, Jens A1 - Breuers, Sebastian A1 - Blunk, Dirk A1 - Fels, Gregor A1 - Packschies, Lars A1 - Brinkmann, André A1 - Kohlbacher, Oliver A1 - Kozlovszky, Miklos JF - Proceedings of the International Workshop on Science Gateways (IWSG10) PB - Consorzio COMETA ER - TY - RPRT T1 - ADMIRE D1.5 – Report defining an iteration of the model and language: PM3 and DL3 Y1 - 2009 A1 - Peter Brezany A1 - Ivan Janciak A1 - Alexander Woehrer A1 - Carlos Buil Aranda A1 - Malcolm Atkinson A1 - van Hemert, Jano AB - This document is the third deliverable to report on the progress of the model, language and ontology research conducted within Workpackage 1 of the ADMIRE project. Significant progress has been made on each of the above areas. The new results that we achieved are recorded against the targets defined for project month 18 and are reported in four sections of this document PB - ADMIRE project UR - http://www.admire-project.eu/docs/ADMIRE-D1.5-model-language-ontology.pdf ER - TY - CONF T1 - Adoption of e-Infrastructure Services: inhibitors, enablers and opportunities T2 - 5th International Conference on e-Social Science Y1 - 2009 A1 - Voss, A. A1 - Asgari-Targhi, M. A1 - Procter, R. A1 - Halfpenny, P. A1 - Fragkouli, E. A1 - Anderson, S. A1 - Hughes, L. A1 - Fergusson, D. A1 - Vander Meer, E. A1 - Atkinson, M. 
AB - Based on more than 100 interviews with respondents from the academic community and information services, we present findings from our study of inhibitors and enablers of adoption of e-Infrastructure services for research. We discuss issues raised and potential ways of addressing them. JF - 5th International Conference on e-Social Science CY - Maternushaus, Cologne ER - TY - CONF T1 - Advanced Data Mining and Integration Research for Europe T2 - All Hands Meeting 2009 Y1 - 2009 A1 - Atkinson, M. A1 - Brezany, P. A1 - Corcho, O. A1 - Han, L A1 - van Hemert, J. A1 - Hluchy, L. A1 - Hume, A. A1 - Janciak, I. A1 - Krause, A. A1 - Snelling, D. A1 - Wöhrer, A. AB - There is a rapidly growing wealth of data [1]. The number of sources of data is increasing, while, at the same time, the diversity, complexity and scale of these data resources are also increasing dramatically. This cornucopia of data offers much potential; a combinatorial explosion of opportunities for knowledge discovery, improved decisions and better policies. Today, most of these opportunities are not realised because composing data from multiple sources and extracting information is too difficult. Every business, organisation and government faces problems that can only be addressed successfully if we improve our techniques for exploiting the data we gather. JF - All Hands Meeting 2009 CY - Oxford ER - TY - CONF T1 - Automating Gene Expression Annotation for Mouse Embryo T2 - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) Y1 - 2009 A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Richard Baldock A1 - Atkinson, Malcolm P. 
ED - Ronghuai Huang ED - Qiang Yang ED - Jian Pei ED - et al JF - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) PB - Springer VL - LNAI 5678 ER - TY - JOUR T1 - The Circulate Architecture: Avoiding Workflow Bottlenecks Caused By Centralised Orchestration JF - Cluster Computing Y1 - 2009 A1 - Barker, A. A1 - Weissman, J. A1 - van Hemert, J. I. KW - grid computing KW - workflow VL - 12 UR - http://www.springerlink.com/content/080q5857711w2054/?p=824749739c6a432ea95a0c3b59f4025f&pi=1 ER - TY - CONF T1 - CoMPI: Enhancing MPI Based Applications Performance and Scalability Using Run-Time Compression. T2 - EUROPVM/MPI 2009.Espoo, Finland. September 2009 Y1 - 2009 A1 - Rosa Filgueira A1 - David E. Singh A1 - Alejandro Calderón A1 - Jesús Carretero AB - This paper presents an optimization of MPI communications, called CoMPI, based on run-time compression of MPI messages exchanged by applications. A broad number of compression algorithms have been fully implemented and tested for both MPI collective and point to point primitives. In addition, this paper presents a study of several compression algorithms that can be used for run-time compression, based on the datatype used by applications. This study has been validated by using several MPI benchmarks and real HPC applications. The results show that, in most cases, using compression reduces the application communication time, enhancing application performance and scalability. In this way, CoMPI obtains important improvements in the overall execution time for many of the considered scenarios. JF - EUROPVM/MPI 2009.Espoo, Finland. September 2009 PB - Springer CY - Espoo. Finland VL - 5759/2009 ER - TY - Generic T1 - Crossing boundaries: computational science, e-Science and global e-Infrastructure I T2 - All Hands meeting 2008 Y1 - 2009 A1 - Coveney, P. V. A1 - Atkinson, M. P. ED - Coveney, P. V. ED - Atkinson, M. P. 
JF - All Hands meeting 2008 T3 - Philosophical Transactions of the Royal Society Series A PB - Royal Society Publishing CY - Edinburgh VL - 367 UR - http://rsta.royalsocietypublishing.org/content/367/1897.toc ER - TY - Generic T1 - Crossing boundaries: computational science, e-Science and global e-Infrastructure II T2 - All Hands Meeting 2008 Y1 - 2009 A1 - Coveney, P. V. A1 - Atkinson, M. P. ED - Coveney, P. V. ED - Atkinson, M. P. JF - All Hands Meeting 2008 T3 - Philosophical Transactions of the Royal Society Series A PB - Royal Society Publishing CY - Edinburgh VL - 367 UR - http://rsta.royalsocietypublishing.org/content/367/1898.toc ER - TY - JOUR T1 - Design and Optimization of Reverse-Transcription Quantitative PCR Experiments JF - Clin Chem Y1 - 2009 A1 - Tichopad, Ales A1 - Kitchen, Rob A1 - Riedmaier, Irmgard A1 - Becker, Christiane A1 - Stahlberg, Anders A1 - Kubista, Mikael AB - BACKGROUND: Quantitative PCR (qPCR) is a valuable technique for accurately and reliably profiling and quantifying gene expression. Typically, samples obtained from the organism of study have to be processed via several preparative steps before qPCR. METHOD: We estimated the errors of sample withdrawal and extraction, reverse transcription (RT), and qPCR that are introduced into measurements of mRNA concentrations. We performed hierarchically arranged experiments with 3 animals, 3 samples, 3 RT reactions, and 3 qPCRs and quantified the expression of several genes in solid tissue, blood, cell culture, and single cells. RESULTS: A nested ANOVA design was used to model the experiments, and relative and absolute errors were calculated with this model for each processing level in the hierarchical design. We found that intersubject differences became easily confounded by sample heterogeneity for single cells and solid tissue. In cell cultures and blood, the noise from the RT and qPCR steps contributed substantially to the overall error because the sampling noise was less pronounced. 
CONCLUSIONS: We recommend the use of sample replicates preferentially to any other replicates when working with solid tissue, cell cultures, and single cells, and we recommend the use of RT replicates when working with blood. We show how an optimal sampling plan can be calculated for a limited budget. UR - http://www.clinchem.org/cgi/content/abstract/clinchem.2009.126201v1 ER - TY - CONF T1 - A Distributed Architecture for Data Mining and Integration T2 - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing Y1 - 2009 A1 - Atkinson, Malcolm P. A1 - van Hemert, Jano A1 - Liangxiu Han A1 - Ally Hume A1 - Chee Sun Liew AB - This paper presents the rationale for a new architecture to support a significant increase in the scale of data integration and data mining. It proposes the composition into one framework of (1) data mining and (2) data access and integration. We name the combined activity “DMI”. It supports enactment of DMI processes across heterogeneous and distributed data resources and data mining services. It posits that a useful division can be made between the facilities established to support the definition of DMI processes and the computational infrastructure provided to enact DMI processes. Communication between those two divisions is restricted to requests submitted to gateway services in a canonical DMI language. Larger-scale processes are enabled by incremental refinement of DMI-process definitions often by recomposition of lower-level definitions. Autonomous types and descriptions which will support detection of inconsistencies and semi-automatic insertion of adaptations. These architectural ideas are being evaluated in a feasibility study that involves an application scenario and representatives of the community. 
JF - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing PB - ACM ER - TY - RPRT T1 - An e-Infrastructure for Collaborative Research in Human Embryo Development Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A. A1 - Atkinson, Malcolm P. AB - Within the context of the EU Design Study Developmental Gene Expression Map, we identify a set of challenges when facilitating collaborative research on early human embryo development. These challenges bring forth requirements, for which we have identified solutions and technology. We summarise our solutions and demonstrate how they integrate to form an e-infrastructure to support collaborative research in this area of developmental biology. UR - http://arxiv.org/pdf/0901.2310v1 ER - TY - CONF T1 - An E-infrastructure to Support Collaborative Embryo Research T2 - Cluster Computing and the Grid Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A. A1 - Atkinson, Malcolm P. JF - Cluster Computing and the Grid PB - IEEE Computer Society SN - 978-0-7695-3622-4 ER - TY - CHAP T1 - Exploiting Fruitful Regions in Dynamic Routing using Evolutionary Computation T2 - Studies in Computational Intelligence Y1 - 2009 A1 - van Hemert, J. I. A1 - la Poutré, J. A. ED - Pereira Babtista, F. ED - Tavares, J. JF - Studies in Computational Intelligence PB - Springer VL - 161 SN - 978-3-540-85151-6 N1 - Awaiting publication (due October 2008) ER - TY - JOUR T1 - Giving Computational Science a Friendly Face JF - Zero-In Y1 - 2009 A1 - van Hemert, J. I. A1 - Koetsier, J. AB - Today, most researchers from any discipline will successfully use web-based e-commerce systems to book flights to attend their conferences. But when these same researchers are confronted with compute-intensive problems, they cannot expect elaborate web-based systems to enable their domain-specific tasks. 
VL - 1 UR - http://www.beliefproject.org/zero-in/zero-in-third-edition/zero-in-issue-3 IS - 3 ER - TY - JOUR T1 - Guest Editorial: Research Data: It’s What You Do With Them JF - International Journal of Digital Curation Y1 - 2009 A1 - Malcolm Atkinson AB - These days it may be stating the obvious that the number of data resources, their complexity and diversity is growing rapidly due to the compound effects of increasing speed and resolution of digital instruments, due to pervasive data-collection automation and due to the growing power of computers. Just because we are becoming used to the accelerating growth of data resources, it does not mean we can be complacent; they represent an enormous wealth of opportunity to extract information, to make discoveries and to inform policy. But all too often it still takes a heroic effort to discover and exploit those opportunities, hence the research and progress, charted by the Fourth International Digital Curation Conference and recorded in this issue of the International Journal of Digital Curation, are an invaluable step on a long and demanding journey. VL - 4 UR - http://www.ijdc.net/index.php/ijdc/article/view/96 IS - 1 ER - TY - Generic T1 - A Methodology for Mobile Network Security Risk Management T2 - Sixth International Conference on Information Technology: New Generations (ITNG '09) Y1 - 2009 A1 - Mahdi Seify A1 - Shahriar Bijani JF - Sixth International Conference on Information Technology: New Generations (ITNG '09) PB - IEEE Computer Society ER - TY - CONF T1 - A model of social collaboration in Molecular Biology knowledge bases T2 - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) Y1 - 2009 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - Manual annotation of biological data cannot keep up with data production. Open annotation models using wikis have been proposed to address this problem. 
In this empirical study we analyse 36 years of knowledge collection by 738 authors in two Molecular Biology wikis (EcoliWiki and WikiPathways) and two knowledge bases (OMIM and Reactome). We first investigate authorship metrics (authors per entry and edits per author) which are power-law distributed in Wikipedia and we find they are heavy-tailed in these four systems too. We also find surprising similarities between the open (editing open to everyone) and the closed systems (expert curators only). Secondly, to discriminate between driving forces in the measured distributions, we simulate the curation process and find that knowledge overlap among authors can drive the number of authors per entry, while the time the users spend on the knowledge base can drive the number of contributions per author. JF - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) PB - European Social Simulation Association ER - TY - JOUR T1 - An Open Grid Services Architecture Primer JF - Computer Y1 - 2009 A1 - Grimshaw, Andrew A1 - Morgan, Mark A1 - Merrill, Duane A1 - Kishimoto, Hiro A1 - Savva, Andreas A1 - Snelling, David A1 - Smith, Chris A1 - Dave Berry PB - IEEE Computer Society Press CY - Los Alamitos, CA, USA VL - 42 ER - TY - JOUR T1 - The performance model of dynamic Virtual Organization (VO) formations within grid computing context JF - Chaos, Solitons & Fractals Y1 - 2009 A1 - Liangxiu Han KW - complex network KW - graph theory KW - grid computing KW - virtual organization formation PB - Elsevier Science VL - 40 IS - 4 N1 - In press ER - TY - CONF T1 - Portals for Life Sciences—a Brief Introduction T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - Kohlbacher, O. A1 - van Hemert, J. I. AB - The topic “Portals for Life Sciences” includes various research fields, on the one hand many different topics out of life sciences, e.g. 
mass spectrometry, on the other hand portal technologies and different aspects of computer science, such as usability of user interfaces and security of systems. The main aspect about portals is to simplify the user’s interaction with computational resources which are concerted to a supported application domain. JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings UR - http://ceur-ws.org/Vol-513/paper01.pdf ER - TY - JOUR T1 - Preface. Crossing boundaries: computational science, e-Science and global e-Infrastructure JF - Philosophical Transactions of the Royal Society Series A Y1 - 2009 A1 - Coveney, P. V. A1 - Atkinson, M. P. PB - Royal Society Publishing VL - 367 ER - TY - Generic T1 - Proceedings of the 1st International Workshop on Portals for Life Sciences T2 - IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - van Hemert, Jano I. JF - IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings CY - e-Science Institute, Edinburgh, UK UR - http://ceur-ws.org/Vol-513 ER - TY - CONF T1 - Rapid chemistry portals through engaging researchers T2 - Fifth IEEE International Conference on e-Science Y1 - 2009 A1 - Koetsier, J. A1 - Turner, A. A1 - Richardson, P. A1 - van Hemert, J. I. ED - Trefethen, A ED - De Roure, D AB - In this study, we apply a methodology for rapid development of portlets for scientific computing to the domain of computational chemistry. We report results in terms of the portals delivered, the changes made to our methodology and the experience gained in terms of interaction with domain-specialists. 
Our major contributions are: several web portals for teaching and research in computational chemistry; a successful transition to having our development tool used by the domain specialist as opposed by us, the developers; and an updated version of our methodology and technology for rapid development of portlets for computational science, which is free for anyone to pick up and use. JF - Fifth IEEE International Conference on e-Science CY - Oxford, UK ER - TY - CONF T1 - Rapid development of computational science portals T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Koetsier, J. A1 - van Hemert, J. I. ED - Gesing, S. ED - van Hemert, J. I. KW - portal JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings PB - e-Science Institute CY - Edinburgh UR - http://ceur-ws.org/Vol-513/paper05.pdf ER - TY - JOUR T1 - Simultaneous alignment of short reads against multiple genomes JF - Genome Biol Y1 - 2009 A1 - Schneeberger, Korbinian A1 - Hagmann, Jörg A1 - Ossowski, Stephan A1 - Warthmann, Norman A1 - Gesing, Sandra A1 - Kohlbacher, Oliver A1 - Weigel, Detlef VL - 10 UR - http://www.biomedsearch.com/nih/Simultaneous-alignment-short-reads-against/19761611.html ER - TY - JOUR T1 - Strategies and Policies to Support and Advance Education in e-Science JF - Computing Now Y1 - 2009 A1 - Malcolm Atkinson A1 - Elizabeth Vander Meer A1 - Fergusson, David A1 - Clive Davenhall A1 - Hamza Mehammed AB - In previous installments of this series, we’ve presented tools and resources that university undergraduate and graduate environments must provide to allow for the continued development and success of e-Science education. 
We’ve introduced related summer (http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.20) and winter (http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.26) schools and important issues such as t-Infrastructure provision (http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.28), intellectual property rights in the context of digital repositories (http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.34), and curriculum content (http://www2.computer.org/portal/web/computingnow/0309/education). We conclude now with an overview of areas in which we must focus effort and strategies and policies that could provide much-needed support in these areas. We direct these strategy and policy recommendations toward key stakeholders in e-Science education, such as ministries of education, councils in professional societies, and professional teachers and educational strategists. Ministries of education can influence funding councils, thus financially supporting our proposals. Professional societies can assist in curricula revision, and teachers and strategists shape curricula in institutions, which makes them valuable in improving and developing education in e-Science and (perhaps) e-Science in education. We envision incremental change in curricula, so our proposals aim to evolve existing courses, rather than suggesting drastic upheavals and isolated additions. The long-term goal is to ensure that every graduate obtains the appropriate level of e-Science competency for their field, but we don’t presume to define that level for any given discipline or institution. We set out issues and ideas but don’t offer rigid prescriptions, which would take control away from important stakeholders. UR - http://www.computer.org/portal/web/computingnow/education ER - TY - JOUR T1 - A Strategy for Research and Innovation in the Century of Information JF - Prometheus Y1 - 2009 A1 - e-Science Directors’ Forum Strategy Working Group A1 - Atkinson, M. A1 - Britton, D. A1 - Coveney, P. 
A1 - De Roure, D A1 - Garnett, N. A1 - Geddes, N. A1 - Gurney, R. A1 - Haines, K. A1 - Hughes, L. A1 - Ingram, D. A1 - Jeffreys, P. A1 - Lyon, L. A1 - Osborne, I. A1 - Perrott, P. A1 - Procter, R. A1 - Rusbridge, C. AB - More data will be produced in the next five years than in the entire history of human kind, a digital deluge that marks the beginning of the Century of Information. Through a year‐long consultation with UK researchers, a coherent strategy has been developed, which will nurture Century‐of‐Information Research (CIR); it crystallises the ideas developed by the e‐Science Directors’ Forum Strategy Working Group. This paper is an abridged version of their latest report which can be found at: http://wikis.nesc.ac.uk/escienvoy/Century_of_Information_Research_Strategy which also records the consultation process and the affiliations of the authors. This document is derived from a paper presented at the Oxford e‐Research Conference 2008 and takes into account suggestions made in the ensuing panel discussion. The goals of the CIR Strategy are to facilitate the growth of UK research and innovation that is data and computationally intensive and to develop a new culture of ‘digital‐systems judgement’ that will equip research communities, businesses, government and society as a whole, with the skills essential to compete and prosper in the Century of Information. The CIR Strategy identifies a national requirement for a balanced programme of coordination, research, infrastructure, translational investment and education to empower UK researchers, industry, government and society. The Strategy is designed to deliver an environment which meets the needs of UK researchers so that they can respond agilely to challenges, can create knowledge and skills, and can lead new kinds of research. It is a call to action for those engaged in research, those providing data and computational facilities, those governing research and those shaping education policies. 
The ultimate aim is to help researchers strengthen the international competitiveness of the UK research base and increase its contribution to the economy. The objectives of the Strategy are to better enable UK researchers across all disciplines to contribute world‐leading fundamental research; to accelerate the translation of research into practice; and to develop improved capabilities, facilities and context for research and innovation. It envisages a culture that is better able to grasp the opportunities provided by the growing wealth of digital information. Computing has, of course, already become a fundamental tool in all research disciplines. The UK e‐Science programme (2001–06)—since emulated internationally—pioneered the invention and use of new research methods, and a new wave of innovations in digital‐information technologies which have enabled them. The Strategy argues that the UK must now harness and leverage its own, plus the now global, investment in digital‐information technology in order to spread the benefits as widely as possible in research, education, industry and government. Implementing the Strategy would deliver the computational infrastructure and its benefits as envisaged in the Science & Innovation Investment Framework 2004–2014 (July 2004), and in the reports developing those proposals. To achieve this, the Strategy proposes the following actions: 1. support the continuous innovation of digital‐information research methods; 2. provide easily used, pervasive and sustained e‐Infrastructure for all research; 3. enlarge the productive research community which exploits the new methods efficiently; 4. generate capacity, propagate knowledge and develop skills via new curricula; and 5. develop coordination mechanisms to improve the opportunities for interdisciplinary research and to make digital‐infrastructure provision more cost effective. To gain the best value for money strategic coordination is required across a broad spectrum of stakeholders. 
A coherent strategy is essential in order to establish and sustain the UK as an international leader of well‐curated national data assets and computational infrastructure, which is expertly used to shape policy, support decisions, empower researchers and to roll out the results to the wider benefit of society. The value of data as a foundation for wellbeing and a sustainable society must be appreciated; national resources must be more wisely directed to the collection, curation, discovery, widening access, analysis and exploitation of these data. Every researcher must be able to draw on skills, tools and computational resources to develop insights, test hypotheses and translate inventions into productive use, or to extract knowledge in support of governmental decision making. This foundation plus the skills developed will launch significant advances in research, in business, in professional practice and in government with many consequent benefits for UK citizens. The Strategy presented here addresses these complex and interlocking requirements. VL - 27 ER - TY - JOUR T1 - Towards a Virtual Fly Brain JF - Philosophical Transactions A Y1 - 2009 A1 - Armstrong, J. D. A1 - van Hemert, J. I. KW - e-Science AB - Models of the brain that simulate sensory input, behavioural output and information processing in a biologically plausible manner pose significant challenges to both Computer Science and Biology. Here we investigated strategies that could be used to create a model of the insect brain, specifically that of Drosophila melanogaster which is very widely used in laboratory research. The scale of the problem is an order of magnitude above the most complex of the current simulation projects and it is further constrained by the relative sparsity of available electrophysiological recordings from the fly nervous system. 
However, fly brain research at the anatomical and behavioural level offers some interesting opportunities that could be exploited to create a functional simulation. We propose to exploit these strengths of Drosophila CNS research to focus on a functional model that maps biologically plausible network architecture onto phenotypic data from neuronal inhibition and stimulation studies, leaving aside biophysical modelling of individual neuronal activity for future models until more data is available. VL - 367 UR - http://rsta.royalsocietypublishing.org/content/367/1896/2387.abstract ER - TY - CONF T1 - Using architectural simulation models to aid the design of data intensive application T2 - The Third International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2009) Y1 - 2009 A1 - Javier Fernández A1 - Liangxiu Han A1 - Alberto Nuñez A1 - Jesus Carretero A1 - van Hemert, Jano JF - The Third International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2009) PB - IEEE Computer Society CY - Sliema, Malta ER - TY - CONF T1 - Using Simulation for Decision Support: Lessons Learned from FireGrid T2 - Proceedings of the 6th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2009) Y1 - 2009 A1 - Gerhard Wickler A1 - George Beckett A1 - Liangxiu Han A1 - Sung Han Koo A1 - Stephen Potter A1 - Gavin Pringle A1 - Austin Tate ED - J. Landgren, U. Nulden ED - B. Van de Walle JF - Proceedings of the 6th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2009) CY - Gothenburg, Sweden ER - TY - JOUR T1 - Using the DCC Lifecycle Model to Curate a Gene Expression Database: A Case Study JF - International Journal of Digital Curation Y1 - 2009 A1 - O’Donoghue, J. A1 - van Hemert, J. I. 
AB - Developmental Gene Expression Map (DGEMap) is an EU-funded Design Study, which will accelerate an integrated European approach to gene expression in early human development. As part of this design study, we have had to address the challenges and issues raised by the long-term curation of such a resource. As this project is primarily one of data creators, learning about curation, we have been looking at some of the models and tools that are already available in the digital curation field in order to inform our thinking on how we should proceed with curating DGEMap. This has led us to uncover a wide range of resources for data creators and curators alike. Here we will discuss the future curation of DGEMap as a case study. We believe our experience could be instructive to other projects looking to improve the curation and management of their data. PB - UKOLN VL - 4 UR - http://www.ijdc.net/index.php/ijdc/article/view/134 IS - 3 ER - TY - CONF T1 - An Architecture for an Integrated Fire Emergency Response System for the Built Environment T2 - 9th Symposium of the International Association for Fire Safety Science (IAFSS) Y1 - 2008 A1 - Rochan Upadhyay A1 - Gavin Pringle A1 - George Beckett A1 - Stephen Potter A1 - Liangxiu Han A1 - Stephen Welch A1 - Asif Usmani A1 - Jose Torero KW - emergency response system KW - FireGrid KW - system architecture KW - technology integration AB - FireGrid is a modern concept that aims to leverage a number of modern technologies to aid fire emergency response. In this paper we provide a brief introduction to the FireGrid project. A number of different technologies such as wireless sensor networks, grid-enabled High Performance Computing (HPC) implementation of fire models, and artificial intelligence tools need to be integrated to build up a modern fire emergency response system. We propose a system architecture that provides the framework for integration of the various technologies. 
We describe the components of the generic FireGrid system architecture in detail. Finally we present a small-scale demonstration experiment which has been completed to highlight the concept and application of the FireGrid system to an actual fire. Although our proposed system architecture provides a versatile framework for integration, a number of new and interesting research problems need to be solved before actual deployment of the system. We outline some of the challenges involved which require significant interdisciplinary collaborations. JF - 9th Symposium of the International Association for Fire Safety Science (IAFSS) PB - IAFSS CY - Karlsruhe, GERMANY ER - TY - CHAP T1 - Contraction-Based Heuristics to Improve the Efficiency of Algorithms Solving the Graph Colouring Problem T2 - Studies in Computational Intelligence Y1 - 2008 A1 - Juhos, I. A1 - van Hemert, J. I. ED - Cotta, C. ED - van Hemert, J. I. KW - constraint satisfaction KW - evolutionary computation KW - graph colouring JF - Studies in Computational Intelligence PB - Springer ER - TY - CONF T1 - Data Locality Aware Strategy for Two-Phase Collective I/O T2 - VECPAR Y1 - 2008 A1 - Rosa Filgueira A1 - David E. Singh A1 - Juan Carlos Pichel A1 - Florin Isaila A1 - Jesús Carretero JF - VECPAR ER - TY - JOUR T1 - Distributed Computing Education, Part 1: A Special Case? JF - IEEE Distributed Systems Online Y1 - 2008 A1 - Fergusson, D. A1 - Hopkins, R. A1 - Romano, D. A1 - Vander Meer, E. A1 - Atkinson, M. VL - 9 UR - http://dsonline.computer.org/portal/site/dsonline/menuitem.9ed3d9924aeb0dcd82ccc6716bbe36ec/index.jsp?&pName=dso_level1&path=dsonline/2008/06&file=o6002edu.xml&xsl=article.xsl&;jsessionid=LZ5zjySvc2xPnVv4qTYJXhlvwSnRGGj7S7WvPtrPyv23rJGQdjJr!982319602 IS - 6 ER - TY - JOUR T1 - Distributed Computing Education, Part 2: International Summer Schools JF - IEEE Distributed Systems Online Y1 - 2008 A1 - Fergusson, D. A1 - Hopkins, R. A1 - Romano, D. A1 - Vander Meer, E. A1 - Atkinson, M. 
VL - 9 UR - http://dsonline.computer.org/portal/site/dsonline/menuitem.9ed3d9924aeb0dcd82ccc6716bbe36ec/index.jsp?&pName=dso_level1&path=dsonline/2008/07&file=o7002edu.xml&xsl=article.xsl& IS - 7 ER - TY - JOUR T1 - Distributed Computing Education, Part 3: The Winter School Online Experience JF - Distributed Systems Online Y1 - 2008 A1 - Low, B. A1 - Cassidy, K. A1 - Fergusson, D. A1 - Atkinson, M. A1 - Vander Meer, E. A1 - McGeever, M. AB - The International Summer Schools in Grid Computing (ISSGC) have provided numerous international students with the opportunity to learn grid systems, as detailed in part 2 of this series (http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.20). The International Winter School on Grid Computing 2008 (IWSGC 08) followed the successful summer schools, opening up the ISSGC experience to a wider range of students because of its online format. The previous summer schools made it clear that many students found the registration and travel costs and the time requirements prohibitive. The EU FP6 ICEAGE project held the first winter school from 6 February to 12 March 2008. The winter school repurposed summer school materials and added resources such as the ICEAGE digital library and summer-school-tested t-Infrastructures such as GILDA (Grid INFN Laboratory for Dissemination Activities). The winter schools shared the goals of the summer school, which emphasized disseminating grid knowledge. The students act as multipliers, spreading the skills and knowledge they acquired at the winter school to their colleagues to build strong and enthusiastic local grid communities. PB - IEEE Computer Society VL - 9 UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4659260 IS - 9 ER - TY - JOUR T1 - Distributed Computing Education, Part 4: Training Infrastructure JF - Distributed Systems Online Y1 - 2008 A1 - Fergusson, D. A1 - Barbera, R. A1 - Giorgio, E. A1 - Fargetta, M. A1 - Sipos, G. A1 - Romano, D. A1 - Atkinson, M. A1 - Vander Meer, E. 
AB - In the first article of this series (see http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.16), we identified the need for teaching environments that provide infrastructure to support education and training in distributed computing. Training infrastructure, or t-infrastructure, is analogous to the teaching laboratory in biology and is a vital tool for educators and students. In practice, t-infrastructure includes the computing equipment, digital communications, software, data, and support staff necessary to teach a course. The International Summer Schools in Grid Computing (ISSGC) series and the first International Winter School on Grid Computing (IWSGC 08) used the Grid INFN Laboratory of Dissemination Activities (GILDA) infrastructure so students could gain hands-on experience with middleware. Here, we describe GILDA, related summer and winter school experiences, multimiddleware integration, t-infrastructure, and academic courses, concluding with an analysis and recommendations. PB - IEEE Computer Society VL - 9 UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4752926 IS - 10 ER - TY - JOUR T1 - Distributed Computing Education, Part 5: Coming to Terms with Intellectual Property Rights JF - Distributed Systems Online Y1 - 2008 A1 - Boon Low A1 - Kathryn Cassidy A1 - Fergusson, David A1 - Malcolm Atkinson A1 - Elizabeth Vander Meer A1 - Mags McGeever AB - In part 1 of this series on distributed computing education, we introduced a list of components important for teaching environments. We outlined the first three components, which included development of materials for education, education for educators and teaching infrastructures, identifying current practice, challenges, and opportunities for provision. The final component, a supportive policy framework that encourages cooperation and sharing, includes the need to manage intellectual property rights (IPR). 
PB - IEEE Computer Society VL - 9 UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4755177 IS - 12 ER - TY - RPRT T1 - Education and Training Task Force Report Y1 - 2008 A1 - Atkinson, M. A1 - Vander Meer, E. A1 - Fergusson, D. A1 - Artacho, M. AB - The development of e-Infrastructure, of which grid computing is a fundamental element, will have major economic and social benefits. Online and financial businesses already successfully use grid computing technologies, for instance. There are already demonstrations showing the benefits to engineering, medicine and the creative industries as well. New research methods and technologies generate large data sets that need to be shared in order to ensure continued social and scientific research and innovation. e-Infrastructure provides an environment for coping with these large data sets and for sharing data across regions. An investment in educating people in this technology, then, is an investment that will strengthen our economies and societies. In order to deliver e-Infrastructure education and training successfully in the EU, we must develop a policy framework that will ensure shared responsibility and equivalent training in the field. This document focuses primarily on the current state of grid and e-Science education, introducing key challenges and the opportunities available to educational planners that serve as a starting point for further work. It then proposes strategies and policies to provide a supportive framework for e-Infrastructure education and training. The ETTF Report concludes with policy recommendations to be taken forward by the e-IRG. 
These recommendations address issues such as the level of Member State investment in e-Infrastructure education, the harmonisation of education in distributed-computation thinking and in the use of e-Infrastructure and the development of standards for student and teacher identification, for the sharing of t-Infrastructure (and training material) and for accreditation. JF - e-Infrastructure Reflection Group UR - http://www.e-irg.eu/index.php?option=com_content&task=view&id=38&Itemid=37 ER - TY - CONF T1 - Eliminating the Middle Man: Peer-to-Peer Dataflow T2 - HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing Y1 - 2008 A1 - Barker, Adam A1 - Weissman, Jon B. A1 - van Hemert, Jano KW - grid computing KW - workflow JF - HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing PB - ACM ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2008 A1 - Di Chio, Cecilia A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Di Chio, Cecilia ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. 
ER - TY - Generic T1 - Evolutionary Computation in Combinatorial Optimization, 8th European Conference T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - van Hemert, Jano A1 - Cotta, Carlos ED - van Hemert, Jano ED - Cotta, Carlos KW - evolutionary computation AB - Metaheuristics have been shown to be effective for difficult combinatorial optimization problems appearing in various industrial, economical, and scientific domains. Prominent examples of metaheuristics are evolutionary algorithms, tabu search, simulated annealing, scatter search, memetic algorithms, variable neighborhood search, iterated local search, greedy randomized adaptive search procedures, ant colony optimization and estimation of distribution algorithms. Problems solved successfully include scheduling, timetabling, network design, transportation and distribution, vehicle routing, the travelling salesman problem, packing and cutting, satisfiability and general mixed integer programming. EvoCOP began in 2001 and has been held annually since then. It is the first event specifically dedicated to the application of evolutionary computation and related methods to combinatorial optimization problems. Originally held as a workshop, EvoCOP became a conference in 2004. The events gave researchers an excellent opportunity to present their latest research and to discuss current developments and applications. Following the general trend of hybrid metaheuristics and diminishing boundaries between the different classes of metaheuristics, EvoCOP has broadened its scope over the last years and invited submissions on any kind of metaheuristic for combinatorial optimization. JF - Lecture Notes in Computer Science PB - Springer VL - LNCS 4972 ER - TY - CONF T1 - Exploiting data compression in collective I/O techniques. T2 - Cluster Computing 2008. Y1 - 2008 A1 - Rosa Filgueira A1 - David E. Singh A1 - Juan Carlos Pichel A1 - Jesús Carretero JF - Cluster Computing 2008. CY - Tsukuba, Japan 
SN - 978-1-4244-2639-3 ER - TY - CONF T1 - Fostering e-Infrastructures: from user-designer relations to community engagement T2 - Symposium on Project Management in e-Science Y1 - 2008 A1 - Voss, A. A1 - Asgari-Targhi, M. A1 - Halfpenny, P. A1 - Procter, R. A1 - Anderson, S. A1 - Dunn, S. A1 - Fragkouli, E. A1 - Hughes, L. A1 - Atkinson, M. A1 - Fergusson, D. A1 - Mineter, M. A1 - Rodden, T. AB - In this paper we discuss how e-Science can draw on the findings, approaches and methods developed in other disciplines to foster e-Infrastructures for research. We also discuss the issue of making user involvement in IT development scale across an open community of researchers and from single systems to distributed e-Infrastructures supporting collaborative research. JF - Symposium on Project Management in e-Science CY - Oxford ER - TY - CONF T1 - Graph Colouring Heuristics Guided by Higher Order Graph Properties T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - Juhos, István A1 - van Hemert, Jano ED - van Hemert, Jano ED - Cotta, Carlos KW - evolutionary computation KW - graph colouring AB - Graph vertex colouring can be defined in such a way where colour assignments are substituted by vertex contractions. We present various hyper-graph representations for the graph colouring problem all based on the approach where vertices are merged into groups. In this paper, we show this provides a uniform and compact way to define algorithms, both of a complete or a heuristic nature. Moreover, the representation provides information useful to guide algorithms during their search. In this paper we focus on the quality of solutions obtained by graph colouring heuristics that make use of higher order properties derived during the search. An evolutionary algorithm is used to search permutations of possible merge orderings. 
JF - Lecture Notes in Computer Science PB - Springer VL - 4972 ER - TY - JOUR T1 - A Grid infrastructure for parallel and interactive applications JF - Computing and Informatics Y1 - 2008 A1 - Gomes, J. A1 - Borges, B. A1 - Montecelo, M. A1 - David, M. A1 - Silva, B. A1 - Dias, N. A1 - Martins, JP A1 - Fernandez, C. A1 - Garcia-Tarres, L. A1 - Veiga, C. A1 - Cordero, D. A1 - Lopez, J. A1 - J Marco A1 - Campos, I. A1 - Rodríguez, David A1 - Marco, R. A1 - Lopez, A. A1 - Orviz, P. A1 - Hammad, A. VL - 27 IS - 2 ER - TY - Generic T1 - HIDGM: A Hybrid Intrusion Detection System for Mobile networks T2 - International Conference on Computer and Electrical Engineering (ICEEE) Y1 - 2008 A1 - Shahriar Bijani A1 - Maryam Kazemitabar JF - International Conference on Computer and Electrical Engineering (ICEEE) PB - IEEE Computer Society ER - TY - JOUR T1 - The interactive European Grid: Project objectives and achievements JF - Computing and Informatics Y1 - 2008 A1 - J Marco A1 - Campos, I. A1 - Coterillo, I. A1 - Diaz, I. A1 - Lopez, A. A1 - Marco, R. A1 - Martinez-Rivero, C. A1 - Orviz, P. A1 - Rodríguez, David A1 - Gomes, J. A1 - Borges, G. A1 - Montecelo, M. A1 - David, M. A1 - Silva, B. A1 - Dias, N. A1 - Martins, JP A1 - Fernandez, C. A1 - Garcia-Tarres, L. VL - 27 IS - 2 ER - TY - CONF T1 - Matching Spatial Regions with Combinations of Interacting Gene Expression Patterns T2 - Communications in Computer and Information Science Y1 - 2008 A1 - van Hemert, J. I. A1 - Baldock, R. A. ED - M. Elloumi ED - et al KW - biomedical KW - data mining KW - DGEMap KW - e-Science AB - The Edinburgh Mouse Atlas aims to capture in-situ gene expression patterns in a common spatial framework. In this study, we construct a grammar to define spatial regions by combinations of these patterns. Combinations are formed by applying operators to curated gene expression patterns from the atlas, thereby resembling gene interactions in a spatial context. 
The space of combinations is searched using an evolutionary algorithm with the objective of finding the best match to a given target pattern. We evaluate the method by testing its robustness and the statistical significance of the results it finds. JF - Communications in Computer and Information Science PB - Springer Verlag ER - TY - JOUR T1 - Mobile Multimodality: A Theoretical Approach to Facilitate Virtual Device Environments JF - Mobile Networks and Applications Y1 - 2008 A1 - Srihathai Prammanee A1 - Klaus Moessner VL - 13 UR - http://dx.doi.org/10.1007/s11036-008-0091-z IS - 6 ER - TY - CONF T1 - A novel visual discriminator on network traffic pattern T2 - The Second International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2008) Y1 - 2008 A1 - Liangxiu Han A1 - J. Hemert AB - The wavelet transform has been shown to be a powerful tool for characterising network traffic. However, the resulting decomposition of a wavelet transform typically forms a high-dimension space. This is obviously problematic on compact representations, visualizations, and modelling approaches that are based on these high-dimensional data. In this study, we show how data projection techniques can represent the high-dimensional wavelet decomposition in a low dimensional space to facilitate visual analysis. A low-dimensional representation can significantly reduce the model complexity. Hence, features in the data can be presented with a small number of parameters. We demonstrate these projections in the context of network traffic pattern analysis. The experimental results show that the proposed method can effectively discriminate between different application flows, such as FTP and P2P. 
JF - The Second International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2008) PB - IEEE Computer Society CY - Valencia, Spain ER - TY - CONF T1 - OGSA-DAI: Middleware for Data Integration: Selected Applications T2 - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience Y1 - 2008 A1 - Grant, Alistair A1 - Antonioletti, Mario A1 - Hume, Alastair C. A1 - Krause, Amy A1 - Dobrzelecki, Bartosz A1 - Jackson, Michael J. A1 - Parsons, Mark A1 - Atkinson, Malcolm P. A1 - Theocharopoulos, Elias JF - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience PB - IEEE Computer Society CY - Washington, DC, USA SN - 978-0-7695-3535-7 ER - TY - CONF T1 - Orchestrating Data-Centric Workflows T2 - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) Y1 - 2008 A1 - Barker, Adam A1 - Weissman, Jon B. A1 - van Hemert, Jano KW - grid computing KW - workflow JF - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) PB - IEEE Computer Society ER - TY - BOOK T1 - Recent Advances in Evolutionary Computation for Combinatorial Optimization T2 - Studies in Computational Intelligence Y1 - 2008 A1 - Cotta, Carlos A1 - van Hemert, Jano AB - Combinatorial optimisation is a ubiquitous discipline whose usefulness spans vast applications domains. The intrinsic complexity of most combinatorial optimisation problems makes classical methods unaffordable in many cases. To acquire practical solutions to these problems requires the use of metaheuristic approaches that trade completeness for pragmatic effectiveness. Such approaches are able to provide optimal or quasi-optimal solutions to a plethora of difficult combinatorial optimisation problems. The application of metaheuristics to combinatorial optimisation is an active field in which new theoretical developments, new algorithmic models, and new application areas are continuously emerging. 
This volume presents recent advances in the area of metaheuristic combinatorial optimisation, with a special focus on evolutionary computation methods. Moreover, it addresses local search methods and hybrid approaches. In this sense, the book includes cutting-edge theoretical, methodological, algorithmic and applied developments in the field, from respected experts and with a sound perspective. JF - Studies in Computational Intelligence PB - Springer VL - 153 SN - 978-3-540-70806-3 UR - http://www.springer.com/engineering/book/978-3-540-70806-3 ER - TY - CONF T1 - Scientific Workflow: A Survey and Research Directions T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - Barker, Adam A1 - van Hemert, Jano KW - e-Science KW - workflow AB - Workflow technologies are emerging as the dominant approach to coordinate groups of distributed services. However with a space filled with competing specifications, standards and frameworks from multiple domains, choosing the right tool for the job is not always a straightforward task. Researchers are often unaware of the range of technology that already exists and focus on implementing yet another proprietary workflow system. As an antidote to this common problem, this paper presents a concise survey of existing workflow technology from the business and scientific domain and makes a number of key suggestions towards the future development of scientific workflow systems. 
JF - Lecture Notes in Computer Science PB - Springer VL - 4967 UR - http://dx.doi.org/10.1007/978-3-540-68111-3_78 ER - TY - JOUR T1 - The Self-adaptation to dynamic failures for efficient Virtual Organization formation in Grid computing context JF - Chaos, Solitons and Fractals Y1 - 2008 A1 - Liangxiu Han KW - complex network system KW - failure recovery KW - graph theory KW - grid computing KW - virtual organization formation AB - Grid computing aims to enable “resource sharing and coordinated problem solving in dynamic, multi-institutional Virtual Organizations (VOs)”. However, due to the nature of heterogeneous and dynamic resources, dynamic failures in the distributed grid environment usually occur more than in traditional computation platforms, which cause failed VO formations. In this paper, we develop a novel self-adaptive mechanism to dynamic failures during VO formations. Such a self-adaptive scheme allows an individual and member of VOs to automatically find other available or replaceable one once a failure happens and therefore makes systems automatically recover from dynamic failures. We define dynamic failure situations of a system by using two standard indicators: Mean Time between Failures (MTBF) and Mean Time to Recover (MTTR). We model both MTBF and MTTR as Poisson distributions. We investigate and analyze the efficiency of the proposed self-adaptation mechanism to dynamic failures by comparing the success probability of VO formations before and after adopting it in three different cases: 1) different failure situations; 2) different organizational structures and scales; 3) different task complexities. The experimental results show that the proposed scheme can automatically adapt to dynamic failures and effectively improve the dynamic VO formation performance in the event of node failures, which provide a valuable addition to the field. 
PB - Elsevier Science ER - TY - JOUR T1 - Semantic-Supported and Agent-Based Decentralized Grid Resource Discovery JF - Future Generation Computer Systems Y1 - 2008 A1 - Liangxiu Han A1 - Dave Berry KW - grid resource discovery, decentralization, agent, semantic similarity, ontology AB - One of the open issues in grid computing is efficient resource discovery. In this paper, we propose a novel semantic-supported and agent-based decentralized grid resource discovery mechanism. Without the overhead of negotiation, the algorithm allows individual resource agents to semantically interact with neighbour agents based on local knowledge and dynamically form a resource service chain to complete a task. The algorithm ensures the resource agents’ ability to cooperate and coordinate on neighbour knowledge requisition for flexible problem solving. The developed algorithm is evaluated by investigating the relationship between the success probability of resource discovery and semantic similarity under different factors. The experiments show the algorithm could flexibly and dynamically discover resources and therefore provide a valuable addition to the field. PB - ScienceDirect VL - 24 IS - 8 ER - TY - CONF T1 - Widening Uptake of e-Infrastructure Services T2 - 4th International Conference on e-Social Science Y1 - 2008 A1 - Voss, A. A1 - Asgari-Targhi, M. A1 - Procter, R. A1 - Halfpenny, P. A1 - Dunn, S. A1 - Fragkouli, E. A1 - Anderson, S. A1 - Hughes, L. A1 - Mineter, M. A1 - Fergusson, D. A1 - Atkinson, M. AB - This paper presents findings from the e-Uptake project which aims to widen the uptake of e-Infrastructure Services for research. We focus specifically on the identification of barriers and enablers of uptake and the taxonomy developed to structure our findings. 
Based on these findings, we describe the development of a number of interventions such as training and outreach events, workshops and the deployment of a UK 'one-stop-shop' for support and event information as well as training material. Finally, we will describe how the project relates to other ongoing community engagement efforts in the UK and worldwide. Introduction Existing investments in e-Science and Grid computing technologies have helped to develop the capacity to build e-Infrastructures for research: distributed, networked, interoperable computing and data resources that are available to underpin a wide range of research activities in all research disciplines. In the UK, the Research Councils and the JISC are funding programmes to support the development of essential components of such infrastructures such as the National Grid Service (www.ngs.ac.uk) or the UK Access Management Federation (www.ukfederation.org.uk) as well as discipline-specific efforts to build consistent and accessible instantiations of e-Infrastructures, for example the e-Infrastructure for the Social Sciences (Daw et al. 2007). These investments are complemented by an active programme of community engagement (Voss et al. 2007). As part of the community engagement strand of its e-Infrastructure programme, JISC has funded the e-Uptake project, a collaboration between the ESRC National Centre for e-Social Science at the University of Manchester, the Arts & Humanities e-Science Support Centre at King's College London and the National e-Science Centre at the University of Edinburgh. In this paper we present the project's activities to date to widen the uptake of e-Infrastructure services by eliciting information about the barriers to and enablers of uptake, developing adequate interventions such as training and outreach events, running workshops and deploying a UK 'one-stop-shop' for support and event information as well as training material. 
JF - 4th International Conference on e-Social Science CY - Manchester UR - http://www.ncess.ac.uk/events/conference/programme/workshop1/?ref=/programme/thurs/1aVoss.htm ER - TY - CONF T1 - WikiSim: simulating knowledge collection and curation in structured wikis. T2 - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal Y1 - 2008 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - The aim of this work is to model quantitatively one of the main properties of wikis: how high quality knowledge can emerge from the individual work of independent volunteers. The approach chosen is to simulate knowledge collection and curation in wikis. The basic model represents the wiki as a set of true/false values, added and edited at each simulation round by software agents (users) following a fixed set of rules. The resulting WikiSim simulations already manage to reach distributions of edits and user contributions very close to those reported for Wikipedia. WikiSim can also span conditions not easily measurable in real-life wikis, such as the impact of various amounts of user mistakes. WikiSim could be extended to model wiki software features, such as discussion pages and watch lists, while monitoring the impact they have on user actions and consensus, and their effect on knowledge quality. The method could also be used to compare wikis with other curation scenarios based on centralised editing by experts. The future challenges for WikiSim will be to find appropriate ways to evaluate and validate the models and to keep them simple while still capturing relevant properties of wiki systems. JF - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal PB - ACM CY - New York, NY, USA ER - TY - CONF T1 - Accessing Data in Grids Using OGSA-DAI T2 - Knowledge and Data Management in Grids Y1 - 2007 A1 - Chue Hong, N. P. A1 - Antonioletti, M. A1 - Karasavvas, K. A. A1 - Atkinson, M. ED - Talia, D. 
ED - Bilas, A. ED - Dikaiakos, M. AB - The grid provides a vision in which resources, including storage and data, can be shared across organisational boundaries. The original emphasis of grid computing lay in the sharing of computational resources but technological and scientific advances have led to an ongoing data explosion in many fields. However, data is stored in many different storage systems and data formats, with different schema, access rights, metadata attributes, and ontologies all of which are obstacles to the access, integration and management of this information. In this chapter we examine some of the ways in which these differences can be addressed by grid technology to enable the meaningful sharing of data. In particular, we present an overview of the OGSA-DAI (Open Grid Service Architecture - Data Access and Integration) software, which provides a uniform, extensible framework for accessing structured and semi-structured data and provide some examples of its use in other projects. The open-source OGSA-DAI software is freely available from http://www.ogsadai.org.uk. JF - Knowledge and Data Management in Grids SN - 978-0-387-37830-5 UR - http://www.springer.com/computer/communication+networks/book/978-0-387-37830-5 ER - TY - CONF T1 - The Architectural Design of Multi Interface-Device Binding (MID-B) System T2 - The 3rd Workshop on Context Awareness for Proactive Systems Y1 - 2007 A1 - Srihathai Prammanee A1 - Klaus Moessner AB - The Multi Interface-Device Binding (MID-B) System enhances a multimodal interaction in a virtual-device environment. The system promises to overcome the drawbacks of classic multimodal interaction. In the classic sense, multimodality uses a strategy of simultaneously utilising several modalities generally offered on a single device. In contrast, the MID-B’s mechanism gets multimodality out of the solitary-device scenario. 
In MID-B, a ‘controller-device’ (UE_C) is aware of the availability of various devices in the vicinity, each of which may host one or more user interfaces (modalities). The capabilities of each of the co-located devices, together with the context in which the user acts, is exploited to dynamically customise the interface services available. This paper describes the MID-B architecture and its mechanisms to collect and exploit device and user context information to dynamically adapt the user interfaces. JF - The 3rd Workshop on Context Awareness for Proactive Systems CY - Guildford, UK UR - http://www.geocities.com/sprammanee/ ER - TY - CHAP T1 - COBrA and COBrA-CT: Ontology Engineering Tools T2 - Anatomy Ontologies for Bioinformatics: Principles and Practice Y1 - 2007 A1 - Stuart Aitken A1 - Yin Chen ED - Albert Burger ED - Duncan Davidson ED - Richard Baldock AB - COBrA is a Java-based ontology editor for bio-ontologies and anatomies that differs from other editors by supporting the linking of concepts between two ontologies, and providing sophisticated analysis and verification functions. In addition to the Gene Ontology and Open Biology Ontologies formats, COBrA can import and export ontologies in the Semantic Web formats RDF, RDFS and OWL. COBrA is being re-engineered as a Protégé plug-in, and complemented by an ontology server and a tool for the management of ontology versions and collaborative ontology development. We describe both the original COBrA tool and the current developments in this chapter. JF - Anatomy Ontologies for Bioinformatics: Principles and Practice PB - Springer SN - ISBN-10:1846288843 UR - http://www.amazon.ca/Anatomy-Ontologies-Bioinformatics-Principles-Practice/dp/1846288843 ER - TY - CONF T1 - Data Integration in eHealth: A Domain/Disease Specific Roadmap T2 - Studies in Health Technology and Informatics Y1 - 2007 A1 - Ure, J. A1 - Proctor, R. A1 - Martone, M. A1 - Porteous, D. A1 - Lloyd, S. A1 - Lawrie, S. A1 - Job, D. 
A1 - Baldock, R. A1 - Philp, A. A1 - Liewald, D. A1 - Rakebrand, F. A1 - Blaikie, A. A1 - McKay, C. A1 - Anderson, S. A1 - Ainsworth, J. A1 - van Hemert, J. A1 - Blanquer, I. A1 - Sinno ED - N. Jacq ED - Y. Legré ED - H. Muller ED - I. Blanquer ED - V. Breton ED - D. Hausser ED - V. Hernández ED - T. Solomonides ED - M. Hofman-Apitius KW - e-Science AB - The paper documents a series of data integration workshops held in 2006 at the UK National e-Science Centre, summarizing a range of the problem/solution scenarios in multi-site and multi-scale data integration with six HealthGrid projects using schizophrenia as a domain-specific test case. It outlines emerging strategies, recommendations and objectives for collaboration on shared ontology-building and harmonization of data for multi-site trials in this domain. JF - Studies in Health Technology and Informatics PB - IOPress VL - 126 SN - 978-1-58603-738-3 ER - TY - CONF T1 - e-Research Infrastructure Development and Community Engagement T2 - All Hands Meeting 2007 Y1 - 2007 A1 - Voss, A. A1 - Mascord, M. A1 - Fraser, M. A1 - Jirotka, M. A1 - Procter, R. A1 - Halfpenny, P. A1 - Fergusson, D. A1 - Atkinson, M. A1 - Dunn, S. A1 - Blanke, T. A1 - Hughes, L. A1 - Anderson, S. AB - The UK and wider international e-Research initiatives are entering a critical phase in which they need to move from the development of the basic underlying technology, demonstrators, prototypes and early applications to wider adoption and the development of stable infrastructures. In this paper we will review existing work on studies of infrastructure and community development, requirements elicitation for existing services as well as work within the arts and humanities and the social sciences to establish e-Research in these communities. We then describe two projects recently funded by JISC to study barriers to adoption and responses to them as well as use cases and service usage models. 
JF - All Hands Meeting 2007 CY - Nottingham, UK ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2007 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Mario Giacobini ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. CY - Valencia, Spain ER - TY - Generic T1 - Evolutionary Computation in Combinatorial Optimization, 7th European Conference T2 - Lecture Notes in Computer Science Y1 - 2007 A1 - Cotta, Carlos A1 - van Hemert, Jano ED - Carlos Cotta ED - van Hemert, Jano KW - evolutionary computation AB - Metaheuristics have often been shown to be effective for difficult combinatorial optimization problems appearing in various industrial, economical, and scientific domains. Prominent examples of metaheuristics are evolutionary algorithms, simulated annealing, tabu search, scatter search, memetic algorithms, variable neighborhood search, iterated local search, greedy randomized adaptive search procedures, estimation of distribution algorithms, and ant colony optimization. Successfully solved problems include scheduling, timetabling, network design, transportation and distribution, vehicle routing, the traveling salesman problem, satisfiability, packing and cutting, and general mixed integer programming. EvoCOP began in 2001 and has been held annually since then. 
It was the first event specifically dedicated to the application of evolutionary computation and related methods to combinatorial optimization problems. Originally held as a workshop, EvoCOP became a conference in 2004. The events gave researchers an excellent opportunity to present their latest research and to discuss current developments and applications as well as providing for improved interaction between members of this scientific community. Following the general trend of hybrid metaheuristics and diminishing boundaries between the different classes of metaheuristics, EvoCOP has broadened its scope over the last years and invited submissions on any kind of metaheuristic for combinatorial optimization. JF - Lecture Notes in Computer Science PB - Springer VL - LNCS 4446 UR - http://springerlink.metapress.com/content/105633/ ER - TY - CONF T1 - Grid Enabling Your Data Resources with OGSA-DAI T2 - Applied Parallel Computing. State of the Art in Scientific Computing Y1 - 2007 A1 - Antonioletti, M. A1 - Atkinson, M. A1 - Chue Hong, N. P. A1 - Dobrzelecki, B. A1 - Hume, A. C. A1 - Jackson, M. A1 - Karasavvas, K. A1 - Krause, A. A1 - Schopf, J. M. A1 - Sugden, T. A1 - Theocharopoulos, E. JF - Applied Parallel Computing. State of the Art in Scientific Computing T3 - Lecture Notes in Computer Science VL - 4699 ER - TY - CONF T1 - Interaction as a Grounding for Peer to Peer Knowledge Sharing T2 - Advances in Web Semantics Y1 - 2007 A1 - Robertson, D. A1 - Walton, C. A1 - Barker, A. A1 - Besana, P. A1 - Chen-Burger, Y. A1 - Hassan, F. A1 - Lambert, D. A1 - Li, G. A1 - McGinnis, J A1 - Osman, N. A1 - Bundy, A. A1 - McNeill, F. A1 - van Harmelen, F. A1 - Sierra, C. A1 - Giunchiglia, F. JF - Advances in Web Semantics PB - LNCS-IFIP VL - 1 ER - TY - CONF T1 - Managing the transition from OBO to OWL: The COBrA-CT Bio-Ontology Tools T2 - UK e-Science All Hands Meeting 2007 Y1 - 2007 A1 - S. Aitken A1 - Y. 
Chen KW - Bio-ontology, Grid, OBO, OWL AB - This paper presents the COBrA-CT ontology tools, which include an ontology server database and version manager client tool for collaborative ontology development, and an editor for bio-ontologies that are represented in the Web Ontology Language (OWL) format. The ontology server uses OGSA-DAI Grid technology to provide access to the ontology server database. These tools implement the agreed standard for representing Open Biomedical Ontologies (OBO) in OWL and interoperate with other tools developed for this standard. Such tools are essential for the uptake of OWL in the biomedical ontology community. JF - UK e-Science All Hands Meeting 2007 CY - Nottingham, UK ER - TY - JOUR T1 - MAPFS-DAI, an extension of OGSA-DAI based on a parallel file system JF - Future Generation Computer Systems Y1 - 2007 A1 - Sanchez, A. A1 - Perez, M. S. A1 - Karasavvas, K. A1 - Herrero, P. A1 - Perez, A. VL - 23 ER - TY - JOUR T1 - Mining co-regulated gene profiles for the detection of functional associations in gene expression data JF - Bioinformatics Y1 - 2007 A1 - Gyenesei, Attila A1 - Wagner, Ulrich A1 - Barkow-Oesterreicher, Simon A1 - Stolte, Etzard A1 - Schlapbach, Ralph VL - 23 ER - TY - CONF T1 - Mining spatial gene expression data for association rules T2 - Lecture Notes in Bioinformatics Y1 - 2007 A1 - van Hemert, J. I. A1 - Baldock, R. A. ED - S. Hochreiter ED - R. Wagner KW - biomedical KW - data mining KW - DGEMap KW - e-Science AB - We analyse data from the Edinburgh Mouse Atlas Gene-Expression Database (EMAGE) which is a high quality data source for spatio-temporal gene expression patterns. Using a novel process whereby generated patterns are used to probe spatially-mapped gene expression domains, we are able to get unbiased results as opposed to using annotations based on predefined anatomy regions. 
We describe two processes to form association rules based on spatial configurations, one that associates spatial regions, the other associates genes. JF - Lecture Notes in Bioinformatics PB - Springer Verlag UR - http://dx.doi.org/10.1007/978-3-540-71233-6_6 ER - TY - JOUR T1 - OBO Explorer: An Editor for Open Biomedical Ontologies in OWL JF - Bioinformatics Y1 - 2007 A1 - Stuart Aitken A1 - Yin Chen A1 - Jonathan Bard AB - To clarify the semantics, and take advantage of tools and algorithms developed for the Semantic Web, a mapping from the Open Biomedical Ontologies (OBO) format to the Web Ontology Language (OWL) has been established. We present an ontology editor that allows end users to work directly with this OWL representation of OBO format ontologies. PB - Oxford Journals UR - http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btm593? ER - TY - CONF T1 - OGSA-DAI 3.0 - The What's and Whys T2 - UK e-Science All Hands Meeting Y1 - 2007 A1 - Antonioletti, M. A1 - Hong, N. P. Chue A1 - Hume, A. C. A1 - Jackson, M. A1 - Karasavvas, K. A1 - Krause, A. A1 - Schopf, J. M. A1 - Atkinson, M. P. A1 - Dobrzelecki, B. A1 - Illingworth, M. A1 - McDonnell, N. A1 - Parsons, M. A1 - Theocharopoulous, E. JF - UK e-Science All Hands Meeting ER - TY - CONF T1 - Optimization and evaluation of parallel I/O in BIPS3D parallel irregular application T2 - IPDPS Y1 - 2007 A1 - Rosa Filgueira A1 - David E. Singh A1 - Florin Isaila A1 - Jesús Carretero A1 - Antonio Garcia Loureiro JF - IPDPS ER - TY - Generic T1 - Special Issue: Selected Papers from the 2004 U.K. e-Science All Hands Meeting T2 - All Hands Meeting 2004 Y1 - 2007 A1 - Walker, D. W. A1 - Atkinson, M. P. A1 - Sommerville, I. ED - Walker, D. W. ED - Atkinson, M. P. ED - Sommerville, I. 
JF - All Hands Meeting 2004 T3 - Concurrency and Computation: Practice and Experience PB - John Wiley & Sons Ltd CY - Nottingham, UK VL - 19 ER - TY - CONF T1 - Study of User Priorities for e-Infrastructure for e-Research (SUPER) T2 - Proceedings of the UK e-Science All Hands Meeting Y1 - 2007 A1 - Newhouse, S. A1 - Schopf, J. M. A1 - Richards, A. A1 - Atkinson, M. P. JF - Proceedings of the UK e-Science All Hands Meeting ER - TY - CONF T1 - Towards a Grid-Enabled Simulation Framework for Nano-CMOS Electronics T2 - 3rd IEEE International Conference on eScience and Grid Computing Y1 - 2007 A1 - Liangxiu Han A1 - Asen Asenov A1 - Dave Berry A1 - Campbell Millar A1 - Gareth Roy A1 - Scott Roy A1 - Richard Sinnott A1 - Gordon Stewart AB - The electronics design industry is facing major challenges as transistors continue to decrease in size. The next generation of devices will be so small that the position of individual atoms will affect their behaviour. This will cause the transistors on a chip to have highly variable characteristics, which in turn will impact circuit and system design tools. The EPSRC project “Meeting the Design Challenges of Nano-CMOS Electronics” (Nano-CMOS) has been funded to explore this area. In this paper, we describe the distributed data-management and computing framework under development within Nano-CMOS. A key aspect of this framework is the need for robust and reliable security mechanisms that support distributed electronics design groups who wish to collaborate by sharing designs, simulations, workflows, datasets and computation resources. This paper presents the system design, and an early prototype of the project which has been useful in helping us to understand the benefits of such a grid infrastructure. In particular, we also present two typical use cases: user authentication, and execution of large-scale device simulations. 
JF - 3rd IEEE International Conference on eScience and Grid Computing PB - IEEE Computer Society CY - Bangalore, India ER - TY - CONF T1 - Transaction-Based Grid Database Replication T2 - UK e-Science All Hands Meeting 2007 Y1 - 2007 A1 - Y. Chen A1 - D. Berry A1 - P. Dantressangle KW - Grid, Replication, Transaction-based, OGSA-DAI AB - We present a framework for grid database replication. Data replication is one of the most useful strategies to achieve high levels of availability and fault tolerance as well as minimal access time in grids. It is commonly demanded by many grid applications. However, most existing grid replication systems only deal with read-only files. By contrast, several relational database vendors provide tools that offer transaction-based replication, but the capabilities of these products are insufficient to address grid issues. They lack scalability and cannot cope with the heterogeneous nature of grid resources. Our approach uses existing grid mechanisms to provide a metadata registry and to make initial replicas of data resources. We then define high-level APIs for managing transaction-based replication. These APIs can be mapped to a variety of relational database replication mechanisms allowing us to use existing vendor-specific solutions. The next stage in the project will use OGSA-DAI to manage replication across multiple domains. In this way, our framework can support transaction-based database synchronisation that maintains consistency in a data-intensive, large-scale distributed, disparate networking environment. JF - UK e-Science All Hands Meeting 2007 CY - Nottingham, UK ER - TY - CONF T1 - EGEE: building a pan-European grid training organisation T2 - ACSW Frontiers Y1 - 2006 A1 - Berlich, Rüdiger A1 - Hardt, Marcus A1 - Kunze, Marcel A1 - Atkinson, Malcolm P. 
A1 - Fergusson, David JF - ACSW Frontiers ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2006 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. CY - Budapest, Hungary ER - TY - JOUR T1 - Evolving combinatorial problem instances that are difficult to solve JF - Evolutionary Computation Y1 - 2006 A1 - van Hemert, J. I. KW - constraint programming KW - constraint satisfaction KW - evolutionary computation KW - problem evolving KW - satisfiability KW - travelling salesman AB - In this paper we demonstrate how evolutionary computation can be used to acquire difficult to solve combinatorial problem instances, thereby stress-testing the corresponding algorithms used to solve these instances. The technique is applied in three important domains of combinatorial optimisation, binary constraint satisfaction, Boolean satisfiability, and the travelling salesman problem. Problem instances acquired through this technique are more difficult than ones found in popular benchmarks. We analyse these evolved instances with the aim to explain their difficulty in terms of structural properties, thereby exposing the weaknesses of corresponding algorithms. 
VL - 14 UR - http://www.mitpressjournals.org/toc/evco/14/4 ER - TY - CONF T1 - FireGrid: Integrated emergency response and fire safety engineering for the future built environment T2 - All Hands Meeting 2005 Y1 - 2006 A1 - D. Berry A1 - Usmani, A. A1 - Torero, J. A1 - Tate, A. A1 - McLaughlin, S. A1 - Potter, S. A1 - Trew, A. A1 - Baxter, R. A1 - Bull, M. A1 - Atkinson, M. AB - Analyses of disasters such as the Piper Alpha explosion (Sylvester-Evans and Drysdale, 1998), the World Trade Centre collapse (Torero et al, 2002, Usmani et al, 2003) and the fires at Kings Cross (Drysdale et al, 1992) and the Mont Blanc tunnel (Rapport Commun, 1999) have revealed many mistaken decisions, such as that which sent 300 fire-fighters to their deaths in the World Trade Centre. Many of these mistakes have been attributed to a lack of information about the conditions within the fire and the imminent consequences of the event. E-Science offers an opportunity to significantly improve the intervention in fire emergencies. The FireGrid Consortium is working on a mixture of research projects to make this vision a reality. This paper describes the research challenges and our plans for solving them. JF - All Hands Meeting 2005 CY - Nottingham, UK ER - TY - CONF T1 - Grid Enabling your Data Resources with OGSA-DAI T2 - Workshop on State-of-the-Art in Scientific and Parallel Computing Y1 - 2006 A1 - Antonioletti, M. A1 - Atkinson, M. A1 - Hong, N. Chue A1 - Dobrzelecki, B. A1 - Hume, A. A1 - Jackson, M. A1 - Karasavvas, K. A1 - Krause, A. A1 - Sugden, T. A1 - Theocharopoulos, E. JF - Workshop on State-of-the-Art in Scientific and Parallel Computing ER - TY - CONF T1 - Grid Infrastructures for Secure Access to and Use of Bioinformatics Data: Experiences from the BRIDGES Project T2 - Proceedings of the First International Conference on Availability, Reliability and Security, ARES Y1 - 2006 A1 - Richard O. Sinnott A1 - Micha Bayer A1 - A. J. 
Stell A1 - Jos Koetsier JF - Proceedings of the First International Conference on Availability, Reliability and Security, ARES T3 - Proceedings of the First International Conference on Availability, Reliability and Security CY - Vienna, Austria ER - TY - CONF T1 - Improving Graph Colouring Algorithms and Heuristics Using a Novel Representation T2 - Springer Lecture Notes on Computer Science Y1 - 2006 A1 - Juhos, I. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. Raidl KW - constraint satisfaction KW - graph colouring AB - We introduce a novel representation for the graph colouring problem, called the Integer Merge Model, which aims to reduce the time complexity of an algorithm. Moreover, our model provides useful information for guiding heuristics as well as a compact description for algorithms. To verify the potential of the model, we use it in dsatur, in an evolutionary algorithm, and in the same evolutionary algorithm extended with heuristics. An empirical investigation is performed to show an increase in efficiency on two problem suites: a set of practical problem instances and a set of hard problem instances from the phase transition. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag ER - TY - JOUR T1 - Increasing the efficiency of graph colouring algorithms with a representation based on vector operations JF - Journal of Software Y1 - 2006 A1 - Juhos, I. A1 - van Hemert, J. I. KW - graph colouring AB - We introduce a novel representation for the graph colouring problem, called the Integer Merge Model, which aims to reduce the time complexity of graph colouring algorithms. Moreover, this model provides useful information to aid in the creation of heuristics that can make the colouring process even faster. It also serves as a compact definition for the description of graph colouring algorithms. 
To verify the potential of the model, we use it in the complete algorithm DSATUR, and in two versions of an incomplete approximation algorithm: an evolutionary algorithm and the same evolutionary algorithm extended with guiding heuristics. Both theoretical and empirical results are provided to show an increase in the efficiency of solving graph colouring problems. Two problem suites were used for the empirical evidence: a set of practical problem instances and a set of hard problem instances from the phase transition. VL - 1 ER - TY - CHAP T1 - Knowledge and Data Management in Grids, CoreGRID T2 - Euro-Par'06 Proceedings of the CoreGRID 2006, UNICORE Summit 2006, Petascale Computational Biology and Bioinformatics conference on Parallel processing Y1 - 2006 A1 - Chue Hong, N. P. A1 - Antonioletti, M. A1 - Karasavvas, K. A. A1 - Atkinson, M. ED - Lehner, W. ED - Meyer, N. ED - Streit, A. ED - Stewart, C. JF - Euro-Par'06 Proceedings of the CoreGRID 2006, UNICORE Summit 2006, Petascale Computational Biology and Bioinformatics conference on Parallel processing T3 - Lecture Notes in Computer Science PB - Springer CY - Berlin, Germany VL - 4375 SN - 978-3-540-72226-7 UR - http://www.springer.com/computer/communication+networks/book/978-3-540-72226-7 ER - TY - CONF T1 - Neighborhood Searches for the Bounded Diameter Minimum Spanning Tree Problem Embedded in a VNS, EA, and ACO T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) Y1 - 2006 A1 - Gruber, M. A1 - van Hemert, J. I. A1 - Raidl, G. R. ED - Maarten Keijzer ED - et al KW - constraint satisfaction KW - evolutionary computation KW - variable neighbourhood search AB - We consider the Bounded Diameter Minimum Spanning Tree problem and describe four neighbourhood searches for it. 
They are used as local improvement strategies within a variable neighbourhood search (VNS), an evolutionary algorithm (EA) utilising a new encoding of solutions, and an ant colony optimisation (ACO). We compare the performance in terms of effectiveness between these three hybrid methods on a suite of popular benchmark instances, which contains instances too large to solve by current exact methods. Our results show that the EA and the ACO outperform the VNS on almost all used benchmark instances. Furthermore, the ACO yields most of the time better solutions than the EA in long-term runs, whereas the EA dominates when the computation time is strongly restricted. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) PB - ACM CY - Seattle, USA VL - 2 ER - TY - CONF T1 - Profiling OGSA-DAI Performance for Common Use Patterns T2 - UK e-Science All Hands Meeting Y1 - 2006 A1 - Dobrzelecki, B. A1 - Antonioletti, M. A1 - Schopf, J. M. A1 - Hume, A. C. A1 - Atkinson, M. A1 - Hong, N. P. Chue A1 - Jackson, M. A1 - Karasavvas, K. A1 - Krause, A. A1 - Parsons, M. A1 - Sugden, T. A1 - Theocharopoulos, E. JF - UK e-Science All Hands Meeting ER - TY - CONF T1 - A Shibboleth-Protected Privilege Management Infrastructure for e-Science Education T2 - Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006) Y1 - 2006 A1 - J. Watt A1 - Oluwafemi Ajayi A1 - J. Jiang A1 - Jos Koetsier A1 - Richard O. Sinnott KW - security JF - Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006) PB - IEEE Computer Society CY - Singapore ER - TY - CONF T1 - Complexity Transitions in Evolutionary Algorithms: Evaluating the impact of the initial population T2 - Proceedings of the Congress on Evolutionary Computation Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. A1 - Parent, J. 
KW - constraint satisfaction KW - transition models AB - This paper proposes an evolutionary approach for the composition of solutions in an incremental way. The approach is based on the metaphor of transitions in complexity discussed in the context of evolutionary biology. Partially defined solutions interact and evolve into aggregations until a full solution for the problem at hand is found. The impact of the initial population on the outcome and the dynamics of the process is evaluated using the domain of binary constraint satisfaction problems. JF - Proceedings of the Congress on Evolutionary Computation PB - IEEE Press ER - TY - JOUR T1 - A Criticality-Based Framework for Task Composition in Multi-Agent Bioinformatics Integration Systems JF - Bioinformatics Y1 - 2005 A1 - Karasavvas, K. A1 - Baldock, R. A1 - Burger, A. VL - 21 ER - TY - CONF T1 - Cross-Layer Peer-to-Peer Traffic Identification and Optimization Based on Active Networking T2 - 7th International Working Conference on Active and Programmable Networks Y1 - 2005 A1 - Dedinski, I. A1 - De Meer, H. A1 - Liangxiu Han A1 - Mathy, L. A1 - Pezaros, D. A1 - P. , Sventek, J. S. A1 - Xiaoying, Z. JF - 7th International Working Conference on Active and Programmable Networks CY - Sophia Antipolis, French Riviera, La Cote d'Azur, France, November 21-23, 2005. ER - TY - JOUR T1 - The design and implementation of Grid database services in OGSA-DAI JF - Concurrency - Practice and Experience Y1 - 2005 A1 - Antonioletti, Mario A1 - Atkinson, Malcolm P. A1 - Baxter, Robert M. A1 - Borley, Andrew A1 - Hong, Neil P. Chue A1 - Collins, Brian A1 - Hardman, Neil A1 - Hume, Alastair C. A1 - Knox, Alan A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Magowan, James A1 - Pato VL - 17 ER - TY - CONF T1 - The Digital Curation Centre: a vision for digital curation T2 - 2005 IEEE International Symposium on Mass Storage Systems and Technology Y1 - 2005 A1 - Rusbridge, C. A1 - P. Burnhill A1 - S. Ross A1 - P. 
Buneman A1 - D. Giaretta A1 - Lyon, L. A1 - Atkinson, M. AB - We describe the aims and aspirations for the Digital Curation Centre (DCC), the UK response to the realisation that digital information is both essential and fragile. We recognise the equivalence of preservation as "interoperability with the future", asserting that digital curation is concerned with "communication across time". We see the DCC as having relevance for present day data curation and for continuing data access for generations to come. We describe the structure and plans of the DCC, designed to support these aspirations and based on a view of world class research being developed into curation services, all of which are underpinned by outreach to the broadest community. JF - 2005 IEEE International Symposium on Mass Storage Systems and Technology PB - IEEE Computer Society CY - Sardinia, Italy SN - 0-7803-9228-0 ER - TY - CONF T1 - Evolutionary Transitions as a Metaphor for Evolutionary Optimization T2 - LNAI 3630 Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. ED - M. Capcarrere ED - A. A. Freitas ED - P. J. Bentley ED - C. G. Johnson ED - J. Timmis KW - constraint satisfaction KW - transition models AB - This paper proposes a computational model for solving optimisation problems that mimics the principle of evolutionary transitions in individual complexity. More specifically it incorporates mechanisms for the emergence of increasingly complex individuals from the interaction of more simple ones. The biological principles for transition are outlined and mapped onto an evolutionary computation context. The class of binary constraint satisfaction problems is used to illustrate the transition mechanism. JF - LNAI 3630 PB - Springer-Verlag SN - 3-540-28848-1 ER - TY - Generic T1 - Experience with the international testbed in the crossgrid project T2 - Advances in Grid Computing-EGC 2005 Y1 - 2005 A1 - Gomes, J. A1 - David, M. A1 - Martins, J. A1 - Bernardo, L. 
A1 - A García A1 - Hardt, M. A1 - Kornmayer, H. A1 - Marco, Jesus A1 - Marco, Rafael A1 - Rodríguez, David A1 - Diaz, Irma A1 - Cano, Daniel A1 - Salt, J. A1 - Gonzalez, S. A1 - J Sánchez A1 - Fassi, F. A1 - Lara, V. A1 - Nyczyk, P. A1 - Lason, P. A1 - Ozieblo, A. A1 - Wolniewicz, P. A1 - Bluj, M. A1 - K Nawrocki A1 - A Padee A1 - W Wislicki ED - Peter M. A. Sloot, Alfons G. Hoekstra, Thierry Priol, Alexander Reinefeld ED - Marian Bubak JF - Advances in Grid Computing-EGC 2005 T3 - LNCS PB - Springer Berlin/Heidelberg CY - Amsterdam VL - 3470 ER - TY - Generic T1 - Genetic Programming, Proceedings of the 8th European Conference T2 - Lecture Notes in Computer Science Y1 - 2005 A1 - Keijzer, M. A1 - Tettamanzi, A. A1 - Collet, P. A1 - van Hemert, J. A1 - Tomassini, M. ED - M. Keijzer ED - A. Tettamanzi ED - P. Collet ED - van Hemert, J. ED - M. Tomassini KW - evolutionary computation JF - Lecture Notes in Computer Science PB - Springer VL - 3447 SN - 3-540-25436-6 UR - http://www.springeronline.com/sgw/cda/frontpage/0,11855,3-40100-22-45347265-0,00.html?changeHeader=true ER - TY - CONF T1 - Heuristic Colour Assignment Strategies for Merge Models in Graph Colouring T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I. ED - G. Raidl ED - J. Gottlieb KW - constraint satisfaction KW - graph colouring AB - In this paper, we combine a powerful representation for graph colouring problems with different heuristic strategies for colour assignment. Our novel strategies employ heuristics that exploit information about the partial colouring in an aim to improve performance. An evolutionary algorithm is used to drive the search. We compare the different strategies to each other on several very hard benchmarks and on generated problem instances, and show where the novel strategies improve the efficiency. 
JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Introduction to OGSA-DAI Services T2 - Scientific Applications of Grid Computing Y1 - 2005 A1 - Karasavvas, K. A1 - Antonioletti, M. A1 - Atkinson, M. A1 - Hong, N. C. A1 - Sugden, T. A1 - Hume, A. A1 - Jackson, M. A1 - Krause, A. A1 - Palansuriya, C. JF - Scientific Applications of Grid Computing VL - 3458 SN - 978-3-540-25810-0 ER - TY - CONF T1 - A New Architecture for OGSA-DAI T2 - UK e-Science All Hands Meeting Y1 - 2005 A1 - Atkinson, M. A1 - Karasavvas, K. A1 - Antonioletti, M. A1 - Baxter, R. A1 - Borley, A. A1 - Hong, N. C. A1 - Hume, A. A1 - Jackson, M. A1 - Krause, A. A1 - Laws, S. A1 - Paton, N. A1 - Schopf, J. A1 - Sugden, T. A1 - Tourlas, K. A1 - Watson, P. JF - UK e-Science All Hands Meeting ER - TY - CONF T1 - OGSA-DAI Status and Benchmarks T2 - All Hands Meeting 2005 Y1 - 2005 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Andrew Borle A1 - Hong, Neil P. Chue A1 - Patrick Dantressangle A1 - Hume, Alastair C. A1 - Mike Jackson A1 - Krause, Amy A1 - Laws, Simon A1 - Parsons, Mark A1 - Paton, Norman W. A1 - Jennifer M. Schopf A1 - Tom Sugden A1 - Watson, Paul AB - This paper presents a status report on some of the highlights that have taken place within the OGSA-DAI project since the last AHM. A description of Release 6.0 functionality and details of the forthcoming release, due in September 2005, are given. Future directions for this project are discussed. This paper also describes initial results of work being done to systematically benchmark recent OGSA-DAI releases. The OGSA-DAI software distribution, and more information about the project, is available from the project website at www.ogsadai.org.uk. JF - All Hands Meeting 2005 CY - Nottingham, UK ER - TY - CONF T1 - Organization of the International Testbed of the CrossGrid Project T2 - Cracow Grid Workshop 2005 Y1 - 2005 A1 - Gomes, J. A1 - David, M. A1 - Martins, J. 
A1 - Bernardo, L. A1 - Garcia, A. A1 - Hardt, M. A1 - Kornmayer, H. A1 - Marco, Rafael A1 - Rodríguez, David A1 - Diaz, Irma A1 - Cano, Daniel A1 - Salt, J. A1 - Gonzalez, S. A1 - Sanchez, J. A1 - Fassi, F. A1 - Lara, V. A1 - Nyczyk, P. A1 - Lason, P. A1 - Ozieblo, A. A1 - Wolniewicz, P. A1 - Bluj, M. JF - Cracow Grid Workshop 2005 ER - TY - CONF T1 - Property analysis of symmetric travelling salesman problem instances acquired through evolution T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - van Hemert, J. I. ED - G. Raidl ED - J. Gottlieb KW - problem evolving KW - travelling salesman AB - We show how an evolutionary algorithm can successfully be used to evolve a set of difficult to solve symmetric travelling salesman problem instances for two variants of the Lin-Kernighan algorithm. Then we analyse the instances in those sets to guide us towards deferring general knowledge about the efficiency of the two variants in relation to structural properties of the symmetric travelling salesman problem. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - JOUR T1 - Specifying use case behavior with interaction models JF - Journal of Object Technology Y1 - 2005 A1 - José Daniel García A1 - Jesús Carretero A1 - José María Pérez A1 - Félix García A1 - Rosa Filgueira AB - Functional requirements for information systems can be modeled through use cases. Furthermore, use case models have been successfully used in broader contexts than software engineering, such as systems engineering. Even if small systems may be modeled as a set of use cases, when large systems requirements are modeled with a plain use case model several difficulties arise. Traditionally, the behavior of use cases has been modeled through textual specifications. In this paper we present an alternate approach based on interaction modeling. The behavior modeling has two variants (one for UML 1.x and one for UML 2.0). 
We also integrate our behavior modeling with standard use case relationships. VL - 4 IS - 9 ER - TY - JOUR T1 - Specifying use case behavior with interaction models JF - Journal of Object Technology Y1 - 2005 A1 - José Daniel Garcia A1 - Jesús Carretero A1 - José Maria Pérez A1 - Félix García Carballeira A1 - Rosa Filgueira VL - 4 ER - TY - CONF T1 - Transition Models as an incremental approach for problem solving in Evolutionary Algorithms T2 - Proceedings of the Genetic and Evolutionary Computation Conference Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. A1 - Parent, J. ED - H.-G. Beyer ED - et al KW - constraint satisfaction KW - transition models AB - This paper proposes an incremental approach for building solutions using evolutionary computation. It presents a simple evolutionary model called a Transition model. It lets building units of a solution interact and then uses an evolutionary process to merge these units toward a full solution for the problem at hand. The paper provides a preliminary study on the evolutionary dynamics of this model as well as an empirical comparison with other evolutionary techniques on binary constraint satisfaction. JF - Proceedings of the Genetic and Evolutionary Computation Conference PB - {ACM} Press ER - TY - JOUR T1 - Web Service Grids: an evolutionary approach JF - Concurrency - Practice and Experience Y1 - 2005 A1 - Atkinson, Malcolm P. A1 - Roure, David De A1 - Dunlop, Alistair N. A1 - Fox, Geoffrey A1 - Henderson, Peter A1 - Hey, Anthony J. G. A1 - Paton, Norman W. A1 - Newhouse, Steven A1 - Parastatidis, Savas A1 - Trefethen, Anne E. A1 - Watson, Paul A1 - Webber, Jim VL - 17 ER - TY - CONF T1 - Binary Merge Model Representation of the Graph Colouring Problem T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. 
Raidl KW - constraint satisfaction KW - graph colouring AB - This paper describes a novel representation and ordering model that, aided by an evolutionary algorithm, is used in solving the graph k-colouring problem. Its strength lies in reducing the search space by breaking symmetry. An empirical comparison is made with two other algorithms on a standard suite of problem instances and on a suite of instances in the phase transition where it shows promising results. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - JOUR T1 - Bioinformatics System Integration and Agent Technology JF - Journal of Biomedical Informatics Y1 - 2004 A1 - Karasavvas, K. A1 - Baldock, R. A1 - Burger, A. VL - 37 ER - TY - Generic T1 - Development of a Grid Infrastructure for Functional Genomics T2 - Life Science Grid Conference (LSGrid 2004) Y1 - 2004 A1 - Sinnott, R. O. A1 - Bayer, M. A1 - Houghton, D. A1 - D. Berry A1 - Ferrier, M. JF - Life Science Grid Conference (LSGrid 2004) T3 - LNCS PB - Springer Verlag CY - Kanazawa, Japan ER - TY - CONF T1 - Dynamic Routing Problems with Fruitful Regions: Models and Evolutionary Computation T2 - LNCS Y1 - 2004 A1 - van Hemert, J. I. A1 - la Poutré, J. A. ED - Xin Yao ED - Edmund Burke ED - Jose A. Lozano ED - Jim Smith ED - Juan J. Merelo-Guervós ED - John A. Bullinaria ED - Jonathan Rowe ED - Peter Tiňo ED - Ata Kabán ED - Hans-Paul Schwefel KW - dynamic problems KW - evolutionary computation KW - vehicle routing AB - We introduce the concept of fruitful regions in a dynamic routing context: regions that have a high potential of generating loads to be transported. The objective is to maximise the number of loads transported, while keeping to capacity and time constraints. Loads arrive while the problem is being solved, which makes it a real-time routing problem. The solver is a self-adaptive evolutionary algorithm that ensures feasible solutions at all times. 
We investigate under what conditions the exploration of fruitful regions improves the effectiveness of the evolutionary algorithm. JF - LNCS PB - Springer-Verlag CY - Birmingham, UK VL - 3242 SN - 3-540-23092-0 ER - TY - Generic T1 - Grid Services Supporting the Usage of Secure Federated, Distributed Biomedical Data T2 - All Hands Meeting 2004 Y1 - 2004 A1 - Richard Sinnott A1 - Malcolm Atkinson A1 - Micha Bayer A1 - Dave Berry A1 - Anna Dominiczak A1 - Magnus Ferrier A1 - David Gilbert A1 - Neil Hanlon A1 - Derek Houghton A1 - Hunt, Ela A1 - David White AB - The BRIDGES project is a UK e-Science project that provides grid based support for biomedical research into the genetics of hypertension – the Cardiovascular Functional Genomics Project (CFG). Its main goal is to provide an effective environment for CFG, and biomedical research in general, including access to integrated data, analysis and visualization, with appropriate authorisation and privacy, as well as grid based computational tools and resources. It also aims to provide an improved understanding of the requirements of academic biomedical research virtual organizations and to evaluate the utility of existing data federation tools. JF - All Hands Meeting 2004 CY - Nottingham, UK UR - http://www.allhands.org.uk/2004/proceedings/papers/87.pdf ER - TY - CONF T1 - Grid-Based Metadata Services T2 - SSDBM Y1 - 2004 A1 - Deelman, Ewa A1 - Singh, Gurmeet Singh A1 - Atkinson, Malcolm P. A1 - Chervenak, Ann L. A1 - Hong, Neil P. Chue A1 - Kesselman, Carl A1 - Patil, Sonal A1 - Pearlman, Laura A1 - Su, Mei-Hui JF - SSDBM ER - TY - CONF T1 - OGSA-DAI Status Report and Future Directions T2 - All Hands Meeting 2004 Y1 - 2004 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Borley, Andrew A1 - Hong, Neil P. Chue A1 - Collins, Brian A1 - Jonathan Davies A1 - Desmond Fitzgerald A1 - Hardman, Neil A1 - Hume, Alastair C. A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Paton, Norman W. 
A1 - Tom Sugden A1 - Watson, Paul A1 - Mar AB - Data Access and Integration (DAI) of data resources, such as relational and XML databases, within a Grid context. Project members also participate in the development of DAI standards through the GGF DAIS WG. The standards that emerge through this effort will be adopted by OGSA-DAI once they have stabilised. The OGSA-DAI developers are also engaging with a growing user community to gather their data and functionality requirements. Several large projects are already using OGSA-DAI to provide their DAI capabilities. This paper presents a status report on OGSA-DAI activities since the last AHM and announces future directions. The OGSA-DAI software distribution and more information about the project is available from the project website at http://www.ogsadai.org.uk/. JF - All Hands Meeting 2004 CY - Nottingham, UK ER - TY - CONF T1 - OGSA-DAI: Two Years On T2 - GGF10 Y1 - 2004 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Borley, Andrew A1 - Neil Chue Hong A1 - Collins, Brian A1 - Jonathan Davies A1 - Hardman, Neil A1 - George Hicken A1 - Ally Hume A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Magowan, James A1 - Jeremy Nowell A1 - Paton, Norman W. A1 - Dave Pearson A1 - To AB - The OGSA-DAI project has been producing Grid-enabled middleware for almost two years now, providing data access and integration capabilities to data resources, such as databases, within an OGSA context. In these two years, OGSA-DAI has been tracking rapidly evolving standards, managing changes in software dependencies, contributing to the standardisation process and liaising with a growing user community together with their associated data requirements. This process has imparted important lessons and raised a number of issues that need to be addressed if a middleware product is to be widely adopted. 
This paper examines the experiences of OGSA-DAI in implementing proposed standards, the likely impact that the still-evolving standards landscape will have on future implementations and how these affect uptake of the software. The paper also examines the gathering of requirements from and engagement with the Grid community, the difficulties of defining a process for the management and publishing of metadata, and whether relevant standards can be implemented in an efficient manner. The OGSA-DAI software distribution and more details about the project are available from the project Web site at http://www.ogsadai.org.uk/. JF - GGF10 CY - Berlin, Germany ER - TY - CONF T1 - Phase transition properties of clustered travelling salesman problem instances generated with evolutionary computation T2 - LNCS Y1 - 2004 A1 - van Hemert, J. I. A1 - Urquhart, N. B. ED - Xin Yao ED - Edmund Burke ED - Jose A. Lozano ED - Jim Smith ED - Juan J. Merelo-Guervós ED - John A. Bullinaria ED - Jonathan Rowe ED - Peter Tiňo ED - Ata Kabán ED - Hans-Paul Schwefel KW - evolutionary computation KW - problem evolving KW - travelling salesman AB - This paper introduces a generator that creates problem instances for the Euclidean symmetric travelling salesman problem. To fit real world problems, we look at maps consisting of clustered nodes. Uniform random sampling methods do not result in maps where the nodes are spread out to form identifiable clusters. To improve upon this, we propose an evolutionary algorithm that uses the layout of nodes on a map as its genotype. By optimising the spread until a set of constraints is satisfied, we are able to produce better clustered maps, in a more robust way. When varying the number of clusters in these maps and solving the Euclidean symmetric travelling salesman problem using Chained Lin-Kernighan, we observe a phase transition in the form of an easy-hard-easy pattern. 
JF - LNCS PB - Springer-Verlag CY - Birmingham, UK VL - 3242 SN - 3-540-23092-0 UR - http://www.vanhemert.co.uk/files/clustered-phase-transition-tsp.tar.gz ER - TY - JOUR T1 - The Research of Relationship between Self-similar of TCP and Network Performance JF - Journal on communications Y1 - 2004 A1 - yan Liu A1 - Liangxiu Han VL - 25 IS - 4 ER - TY - JOUR T1 - Robust parameter settings for variation operators by measuring the resampling ratio: A study on binary constraint satisfaction problems JF - Journal of Heuristics Y1 - 2004 A1 - van Hemert, J. I. A1 - Bäck, T. KW - constraint satisfaction KW - evolutionary computation KW - resampling ratio AB - In this article, we try to provide insight into the consequence of mutation and crossover rates when solving binary constraint satisfaction problems. This insight is based on a measurement of the space searched by an evolutionary algorithm. From data empirically acquired we describe the relation between the success ratio and the searched space. This is achieved using the resampling ratio, which is a measure for the amount of points revisited by a search algorithm. This relation is based on combinations of parameter settings for the variation operators. We then show that the resampling ratio is useful for identifying the quality of parameter settings, and provide a range that corresponds to robust parameter settings. VL - 10 ER - TY - RPRT T1 - SPLAT: (Suffix-tree Powered Local Alignment Tool): A Full-Sensitivity Protein Database Search Program that Accelerates the Smith-Waterman Algorithm using a Generalised Suffix Tree Index. Y1 - 2004 A1 - Harding, N. J. A1 - Atkinson, M. P. JF - Department of Computer Science (DCS Tech Report TR-2003-141) PB - University of Glasgow ER - TY - CONF T1 - A Study into Ant Colony Optimization, Evolutionary Computation and Constraint Programming on Binary Constraint Satisfaction Problems T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - van Hemert, J. I. A1 - Solnon, C. 
ED - J. Gottlieb ED - G. Raidl KW - ant colony optimisation KW - constraint programming KW - constraint satisfaction KW - evolutionary computation AB - We compare two heuristic approaches, evolutionary computation and ant colony optimisation, and a complete tree-search approach, constraint programming, for solving binary constraint satisfaction problems. We experimentally show that, while evolutionary computation is far from being able to compete with the two other approaches, ant colony optimisation nearly always succeeds in finding a solution, so that it can actually compete with constraint programming. The resampling ratio is used to provide insight into the performance of heuristic algorithms. Regarding efficiency, we show that while constraint programming is the fastest when instances have a low number of variables, ant colony optimisation becomes faster as the number of variables increases. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - RPRT T1 - Web Service Grids: An Evolutionary Approach Y1 - 2004 A1 - Malcolm Atkinson A1 - Roure, David De A1 - Alistair Dunlop A1 - Fox, Geoffrey A1 - Henderson, Peter A1 - Tony Hey A1 - Norman Paton A1 - Newhouse, Steven A1 - Parastatidis, Savas A1 - Anne Trefethen A1 - Watson, Paul A1 - Webber, Jim AB - The UK e-Science Programme is a £250M, 5 year initiative which has funded over 100 projects. These application-led projects are under-pinned by an emerging set of core middleware services that allow the coordinated, collaborative use of distributed resources. This set of middleware services runs on top of the research network and beneath the applications we call the ‘Grid’. Grid middleware is currently in transition from pre-Web Service versions to a new version based on Web Services. Unfortunately, only a very basic set of Web Services embodied in the Web Services Interoperability proposal, WS-I, are agreed by most IT companies. 
IBM and others have submitted proposals for Web Services for Grids - the Web Services Resource Framework and Web Services Notification specifications - to the OASIS organisation for standardisation. This process could take up to 12 months from March 2004 and the specifications are subject to debate and potentially significant changes. Since several significant UK e-Science projects come to an end before the end of this process, the UK therefore needs to develop a strategy that will protect the UK’s investment in Grid middleware by informing the Open Middleware Infrastructure Institute’s (OMII) roadmap and UK middleware repository in Southampton. This paper sets out an evolutionary roadmap that will allow us to capture generic middleware components from projects in a form that will facilitate migration or interoperability with the emerging Grid Web Services standards and with on-going OGSA developments. In this paper we therefore define a set of Web Services specifications – that we call ‘WS-I+’ to reflect the fact that this is a larger set than currently accepted by WS-I – that we believe will enable us to achieve the twin goals of capturing these components and facilitating migration to future standards. We believe that the extra Web Services specifications we have included in WS-I+ are both helpful in building e-Science Grids and likely to be widely accepted. JF - UK e-Science Technical Report Series ER - TY - JOUR T1 - Comparing Evolutionary Algorithms on Binary Constraint Satisfaction Problems JF - IEEE Transactions on Evolutionary Computation Y1 - 2003 A1 - Craenen, B. G. W. A1 - Eiben, A. E. A1 - van Hemert, J. I. KW - constraint satisfaction AB - Constraint handling is not straightforward in evolutionary algorithms (EA) since the usual search operators, mutation and recombination, are `blind' to constraints. Nevertheless, the issue is highly relevant, for many challenging problems involve constraints. 
Over the last decade numerous EAs for solving constraint satisfaction problems (CSP) have been introduced and studied on various problems. The diversity of approaches and the variety of problems used to study the resulting algorithms prevent a fair and accurate comparison of these algorithms. This paper aligns related work by presenting a concise overview and an extensive performance comparison of all these EAs on a systematically generated test suite of random binary CSPs. The random problem instance generator is based on a theoretical model that fixes deficiencies of models and respective generators that have been formerly used in the Evolutionary Computing (EC) field. VL - 7 UR - http://ieeexplore.ieee.org/xpl/abs_free.jsp?isNumber=27734&prod=JNL&arnumber=1237162&arSt=+424&ared=+444&arAuthor=+Craenen%2C+B.G.W.%3B++Eiben%2C+A.E.%3B++van+Hemert%2C+J.I.&arNumber=1237162&a_id0=1237161&a_id1=1237162&a_id2=1237163&a_id3=1237164&a_id4=12 ER - TY - RPRT T1 - Computer Challenges to emerge from e-Science. Y1 - 2003 A1 - Atkinson, M. A1 - Crowcroft, J. A1 - Goble, C. A1 - Gurd, J. A1 - Rodden, T. A1 - Shadbolt, N. A1 - Sloman, M. A1 - Sommerville, I. A1 - Storey, T. AB - The UK e-Science programme has initiated significant developments that allow networked grid technology to be used to form virtual collaboratories. The e-Science vision of a globally connected community has broader application than science with the same fundamental technologies being used to support e-Commerce and e-Government. The broadest vision of e-Science outlines a challenging research agenda for the computing community. New theories and models will be needed to provide a sound foundation for the tools used to specify, design, analyse and prove the properties of future grid technologies and applications. Fundamental research is needed in order to build a future e-Science infrastructure and to understand how to exploit the infrastructure to best effect. 
A future infrastructure needs to be dynamic, universally available and promote trust. Realising this infrastructure will need new theories, methods and techniques to be developed and deployed. Although often not directly visible, these fundamental infrastructure advances will provide the foundation for future scientific advancement, wealth generation and governance. • We need to move from the current data focus to a semantic grid with facilities for the generation, support and traceability of knowledge. • We need to make the infrastructure more available and more trusted by developing trusted ubiquitous systems. • We need to reduce the cost of development by enabling the rapid customised assembly of services. • We need to reduce the cost and complexity of managing the infrastructure by realising autonomic computing systems. JF - EPSRC ER - TY - CHAP T1 - Data Access, Integration, and Management T2 - The Grid 2: Blueprint for a New Computing Infrastructure (2nd edition), Y1 - 2003 A1 - Atkinson, M. A1 - Chervenak, A. L. A1 - Kunszt, P. A1 - Narang, I. A1 - Paton, N. W. A1 - Pearson, D. A1 - Shoshani, A. A1 - Watson, P. ED - Foster, I. ED - Kesselman, C JF - The Grid 2: Blueprint for a New Computing Infrastructure (2nd edition), PB - Morgan Kaufmann SN - 1-55860-933-4 ER - TY - CONF T1 - Databases and the Grid: Who Challenges Whom? T2 - BNCOD Y1 - 2003 A1 - Atkinson, Malcolm P. JF - BNCOD ER - TY - CONF T1 - Dependable Grid Services T2 - UK e-Science All Hands Meeting 2003, 2-4th September, Nottingham, UK Y1 - 2003 A1 - Stuart Anderson A1 - Yin Chen A1 - Glen Dobson A1 - Stephen Hall A1 - Conrad Hughes A1 - Yong Li A1 - Sheng Qu A1 - Ed Smith A1 - Ian Sommerville A1 - Ma Tiejun ED - Proceedings of UK e-Science All Hands Meeting 2003 AB - The provision of dependable computer systems by deploying diverse, redundant components in order to mask or provide recovery from component failures has mostly been restricted to systems with very high criticality. 
In this paper we present an architecture and prototype implementation of an approach to providing such redundancy at low cost in service-based infrastructures. In particular we consider services that are supplied by composing a number of component services and consider how service discovery, automatic monitoring and failure detection have the potential to create composed services that are more dependable than might be possible using a straightforward approach. The work is still in its early stages and so far no evaluation of the approach has been carried out. JF - UK e-Science All Hands Meeting 2003, 2-4th September, Nottingham, UK CY - Nottingham, UK ER - TY - CONF T1 - The Design and Implementation of Grid Database Services in OGSA-DAI T2 - All Hands Meeting 2003 Y1 - 2003 A1 - Ali Anjomshoaa A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Borley, Andrew A1 - Hong, Neil P. Chue A1 - Collins, Brian A1 - Hardman, Neil A1 - George Hicken A1 - Ally Hume A1 - Knox, Alan A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Magowan, James A1 - Charaka Palansuriya A1 - Paton, Norman W. AB - This paper presents a high-level overview of the design and implementation of the core components of the OGSA-DAI project. It describes the design decisions made, the project’s interaction with the Data Access and Integration Working Group of the Global Grid Forum and provides an overview of implementation characteristics. Further details of the implementation are provided in the extensive documentation available from the project web site. JF - All Hands Meeting 2003 CY - Nottingham, UK ER - TY - JOUR T1 - The DWMM network traffic model JF - Journal of Communication Y1 - 2003 A1 - Cong Suo A1 - Liangxiu Han VL - 24 IS - 5 ER - TY - CONF T1 - Evolving binary constraint satisfaction problem instances that are difficult to solve T2 - Proceedings of the IEEE 2003 Congress on Evolutionary Computation Y1 - 2003 A1 - van Hemert, J. I. 
KW - constraint satisfaction KW - problem evolving AB - We present a study on the difficulty of solving binary constraint satisfaction problems where an evolutionary algorithm is used to explore the space of problem instances. By directly altering the structure of problem instances and by evaluating the effort it takes to solve them using a complete algorithm we show that the evolutionary algorithm is able to detect problem instances that are harder to solve than those produced with conventional methods. Results from the search of the evolutionary algorithm confirm conjectures about where the most difficult to solve problem instances can be found with respect to the tightness. JF - Proceedings of the IEEE 2003 Congress on Evolutionary Computation PB - IEEE Press SN - 0-7803-7804-0 ER - TY - CONF T1 - Experiences of Designing and Implementing Grid Database Services in the OGSA-DAI project T2 - Global Grid Forum Workshop on Designing and Building Grid Services/GGF9 Y1 - 2003 A1 - Antonioletti, Mario A1 - Neil Chue Hong A1 - Ally Hume A1 - Mike Jackson A1 - Krause, Amy A1 - Jeremy Nowell A1 - Charaka Palansuriya A1 - Tom Sugden A1 - Martin Westhead AB - This paper describes the experiences of the OGSA-DAI team in designing and building a database access layer using the OGSI and the emerging DAIS GGF recommendations. This middleware is designed for enabling other UK e-Science projects that require database access and providing the basic primitives for higher-level services such as Distributed Query Processing. OGSA-DAI also intends to produce one of the required reference implementations of the DAIS specification once this becomes a proposed recommendation and, until then, scope out their ideas, provide feedback as well as directly contributing to the GGF working group. 
This paper enumerates the issues that have arisen in tracking the DAIS and OGSI specifications whilst developing a software distribution using the Grid services model; trying to serve the needs of the various target communities; and using the Globus Toolkit OGSI core distribution. The OGSA-DAI software distribution and more details are available from the project web site at http://www.ogsadai.org.uk/. JF - Global Grid Forum Workshop on Designing and Building Grid Services/GGF9 CY - Chicago, USA ER - TY - RPRT T1 - Grid Database Access and Integration: Requirements and Functionalities Y1 - 2003 A1 - Atkinson, M. P. A1 - Dialani, V. A1 - Guy, L. A1 - Narang, I. A1 - Paton, N. W. A1 - Pearson, D. A1 - Storey, T. A1 - Watson, P. AB - This document is intended to provide the context for developing Grid data service standard recommendations within the Global Grid Forum. It defines the generic requirements for accessing and integrating persistent structured and semi-structured data. In addition, it defines the generic functionalities which a Grid data service needs to provide in supporting discovery of and controlled access to data, in performing data manipulation operations, and in virtualising data resources. The document also defines the scope of Grid data service standard recommendations which are presented in a separate document. JF - Global Grid Forum ER - TY - CONF T1 - A new permutation model for solving the graph k-coloring problem T2 - Kalmàr Workshop on Logic and Computer Science Y1 - 2003 A1 - Juhos, I. A1 - Tóth, A. A1 - Tezuka, M. A1 - Tann, P. A1 - van Hemert, J. I. KW - constraint satisfaction KW - graph colouring AB - This paper describes a novel representation and ordering model that, aided by an evolutionary algorithm, is used in solving the graph k-coloring problem. 
A comparison is made between the new representation and an improved version of the traditional graph coloring technique DSATUR on an extensive list of graph k-coloring problem instances with different properties. The results show that our model outperforms the improved DSATUR on most of the problem instances. JF - Kalmàr Workshop on Logic and Computer Science ER - TY - JOUR T1 - The pervasiveness of evolution in GRUMPS software JF - Softw., Pract. Exper. Y1 - 2003 A1 - Evans, Huw A1 - Atkinson, Malcolm P. A1 - Brown, Margaret A1 - Cargill, Julie A1 - Crease, Murray A1 - Draper, Steve A1 - Gray, Philip D. A1 - Thomas, Richard VL - 33 ER - TY - CHAP T1 - Rationale for Choosing the Open Grid Services Architecture T2 - Grid Computing: Making the Global Infrastructure a Reality Y1 - 2003 A1 - Atkinson, M. ED - F. Berman ED - G. Fox ED - T. Hey JF - Grid Computing: Making the Global Infrastructure a Reality PB - John Wiley & Sons, Ltd CY - Chichester, UK SN - 9780470853191 ER - TY - JOUR T1 - Application of the methodology combined formal method with Object-Oriented technology in E-commerce JF - Journal of Computer Engineering Y1 - 2002 A1 - Liangxiu Han VL - 29 IS - z1 ER - TY - CONF T1 - Comparing Classical Methods for Solving Binary Constraint Satisfaction Problems with State of the Art Evolutionary Computation T2 - Springer Lecture Notes on Computer Science Y1 - 2002 A1 - van Hemert, J. I. ED - S. Cagnoni ED - J. Gottlieb ED - E. Hart ED - M. Middendorf ED - G. Raidl KW - constraint satisfaction AB - Constraint Satisfaction Problems form a class of problems that are generally computationally difficult and have been addressed with many complete and heuristic algorithms. We present two complete algorithms, as well as two evolutionary algorithms, and compare them on randomly generated instances of binary constraint satisfaction problems. We find that the evolutionary algorithms are less effective than the classical techniques. 
JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Criticality-Based Task Composition in Distributed Bioinformatics Systems T2 - Proceedings of the Twelfth International Conference on Intelligent Systems for Molecular Biology Y1 - 2002 A1 - Karasavvas, K. A1 - Baldock, R. A1 - Burger, A. JF - Proceedings of the Twelfth International Conference on Intelligent Systems for Molecular Biology ER - TY - JOUR T1 - Database indexing for large DNA and protein sequence collections JF - VLDB J. Y1 - 2002 A1 - Hunt, Ela A1 - Atkinson, Malcolm P. A1 - Irving, Robert W. VL - 11 ER - TY - CONF T1 - Measuring the Searched Space to Guide Efficiency: The Principle and Evidence on Constraint Satisfaction T2 - Springer Lecture Notes on Computer Science Y1 - 2002 A1 - van Hemert, J. I. A1 - Bäck, T. ED - J. J. Merelo ED - A. Panagiotis ED - H.-G. Beyer ED - José-Luis Fernández-Villacañas ED - Hans-Paul Schwefel KW - constraint satisfaction KW - resampling ratio AB - In this paper we present a new tool to measure the efficiency of evolutionary algorithms by storing the whole searched space of a run, a process whereby we gain insight into the number of distinct points in the state space an algorithm has visited as opposed to the number of function evaluations done within the run. This investigation demonstrates a certain inefficiency of the classical mutation operator with mutation-rate 1/l, where l is the dimension of the state space. Furthermore we present a model for predicting this inefficiency and verify it empirically using the new tool on binary constraint satisfaction problems. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-44139-5 ER - TY - CONF T1 - A Multi-Agent Bioinformatics Integration System with Adjustable Autonomy: An Overview T2 - Proceedings of the First International Conference on Autonomous Agents and Multi-Agent Systems Y1 - 2002 A1 - Karasavvas, K. A1 - Burger, A. 
A1 - Baldock, R. JF - Proceedings of the First International Conference on Autonomous Agents and Multi-Agent Systems PB - ACM ER - TY - CONF T1 - A Multi-Agent Bioinformatics Integration System with Adjustable Autonomy T2 - Lecture Notes in Computer Science Y1 - 2002 A1 - Karasavvas, K. A1 - Burger, A. A1 - Baldock, R. JF - Lecture Notes in Computer Science VL - 2417 ER - TY - JOUR T1 - A network traffic model based on the cascade process JF - Journal of Mini-Micro Computer System Y1 - 2002 A1 - Liangxiu Han A1 - yan Liu A1 - Zhiwei Cen VL - 23 IS - 12 ER - TY - JOUR T1 - A new multifractal network traffic model JF - Journal of Chaos, solitons & fractals Y1 - 2002 A1 - Liangxiu Han A1 - Zhiwei Ceng A1 - Chuanshan Gao PB - Elsevier Science VL - 13 IS - 7 ER - TY - CONF T1 - Use of Evolutionary Algorithms for Telescope Scheduling T2 - Integrated Modeling of Telescopes Y1 - 2002 A1 - Grim, R. A1 - Jansen, M. L. M. A1 - Baan, A. A1 - van Hemert, J. I. A1 - de Wolf, H. ED - Torben Anderson KW - constraint satisfaction KW - scheduling AB - LOFAR, a new radio telescope, will be designed to observe with up to 8 independent beams, thus allowing several simultaneous observations. Scheduling of multiple observations parallel in time, each having their own constraints, requires a more intelligent and flexible scheduling function than operated before. In support of the LOFAR radio telescope project, and in co-operation with Leiden University, Fokker Space has started a study to investigate the suitability of the use of evolutionary algorithms applied to complex scheduling problems. After a positive familiarisation phase, we now examine the potential use of evolutionary algorithms via a demonstration project. Results of the familiarisation phase, and the first results of the demonstration project are presented in this paper. 
JF - Integrated Modeling of Telescopes PB - The International Society for Optical Engineering (SPIE) VL - 4757 ER - TY - CONF T1 - Adaptive Genetic Programming Applied to New and Existing Simple Regression Problems T2 - Springer Lecture Notes on Computer Science Y1 - 2001 A1 - Eggermont, J. A1 - van Hemert, J. I. ED - J. Miller ED - Tomassini, M. ED - P. L. Lanzi ED - C. Ryan ED - A. G. B. Tettamanzi ED - W. B. Langdon KW - data mining AB - In this paper we continue our study on adaptive genetic programming. We use Stepwise Adaptation of Weights to boost performance of a genetic programming algorithm on simple symbolic regression problems. We measure the performance of a standard GP and two variants of SAW extensions on two different symbolic regression problems from literature. Also, we propose a model for randomly generating polynomials which we then use to further test all three GP variants. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 9-783540-418993 ER - TY - JOUR T1 - The characterizing network traffic based on the wavelet technique JF - Journal of Mini-Micro Computer System Y1 - 2001 A1 - Liangxiu Han A1 - Cong Suo VL - 22 IS - 9 ER - TY - CONF T1 - A Database Index to Large Biological Sequences T2 - VLDB Y1 - 2001 A1 - Hunt, Ela A1 - Atkinson, Malcolm P. A1 - Irving, Robert W. JF - VLDB ER - TY - JOUR T1 - An efficient object promotion algorithm for persistent object systems JF - Softw., Pract. Exper. Y1 - 2001 A1 - Printezis, Tony A1 - Atkinson, Malcolm P. VL - 31 ER - TY - CONF T1 - An Engineering Approach to Evolutionary Art T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) Y1 - 2001 A1 - van Hemert, J. I. A1 - Jansen, M. L. M. ED - Lee Spector ED - Erik D. Goodman ED - Annie Wu ED - W. B. Langdon ED - Hans-Michael Voigt ED - Mitsuo Gen ED - Sandip Sen ED - Marco Dorigo ED - Shahram Pezeshk ED - Max H. 
Garzon ED - Edmund Burke KW - evolutionary art AB - We present a general system that evolves art on the Internet. The system runs on a server which enables it to collect information about its usage world wide; its core uses operators and representations from genetic programming. We show two types of art that can be evolved using this general system. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CONF T1 - Evolutionary Computation in Constraint Satisfaction and Machine Learning --- An abstract of my PhD. T2 - Proceedings of the Brussels Evolutionary Algorithms Day (BEAD-2001) Y1 - 2001 A1 - van Hemert, J. I. ED - Anne Defaweux ED - Bernard Manderick ED - Tom Lenearts ED - Johan Parent ED - Piet van Remortel KW - constraint satisfaction KW - data mining JF - Proceedings of the Brussels Evolutionary Algorithms Day (BEAD-2001) PB - Vrije Universiteit Brussel (VUB) ER - TY - CONF T1 - A ``Futurist'' approach to dynamic environments T2 - Proceedings of the Workshops at the Genetic and Evolutionary Computation Conference, Dynamic Optimization Problems Y1 - 2001 A1 - van Hemert, J. I. A1 - Van Hoyweghen, C. A1 - Lukschandl, E. A1 - Verbeeck, K. ED - J. Branke ED - Th. Bäck KW - dynamic problems AB - The optimization of dynamic environments has proved to be a difficult area for Evolutionary Algorithms. As standard haploid populations find it difficult to track a moving target, different schemes have been suggested to improve the situation. We study a novel approach by making use of a meta learner which tries to predict the next state of the environment, i.e. the next value of the goal the individuals have to achieve, by making use of the accumulated knowledge from past performance. 
JF - Proceedings of the Workshops at the Genetic and Evolutionary Computation Conference, Dynamic Optimization Problems PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CONF T1 - The GRUMPS Architecture: Run-time Evolution in a Large Scale Distributed System T2 - Proceedings of the Workshop on Engineering Complex Object-Oriented Solutions for Evolution (ECOOSE), held as part of OOPSLA 2001. Y1 - 2001 A1 - Evans, Huw A1 - Peter Dickman A1 - Malcolm Atkinson AB - This paper describes the first version of the distributed programming architecture for the Grumps1 project. The architecture consists of objects that communicate in terms of both asynchronous and synchronous events. A novel three-level extensible naming scheme is discussed that allows Grumps developers to deploy systems that can refer to entities not identified at the time when the Grumps system and application-level code were implemented. Examples detailing how the topology of a Grumps system may be changed at run-time and how new object implementations may be distributed during system execution are given. The separation of policy from mechanism is shown to be a major part of how system evolution is supported and this is made even more flexible when expressed through the use of Java interfaces for crucial core concepts. JF - Proceedings of the Workshop on Engineering Complex Object-Oriented Solutions for Evolution (ECOOSE), held as part of OOPSLA 2001. ER - TY - BOOK T1 - GRUMPS Summer Anthology, 2001 Y1 - 2001 A1 - Atkinson, M. A1 - Brown, M. A1 - Cargill, J. A1 - Crease, M. A1 - Draper, S. A1 - Evans, H. A1 - Gray, P. A1 - Mitchell, C. A1 - Ritchie, M. A1 - Thomas, R. AB - This is the first collection of papers from GRUMPS [http://grumps.dcs.gla.ac.uk]. The project only started up in February 2001, and this collection (frozen at 1 Sept 2001) shows that it got off to a productive start. 
Versions of some of these papers have been submitted to conferences and workshops: the website will have more information on publication status and history. GRUMPS decided to begin with a first study, partly to help the team coalesce. This involved installing two pieces of software in a first year computing science lab: one (the "UAR") to record a large volume of student actions at a low level with a view to mining them later, another (the "LSS") directly designed to assist tutor-student interaction. Some of the papers derive from that, although more are planned. Results from this first study can be found on the website. The project also has a link to UWA in Perth, Western Australia, where related software has already been developed and used as described in one of the papers. Another project strand concerns using handsets in lecture theatres to support interactivity there, as two other papers describe. As yet unrepresented in this collection, GRUMPS will also be entering the bioinformatics application area. The GRUMPS project operates on several levels. It is based in the field of Distributed Information Management (DIM), expecting to cover both mobile and static nodes, synchronous and detached clients, high and low volume data sources. The specific focus of the project (see the original proposal on the web site) is to address records of computational activity (where any such pre-existing usage might have extra record collection installed) and data experimentation, where the questions to be asked of the data emerge concurrently with data collection which will therefore be dynamically modifiable: a requirement that further pushes on the space of DIM. The level above concerns building and making usable tools for asking questions of the data, or rather of the activities that generate the data. Above that again is the application domain level: what the original computational activities serve, education and bioinformatics being two identified cases. 
The GRUMPS team is therefore multidisciplinary, from DIM architecture researchers to educational evaluators. The mix of papers reflects this. PB - Academic Press ER - TY - CONF T1 - A new multifractal traffic model based on the wavelet transform T2 - ISCA 14th International Conference on Parallel and Distributed Computing systems Y1 - 2001 A1 - Chuanshan Gao A1 - Liangxiu Han JF - ISCA 14th International Conference on Parallel and Distributed Computing systems CY - Texas, USA ER - TY - CHAP T1 - Persistence and Java — A Balancing Act T2 - Objects and Databases Y1 - 2001 A1 - Atkinson, M. ED - Klaus Dittrich ED - Giovanna Guerrini ED - Isabella Merlo ED - Marta Oliva ED - M. Elena Rodriguez AB - Large scale and long-lived application systems, enterprise applications, require persistence, that is provision of storage for many of their data structures. The JavaTM programming language is a typical example of a strongly-typed, object-oriented programming language that is becoming popular for building enterprise applications. It therefore needs persistence. The present options for obtaining this persistence are reviewed. We conclude that the Orthogonal Persistence Hypothesis, OPH, is still persuasive. It states that the universal and automated provision of longevity or brevity for all data will significantly enhance developer productivity and improve applications. This position paper reports on the PJama project with particular reference to its test of the OPH. We review why orthogonal persistence has not been taken up widely, and why the OPH is still incompletely tested. This leads to a more general challenge of how to conduct experiments which reveal large-scale and long-term effects and some thoughts on how that challenge might be addressed by the software research community. 
JF - Objects and Databases T3 - Lecture Notes in Computer Science PB - Springer VL - 1944 UR - http://www.springerlink.com/content/8t7x3m1ehtdqk4bm/?p=7ece1338fff3480b83520df395784cc6&pi=0 ER - TY - CHAP T1 - Scalable and Recoverable Implementation of Object Evolution for the PJama1 Platform T2 - Persistent Object Systems: Design, Implementation, and Use 9th International Workshop, POS-9 Lillehammer, Norway, September 6–8, 2000 Revised Papers Y1 - 2001 A1 - Atkinson, M. P. A1 - Dmitriev, M. A. A1 - Hamilton, C. A1 - Printezis, T. ED - Graham N. C. Kirby ED - Alan Dearle ED - Dag I. K. Sjøberg AB - PJama1 is the latest version of an orthogonally persistent platform for Java. It depends on a new persistent object store, Sphere, and provides facilities for class evolution. This evolution technology supports an arbitrary set of changes to the classes, which may have arbitrarily large populations of persistent objects. We verify that the changes are safe. When there are format changes, we also convert all of the instances, while leaving their identities unchanged. We aspire to both very large persistent object stores and freedom for developers to specify arbitrary conversion methods in Java to convey information from old to new formats. Evolution operations must be safe and the evolution cost should be approximately linear in the number of objects that must be reformatted. In order that these conversion methods can be written easily, we continue to present the pre-evolution state consistently to Java executions throughout an evolution. At the completion of applying all of these transformations, we must switch the store state to present only the post-evolution state, with object identity preserved. We present an algorithm that meets these requirements for eager, total conversion. This paper focuses on the mechanisms built into Sphere to support safe, atomic and scalable evolution. 
We report our experiences in using this technology and include a preliminary set of performance measurements. JF - Persistent Object Systems: Design, Implementation, and Use 9th International Workshop, POS-9 Lillehammer, Norway, September 6–8, 2000 Revised Papers T3 - Lecture Notes in Computer Science PB - Springer VL - 2135 UR - http://www.springerlink.com/content/09hx07h9lw0p1h82/?p=2bc20319905146bab8ba93b2fcc8cc01&pi=23 ER - TY - CONF T1 - Constraint Satisfaction Problems and Evolutionary Algorithms: A Reality Check T2 - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) Y1 - 2000 A1 - van Hemert, J. I. ED - van den Bosch, A. ED - H. Weigand KW - constraint satisfaction AB - Constraint satisfaction has been the subject of many studies. Different areas of research have tried to solve all kinds of constraint problems. Here we will look at a general model for constraint satisfaction problems in the form of binary constraint satisfaction. The problems generated from this model are studied in the research area of constraint programming and in the research area of evolutionary computation. This paper provides an empirical comparison of two techniques from each area. Basically, this is a check on how well both areas are doing. It turns out that, although evolutionary algorithms are doing well, classic approaches are still more successful. JF - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - JOUR T1 - De Creatieve Computer JF - AIgg Kennisgeving Y1 - 2000 A1 - van Hemert, J. I. KW - evolutionary art AB - Here we show an application that generates images resembling art as it was produced by Mondriaan, a Dutch artist, well known for his minimalistic and pure abstract pieces of art. The current version generates images using a linear chromosome and a recursive function as a decoder. 
PB - Artificiële Intelligentie gebruikers groep VL - 13 N1 - invited article (in Dutch) ER - TY - JOUR T1 - Guest editorial JF - VLDB J. Y1 - 2000 A1 - Atkinson, Malcolm P. VL - 9 ER - TY - CONF T1 - Managing Transparency in Distributed Bioinformatics Systems T2 - European Media Lab Workshop on Management and Integration of Biochemical Data Y1 - 2000 A1 - Karasavvas, K. A1 - Baldock, R. A1 - Burger, A. JF - European Media Lab Workshop on Management and Integration of Biochemical Data ER - TY - CONF T1 - Measurement and analysis of IP network traffic T2 - In Proceedings of the 3th International Asia-Pacific Web Conference Y1 - 2000 A1 - cen, Z A1 - Gao, C A1 - Cong S A1 - Han, L JF - In Proceedings of the 3th International Asia-Pacific Web Conference CY - xi'an China ER - TY - CONF T1 - Persistence and Java - A Balancing Act T2 - Objects and Databases Y1 - 2000 A1 - Atkinson, Malcolm P. JF - Objects and Databases ER - TY - CONF T1 - Scalable and Recoverable Implementation of Object Evolution for the PJama1 Platform T2 - POS Y1 - 2000 A1 - Atkinson, Malcolm P. A1 - Dmitriev, Misha A1 - Hamilton, Craig A1 - Printezis, Tony JF - POS ER - TY - CONF T1 - Stepwise Adaptation of Weights for Symbolic Regression with Genetic Programming T2 - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) Y1 - 2000 A1 - Eggermont, J. A1 - van Hemert, J. I. ED - van den Bosch, A. ED - H. Weigand KW - data mining KW - genetic programming AB - In this paper we continue our study on the Stepwise Adaptation of Weights (SAW) technique. Previous studies on constraint satisfaction and data classification have indicated that SAW is a promising technique to boost the performance of evolutionary algorithms. Here we use SAW to boost performance of a genetic programming algorithm on simple symbolic regression problems. We measure the performance of a standard GP and two variants of SAW extensions on two different symbolic regression problems. 
JF - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - Adapting the Fitness Function in GP for Data Mining T2 - Springer Lecture Notes on Computer Science Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - R. Poli ED - P. Nordin ED - W. B. Langdon ED - T. C. Fogarty KW - data mining KW - genetic programming AB - In this paper we describe how the Stepwise Adaptation of Weights (SAW) technique can be applied in genetic programming. The SAW-ing mechanism has been originally developed for and successfully used in EAs for constraint satisfaction problems. Here we identify the very basic underlying ideas behind SAW-ing and point out how it can be used for different types of problems. In particular, SAW-ing is well suited for data mining tasks where the fitness of a candidate solution is composed by `local scores' on data records. We evaluate the power of the SAW-ing mechanism on a number of benchmark classification data sets. The results indicate that extending the GP with the SAW-ing feature increases its performance when different types of misclassifications are not weighted differently, but leads to worse results when they are. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-65899-8 ER - TY - CONF T1 - Comparing genetic programming variants for data classification T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - E. Postma ED - M. Gyssens KW - classification KW - data mining KW - genetic programming AB - This article is a combined summary of two papers written by the authors. Binary data classification problems (with exactly two disjoint classes) form an important application area of machine learning techniques, in particular genetic programming (GP). 
In this study we compare a number of different variants of GP applied to such problems whereby we investigate the effect of two significant changes in a fixed GP setup in combination with two different evolutionary models. JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - A comparison of genetic programming variants for data classification T2 - Springer Lecture Notes on Computer Science Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - D. J. Hand ED - J. N. Kok ED - M. R. Berthold KW - classification KW - data mining KW - genetic programming AB - In this paper we report the results of a comparative study on different variations of genetic programming applied on binary data classification problems. The first genetic programming variant is weighting data records for calculating the classification error and modifying the weights during the run. Hereby the algorithm is defining its own fitness function in an on-line fashion giving higher weights to `hard' records. Another novel feature we study is the atomic representation, where `Booleanization' of data is not performed at the root, but at the leaves of the trees and only Boolean functions are used in the trees' body. As a third aspect we look at generational and steady-state models in combination of both features. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-66332-0 ER - TY - CONF T1 - Defining and Handling Transient Fields in PJama T2 - DBPL Y1 - 1999 A1 - Printezis, Tony A1 - Atkinson, Malcolm P. A1 - Jordan, Mick J. JF - DBPL ER - TY - CONF T1 - Evolutionary Data Conversion in the PJama Persistent Language T2 - ECOOP Workshops Y1 - 1999 A1 - Dmitriev, Misha A1 - Atkinson, Malcolm P. 
JF - ECOOP Workshops ER - TY - CONF T1 - Evolutionary Data Conversion in the PJama Persistent Language T2 - ECOOP Workshop on Object-Oriented Databases Y1 - 1999 A1 - Dmitriev, Misha A1 - Atkinson, Malcolm P. JF - ECOOP Workshop on Object-Oriented Databases ER - TY - CONF T1 - Issues Raised by Three Years of Developing PJama: An Orthogonally Persistent Platform for Java T2 - ICDT Y1 - 1999 A1 - Atkinson, Malcolm P. A1 - Jordan, Mick J. JF - ICDT ER - TY - CONF T1 - Mondriaan Art by Evolution T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - van Hemert, J. I. A1 - Eiben, A. E. ED - E. Postma ED - M. Gyssens KW - evolutionary art AB - Here we show an application that generates images resembling art as it was produced by Mondriaan, a Dutch artist, well known for his minimalistic and pure abstract pieces of art. The current version generates images using a linear chromosome and a recursive function as a decoder. JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - JOUR T1 - Neural network applied to the prediction of the failure stress for pressurized cylinders containing defects JF - International Journal of Pressure Vessels and Piping Y1 - 1999 A1 - Lianghao Han A1 - Liangxiu Han A1 - zengdian Liu PB - Elsevier VL - 76 IS - 4 ER - TY - CONF T1 - Population dynamics and emerging features in AEGIS T2 - Proceedings of the Genetic and Evolutionary Computation Conference Y1 - 1999 A1 - Eiben, A. E. A1 - Elia, D. A1 - van Hemert, J. I. ED - W. Banzhaf ED - J. Daida ED - Eiben, A. E. ED - M. H. Garzon ED - V. Honavar ED - M. Jakiela ED - R. E. Smith KW - dynamic problems AB - We describe an empirical investigation within an artificial world, aegis, where a population of animals and plants is evolving. 
We compare different system setups in search of an `ideal' world that allows a constantly high number of inhabitants for a long period of time. We observe that high responsiveness at individual level (speed of movement) or population level (high fertility) are `ideal'. Furthermore, we investigate the emergence of the so-called mental features of animals determining their social, consumptional and aggressive behaviour. The tests show that being socially oriented is generally advantageous, while aggressive behaviour only emerges under specific circumstances. JF - Proceedings of the Genetic and Evolutionary Computation Conference PB - Morgan Kaufmann Publishers, San Francisco ER - TY - JOUR T1 - Questions considered in object-oriented software quality metrics based on Java environment JF - Journal of East China University of Science and Technology Y1 - 1999 A1 - Liangxiu Han PB - East China University of Science and Technology ER - TY - CHAP T1 - SAW-ing EAs: adapting the fitness function for solving constrained problems T2 - New ideas in optimization Y1 - 1999 A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - D. Corne ED - M. Dorigo ED - F. Glover KW - constraint satisfaction AB - In this chapter we describe a problem-independent method for treating constraints in an evolutionary algorithm. Technically, this method amounts to changing the definition of the fitness function during a run of an EA, based on feedback from the search process. Obviously, redefining the fitness function means redefining the problem to be solved. In the short term this deceives the algorithm, making the fitness values deteriorate, but as experiments clearly indicate, in the long run it is beneficial. We illustrate the power of the method on different constraint satisfaction problems and point out other application areas of this technique. 
JF - New ideas in optimization PB - McGraw-Hill, London ER - TY - Generic T1 - VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK Y1 - 1999 A1 - Atkinson, Malcolm P. A1 - Maria E. Orlowska A1 - Patrick Valduriez A1 - Stanley B. Zdonik A1 - Michael L. Brodie ED - Atkinson, Malcolm P. ED - Maria E. Orlowska ED - Patrick Valduriez ED - Stanley B. Zdonik ED - Michael L. Brodie PB - Morgan Kaufmann SN - 1-55860-615-7 ER - TY - CONF T1 - Extended abstract: Solving Binary Constraint Satisfaction Problems using Evolutionary Algorithms with an Adaptive Fitness Function T2 - Proceedings of the Xth Netherlands/Belgium Conference on Artificial Intelligence (NAIC'98) Y1 - 1998 A1 - Eiben, A. E. A1 - van Hemert, J. I. A1 - Marchiori, E. A1 - Steenbeek, A. G. ED - la Poutré, J. A. ED - van den Herik, J. KW - constraint satisfaction JF - Proceedings of the Xth Netherlands/Belgium Conference on Artificial Intelligence (NAIC'98) PB - BNVKI, Dutch and the Belgian AI Association N1 - Abstract of \cite{EHMS98} ER - TY - JOUR T1 - Graph Coloring with Adaptive Evolutionary Algorithms JF - Journal of Heuristics Y1 - 1998 A1 - Eiben, A. E. A1 - van der Hauw, J. K. A1 - van Hemert, J. I. KW - constraint satisfaction KW - graph colouring AB - This paper presents the results of an experimental investigation on solving graph coloring problems with Evolutionary Algorithms (EA). After testing different algorithm variants we conclude that the best option is an asexual EA using order-based representation and an adaptation mechanism that periodically changes the fitness function during the evolution. This adaptive EA is general, using no domain specific knowledge, except, of course, from the decoder (fitness function). We compare this adaptive EA to a powerful traditional graph coloring technique DSatur and the Grouping GA on a wide range of problem instances with different size, topology and edge density. 
The results show that the adaptive EA is superior to the Grouping GA and outperforms DSatur on the hardest problem instances. Furthermore, it scales up better with the problem size than the other two algorithms and indicates a linear computational complexity. PB - Kluwer Academic Publishers VL - 4 ER - TY - CONF T1 - Solving Binary Constraint Satisfaction Problems using Evolutionary Algorithms with an Adaptive Fitness Function T2 - Springer Lecture Notes on Computer Science Y1 - 1998 A1 - Eiben, A. E. A1 - van Hemert, J. I. A1 - Marchiori, E. A1 - Steenbeek, A. G. ED - Eiben, A. E. ED - Th. Bäck ED - M. Schoenauer ED - H.-P. Schwefel KW - constraint satisfaction AB - This paper presents a comparative study of Evolutionary Algorithms (EAs) for Constraint Satisfaction Problems (CSPs). We focus on EAs where fitness is based on penalization of constraint violations and the penalties are adapted during the execution. Three different EAs based on this approach are implemented. For highly connected constraint networks, the results provide further empirical support to the theoretical prediction of the phase transition in binary CSPs. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - JOUR T1 - On Zero-symmetric BZ-algebras JF - Journal of East China University of Science and Technology Y1 - 1997 A1 - Wang, Y. A1 - Liangxiu Han PB - East China University of Science and Technology VL - 23 IS - 6 ER -