TY - CONF
T1 - C2MS: Dynamic Monitoring and Management of Cloud Infrastructures
T2 - IEEE CloudCom
Y1 - 2013
A1 - Gary McGilvary
A1 - Josep Rius
A1 - Íñigo Goiri
A1 - Francesc Solsona
A1 - Barker, Adam
A1 - Atkinson, Malcolm P.
AB - Server clustering is a common design principle employed by many organisations who require high availability, scalability and easier management of their infrastructure. Servers are typically clustered according to the service they provide whether it be the application(s) installed, the role of the server or server accessibility for example. In order to optimize performance, manage load and maintain availability, servers may migrate from one cluster group to another making it difficult for server monitoring tools to continuously monitor these dynamically changing groups. Server monitoring tools are usually statically configured and with any change of group membership requires manual reconfiguration; an unreasonable task to undertake on large-scale cloud infrastructures. In this paper we present the Cloudlet Control and Management System (C2MS); a system for monitoring and controlling dynamic groups of physical or virtual servers within cloud infrastructures. The C2MS extends Ganglia - an open source scalable system performance monitoring tool - by allowing system administrators to define, monitor and modify server groups without the need for server reconfiguration. In turn administrators can easily monitor group and individual server metrics on large-scale dynamic cloud infrastructures where roles of servers may change frequently. Furthermore, we complement group monitoring with a control element allowing administrator-specified actions to be performed over servers within service groups as well as introduce further customized monitoring metrics. This paper outlines the design, implementation and evaluation of the C2MS.
JF - IEEE CloudCom
CY - Bristol, UK
ER -

TY - BOOK
T1 - The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business
T2 - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya)
Y1 - 2013
A1 - Atkinson, Malcolm P.
A1 - Baxter, Robert M.
A1 - Peter Brezany
A1 - Oscar Corcho
A1 - Michelle Galea
A1 - Parsons, Mark
A1 - Snelling, David
A1 - van Hemert, Jano
KW - Big Data
KW - Data Intensive
KW - data mining
KW - Data Streaming
KW - Databases
KW - Dispel
KW - Distributed Computing
KW - Knowledge Discovery
KW - Workflows
AB - With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasising data-intensive thinking and interdisciplinary collaboration, The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation.
Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book: * Outlines the concepts and rationale for implementing data-intensive computing in organisations * Covers from the ground up problem-solving strategies for data analysis in a data-rich world * Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL * Features in-depth case studies in customer relations, environmental hazards, seismology, and more * Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering * Includes sample program snippets throughout the text as well as additional materials on a companion website The DATA Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing.
JF - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya)
PB - John Wiley & Sons Inc.
SN - 978-1-118-39864-7
ER -

TY - CHAP
T1 - Data-Intensive Analysis
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Oscar Corcho
A1 - van Hemert, Jano
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - data mining
KW - Data-Analysis Experts
KW - Data-Intensive Analysis
KW - Knowledge Discovery
AB - Part II: "Data-intensive Knowledge Discovery", focuses on the needs of data-analysis experts. It illustrates the problem-solving strategies appropriate for a data-rich world, without delving into the details of underlying technologies.
It should engage and inform data-analysis specialists, such as statisticians, data miners, image analysts, bio-informaticians or chemo-informaticians, and generate ideas pertinent to their application areas. Chapter 5: "Data-intensive Analysis", introduces a set of common problems that data-analysis experts often encounter, by means of a set of scenarios of increasing levels of complexity. The scenarios typify knowledge discovery challenges and the presented solutions provide practical methods; a starting point for readers addressing their own data challenges.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Data-Intensive Components and Usage Patterns
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Oscar Corcho
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data Analysis
KW - data mining
KW - Data-Intensive Components
KW - Registry
KW - Workflow Libraries
KW - Workflow Sharing
AB - Chapter 7: "Data-intensive components and usage patterns", provides a systematic review of the components that are commonly used in knowledge discovery tasks as well as common patterns of component composition. That is, it introduces the processing elements from which knowledge discovery solutions are built and common composition patterns for delivering trustworthy information. It reflects on how these components and patterns are evolving in a data-intensive context.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - The Data-Intensive Survival Guide
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Malcolm Atkinson
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data-Analysis Experts
KW - Data-Intensive Architecture
KW - Data-intensive Computing
KW - Data-Intensive Engineers
KW - Datascopes
KW - Dispel
KW - Domain Experts
KW - Intellectual Ramps
KW - Knowledge Discovery
KW - Workflows
AB - Chapter 3: "The data-intensive survival guide", presents an overview of all of the elements of the proposed data-intensive strategy. Sufficient detail is presented for readers to understand the principles and practice that we recommend. It should also provide a good preparation for readers who choose to sample later chapters. It introduces three professional viewpoints: domain experts, data-analysis experts, and data-intensive engineers. Success depends on a balanced approach that develops the capacity of all three groups. A data-intensive architecture provides a flexible framework for that balanced approach. This enables the three groups to build and exploit data-intensive processes that incrementally step from data to results. A language is introduced to describe these incremental data processes from all three points of view. The chapter introduces ‘datascopes’ as the productized data-handling environments and ‘intellectual ramps’ as the ‘on ramps’ for the highways from data to knowledge.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Data-Intensive Thinking with DISPEL
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Malcolm Atkinson
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data-Intensive Machines
KW - Data-Intensive Thinking, Data-intensive Computing
KW - Dispel
KW - Distributed Computing
KW - Knowledge Discovery
AB - Chapter 4: "Data-intensive thinking with DISPEL", engages the reader with technical issues and solutions, by working through a sequence of examples, building up from a sketch of a solution to a large-scale data challenge. It uses the DISPEL language extensively, introducing its concepts and constructs. It shows how DISPEL may help designers, data-analysts, and engineers develop solutions to the requirements emerging in any data-intensive application domain. The reader is taken through simple steps initially, this then builds to conceptually complex steps that are necessary to cope with the realities of real data providers, real data, real distributed systems, and long-running processes.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Inc.
ER -

TY - CHAP
T1 - Definition of the DISPEL Language
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Paul Martin
A1 - Yaikhom, Gagarine
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data Streaming
KW - Data-intensive Computing
KW - Dispel
AB - Chapter 10: "Definition of the DISPEL language", describes the novel aspects of the DISPEL language: its constructs, capabilities, and anticipated programming style.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
T3 - {Parallel and Distributed Computing, series editor Albert Y. Zomaya}
PB - John Wiley & Sons Inc.
ER -

TY - CONF
T1 - The demand for consistent web-based workflow editors
T2 - Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science
Y1 - 2013
A1 - Gesing, Sandra
A1 - Atkinson, Malcolm
A1 - Klampanos, Iraklis
A1 - Galea, Michelle
A1 - Berthold, Michael R.
A1 - Barbera, Roberto
A1 - Scardaci, Diego
A1 - Terstyanszky, Gabor
A1 - Kiss, Tamas
A1 - Kacsuk, Peter
KW - web-based workflow editors
KW - workflow composition
KW - workflow interoperability
KW - workflow languages and concepts
JF - Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science
PB - ACM
CY - New York, NY, USA
SN - 978-1-4503-2502-8
UR - http://doi.acm.org/10.1145/2534248.2534260
ER -

TY - CHAP
T1 - The Digital-Data Challenge
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Malcolm Atkinson
A1 - Parsons, Mark
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Big Data
KW - Data-intensive Computing, Knowledge Discovery
KW - Digital Data
KW - Digital-Data Revolution
AB - Part I: Strategies for success in the digital-data revolution, provides an executive summary of the whole book to convince strategists, politicians, managers, and educators that our future data-intensive society requires new thinking, new behavior, new culture, and new distribution of investment and effort. This part will introduce the major concepts so that readers are equipped to discuss and steer their organization’s response to the opportunities and obligations brought by the growing wealth of data.
It will help readers understand the changing context brought about by advances in digital devices, digital communication, and ubiquitous computing. Chapter 1: The digital-data challenge, will help readers to understand the challenges ahead in making good use of the data and introduce ideas that will lead to helpful strategies. A global digital-data revolution is catalyzing change in the ways in which we live, work, relax, govern, and organize. This is a significant change in society, as important as the invention of printing or the industrial revolution, but more challenging because it is happening globally at Internet speed. Becoming agile in adapting to this new world is essential.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - The Digital-Data Revolution
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Malcolm Atkinson
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data
KW - Information
KW - Knowledge
KW - Knowledge Discovery
KW - Social Impact of Digital Data
KW - Wisdom, Data-intensive Computing
AB - Chapter 2: "The digital-data revolution", reviews the relationships between data, information, knowledge, and wisdom. It analyses and quantifies the changes in technology and society that are delivering the data bonanza, and then reviews the consequential changes via representative examples in biology, Earth sciences, social sciences, leisure activity, and business. It exposes quantitative details and shows the complexity and diversity of the growing wealth of data, introducing some of its potential benefits and examples of the impediments to successfully realizing those benefits.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - DISPEL Development
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Adrian Mouat
A1 - Snelling, David
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Diagnostics
KW - Dispel
KW - IDE
KW - Libraries
KW - Processing Elements
AB - Chapter 11: "DISPEL development", describes the tools and libraries that a DISPEL developer might expect to use. The tools include those needed during process definition, those required to organize enactment, and diagnostic aids for developers of applications and platforms.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Inc.
ER -

TY - CHAP
T1 - DISPEL Enactment
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Chee Sun Liew
A1 - Krause, Amrey
A1 - Snelling, David
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data Streaming
KW - Data-Intensive Engineering
KW - Dispel
KW - Workflow Enactment
AB - Chapter 12: "DISPEL enactment", describes the four stages of DISPEL enactment. It is targeted at the data-intensive engineers who implement enactment services.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Inc.
ER -

TY - JOUR
T1 - Exploiting Parallel R in the Cloud with SPRINT
JF - Methods of Information in Medicine
Y1 - 2013
A1 - Piotrowski, Michal
A1 - Gary McGilvary
A1 - Sloan, Terence
A1 - Mewissen, Muriel
A1 - Ashley Lloyd
A1 - Forster, Thorsten
A1 - Mitchell, Lawrence
A1 - Ghazal, Peter
A1 - Hill, Jon
AB - Background: Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives: Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods: The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results: It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation.
Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds.
VL - 52
IS - 1
ER -

TY - CHAP
T1 - Foreword
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Tony Hey
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Big Data
KW - Data-intensive Computing, Knowledge Discovery
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Platforms for Data-Intensive Analysis
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Snelling, David
ED - Malcolm Atkinson
ED - Baxter, Robert M.
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data-Intensive Engineering
KW - Data-Intensive Systems
KW - Dispel
KW - Distributed Systems
AB - Part III: "Data-intensive engineering", is targeted at technical experts who will develop complex applications, new components, or data-intensive platforms. The techniques introduced may be applied very widely; for example, to any data-intensive distributed application, such as index generation, image processing, sequence comparison, text analysis, and sensor-stream monitoring. The challenges, methods, and implementation requirements are illustrated by making extensive use of DISPEL. Chapter 9: "Platforms for data-intensive analysis", gives a reprise of data-intensive architectures, examines the business case for investing in them, and introduces the stages of data-intensive workflow enactment.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Preface
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Malcolm Atkinson
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Big Data, Data-intensive Computing, Knowledge Discovery
AB - Who should read the book and why. The structure and conventions used. Suggested reading paths for different categories of reader.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Problem Solving in Data-Intensive Knowledge Discovery
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Oscar Corcho
A1 - van Hemert, Jano
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data-Analysis Experts
KW - Data-Intensive Analysis
KW - Design Patterns for Knowledge Discovery
KW - Knowledge Discovery
AB - Chapter 6: "Problem solving in data-intensive knowledge discovery", on the basis of the previous scenarios, this chapter provides an overview of effective strategies in knowledge discovery, highlighting common problem-solving methods that apply in conventional contexts, and focusing on the similarities and differences of these methods.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CHAP
T1 - Sharing and Reuse in Knowledge Discovery
T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
Y1 - 2013
A1 - Oscar Corcho
ED - Malcolm Atkinson
ED - Rob Baxter
ED - Peter Brezany
ED - Oscar Corcho
ED - Michelle Galea
ED - Parsons, Mark
ED - Snelling, David
ED - van Hemert, Jano
KW - Data-Intensive Analysis
KW - Knowledge Discovery
KW - Ontologies
KW - Semantic Web
KW - Sharing
AB - Chapter 8: "Sharing and re-use in knowledge discovery", introduces more advanced knowledge discovery problems, and shows how improved component and pattern descriptions facilitate re-use. This supports the assembly of libraries of high level components well-adapted to classes of knowledge discovery methods or application domains. The descriptions are made more powerful by introducing notations from the semantic Web.
JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business
PB - John Wiley & Sons Ltd.
ER -

TY - CONF
T1 - Towards Addressing CPU-Intensive Seismological Applications in Europe
T2 - International Supercomputing Conference
Y1 - 2013
A1 - Michele Carpené
A1 - I.A. Klampanos
A1 - Siew Hoon Leong
A1 - Emanuele Casarotti
A1 - Peter Danecek
A1 - Graziella Ferini
A1 - Andre Gemünd
A1 - Amrey Krause
A1 - Lion Krischer
A1 - Federica Magnoni
A1 - Marek Simon
A1 - Alessandro Spinuso
A1 - Luca Trani
A1 - Malcolm Atkinson
A1 - Giovanni Erbacci
A1 - Anton Frank
A1 - Heiner Igel
A1 - Andreas Rietbrock
A1 - Horst Schwichtenberg
A1 - Jean-Pierre Vilotte
AB - Advanced application environments for seismic analysis help geoscientists to execute complex simulations to predict the behaviour of a geophysical system and potential surface observations. At the same time data collected from seismic stations must be processed comparing recorded signals with predictions.
The EU-funded project VERCE (http://verce.eu/) aims to enable specific seismological use-cases and, on the basis of requirements elicited from the seismology community, provide a service-oriented infrastructure to deal with such challenges. In this paper we present VERCE’s architecture, in particular relating to forward and inverse modelling of Earth models and how the, largely file-based, HPC model can be combined with data streaming operations to enhance the scalability of experiments. We posit that the integration of services and HPC resources in an open, collaborative environment is an essential medium for the advancement of sciences of critical importance, such as seismology.
JF - International Supercomputing Conference
CY - Leipzig, Germany
ER -

TY - CONF
T1 - User-friendly workflows in quantum chemistry
T2 - IWSG 2013
Y1 - 2013
A1 - Herres-Pawlis, Sonja
A1 - Balaskó, Ákos
A1 - Birkenheuer, Georg
A1 - Brinkmann, André
A1 - Gesing, Sandra
A1 - Grunzke, Richard
A1 - Hoffmann, Alexander
A1 - Kacsuk, Peter
A1 - Krüger, Jens
A1 - Packschies, Lars
A1 - Terstyansky, Gabor
A1 - Weingarten, Noam
JF - IWSG 2013
PB - CEUR Workshop Proceedings
CY - Zurich, Switzerland
UR - http://ceur-ws.org/Vol-993/paper14.pdf
ER -

TY - BOOK
T1 - Web-based Science Gateways for Structural Bioinformatics
Y1 - 2013
A1 - Gesing, Sandra
PB - University of Tübingen
UR - http://nbn-resolving.de/urn:nbn:de:bsz:21-opus-67822
ER -

TY - CONF
T1 - A Data Driven Science Gateway for Computational Workflows
T2 - UNICORE Summit 2012
Y1 - 2012
A1 - Grunzke, Richard
A1 - Birkenheuer, G.
A1 - Blunk, D.
A1 - Breuers, S.
A1 - Brinkmann, A.
A1 - Gesing, Sandra
A1 - Herres-Pawlis, S
A1 - Kohlbacher, O.
A1 - Krüger, J.
A1 - Kruse, M.
A1 - Müller-Pfefferkorn, R.
A1 - Schäfer, P.
A1 - Schuller, B.
A1 - Steinke, T.
A1 - Zink, A.
JF - UNICORE Summit 2012
ER -

TY - JOUR
T1 - Data-Intensive Architecture for Scientific Knowledge Discovery
JF - Distributed and Parallel Databases
Y1 - 2012
A1 - Atkinson, Malcolm P.
A1 - Chee Sun Liew
A1 - Michelle Galea
A1 - Paul Martin
A1 - Krause, Amrey
A1 - Adrian Mouat
A1 - Oscar Corcho
A1 - Snelling, David
KW - Knowledge discovery, workflow management system
AB - This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, imaging processing and seismology.
VL - 30
UR - http://dx.doi.org/10.1007/s10619-012-7105-3
IS - 5
ER -

TY - JOUR
T1 - EnzML: multi-label prediction of enzyme classes using InterPro signatures.
JF - BMC Bioinformatics
Y1 - 2012
A1 - De Ferrari, Luna
A1 - Stuart Aitken
A1 - van Hemert, Jano
A1 - Goryanin, Igor
AB - BACKGROUND: Manual annotation of enzymatic functions cannot keep up with automatic genome sequencing. In this work we explore the capacity of InterPro sequence signatures to automatically predict enzymatic function. RESULTS: We present EnzML, a multi-label classification method that can efficiently account also for proteins with multiple enzymatic functions: 50,000 in UniProt. EnzML was evaluated using a standard set of 300,747 proteins for which the manually curated Swiss-Prot and KEGG databases have agreeing Enzyme Commission (EC) annotations.
EnzML achieved more than 98% subset accuracy (exact match of all correct Enzyme Commission classes of a protein) for the entire dataset and between 87 and 97% subset accuracy in reannotating eight entire proteomes: human, mouse, rat, mouse-ear cress, fruit fly, the S. pombe yeast, the E. coli bacterium and the M. jannaschii archaebacterium. To understand the role played by the dataset size, we compared the cross-evaluation results of smaller datasets, either constructed at random or from specific taxonomic domains such as archaea, bacteria, fungi, invertebrates, plants and vertebrates. The results were confirmed even when the redundancy in the dataset was reduced using UniRef100, UniRef90 or UniRef50 clusters. CONCLUSIONS: InterPro signatures are a compact and powerful attribute space for the prediction of enzymatic function. This representation makes multi-label machine learning feasible in reasonable time (30 minutes to train on 300,747 instances with 10,852 attributes and 2,201 class values) using the Mulan Binary Relevance Nearest Neighbours algorithm implementation (BR-kNN).
VL - 13
ER -

TY - CONF
T1 - Generic User Management for Science Gateways via Virtual Organizations
T2 - EGI Technical Forum 2012
Y1 - 2012
A1 - Schlemmer, Tobias
A1 - Grunzke, Richard
A1 - Gesing, Sandra
A1 - Krüger, Jens
A1 - Birkenheuer, Georg
A1 - Müller-Pfefferkorn, Ralph
A1 - Kohlbacher, Oliver
JF - EGI Technical Forum 2012
ER -

TY - Generic
T1 - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences
Y1 - 2012
ED - Gesing, Sandra
ED - Glatard, Tristan
ED - Krüger, Jens
ED - Delgado Olabarriaga, Silvia
ED - Solomonides, Tony
ED - Silverstein, J.
ED - Montagnat, J.
ED - Gaignard, A.
ED - Krefting, Dagmar
PB - IOS Press
VL - 175
ER -

TY - CONF
T1 - The MoSGrid Community - From National to International Scale
T2 - EGI Community Forum 2012
Y1 - 2012
A1 - Gesing, Sandra
A1 - Herres-Pawlis, Sonja
A1 - Birkenheuer, Georg
A1 - Brinkmann, André
A1 - Grunzke, Richard
A1 - Kacsuk, Peter
A1 - Kohlbacher, Oliver
A1 - Kozlovszky, Miklos
A1 - Krüger, Jens
A1 - Müller-Pfefferkorn, Ralph
A1 - Schäfer, Patrick
A1 - Steinke, Thomas
JF - EGI Community Forum 2012
ER -

TY - CONF
T1 - MoSGrid: Progress of Workflow driven Chemical Simulations
T2 - Grid Workflow Workshop 2011
Y1 - 2012
A1 - Birkenheuer, Georg
A1 - Blunk, Dirk
A1 - Breuers, Sebastian
A1 - Brinkmann, André
A1 - Fels, Gregor
A1 - Gesing, Sandra
A1 - Grunzke, Richard
A1 - Herres-Pawlis, Sonja
A1 - Kohlbacher, Oliver
A1 - Krüger, Jens
A1 - Packschies, Lars
A1 - Schäfer, Patrick
A1 - Schuller, B.
A1 - Schuster, Johannes
A1 - Steinke, Thomas
A1 - Szikszay Fabri, Anna
A1 - Wewior, Martin
A1 - Müller-Pfefferkorn, Ralph
A1 - Kohlbacher, Oliver
JF - Grid Workflow Workshop 2011
PB - CEUR Workshop Proceedings
ER -

TY - JOUR
T1 - Requirements for Provenance on the Web
JF - IJDC
Y1 - 2012
A1 - Paul T.
Groth
A1 - Yolanda Gil
A1 - James Cheney
A1 - Simon Miles
VL - 7
ER -

TY - CONF
T1 - A Science Gateway Getting Ready for Serving the International Molecular Simulation Community
T2 - Proceedings of Science
Y1 - 2012
A1 - Gesing, Sandra
A1 - Herres-Pawlis, Sonja
A1 - Birkenheuer, Georg
A1 - Brinkmann, André
A1 - Grunzke, Richard
A1 - Kacsuk, Peter
A1 - Kohlbacher, Oliver
A1 - Kozlovszky, Miklos
A1 - Krüger, Jens
A1 - Müller-Pfefferkorn, Ralph
A1 - Schäfer, Patrick
A1 - Steinke, Thomas
JF - Proceedings of Science
ER -

TY - JOUR
T1 - A Single Sign-On Infrastructure for Science Gateways on a Use Case for Structural Bioinformatics
JF - Journal of Grid Computing
Y1 - 2012
A1 - Gesing, Sandra
A1 - Grunzke, Richard
A1 - Krüger, Jens
A1 - Birkenheuer, Georg
A1 - Wewior, Martin
A1 - Schäfer, Patrick
A1 - Schuller, Bernd
A1 - Schuster, Johannes
A1 - Herres-Pawlis, Sonja
A1 - Breuers, Sebastian
A1 - Balaskó, Ákos
A1 - Kozlovszky, Miklos
A1 - Fabri, AnnaSzikszay
A1 - Packschies, Lars
A1 - Kacsuk, Peter
A1 - Blunk, Dirk
A1 - Steinke, Thomas
A1 - Brinkmann, André
A1 - Fels, Gregor
A1 - Müller-Pfefferkorn, Ralph
A1 - Jäkel, René
A1 - Kohlbacher, Oliver
KW - DCIs
KW - Science gateway
KW - security
KW - Single sign-on
KW - Structural bioinformatics
VL - 10
UR - http://dx.doi.org/10.1007/s10723-012-9247-y
ER -

TY - CONF
T1 - Workflow-enhanced conformational analysis of guanidine zinc complexes via a science gateway
T2 - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences
Y1 - 2012
A1 - Herres-Pawlis, Sonja
A1 - Birkenheuer, Georg
A1 - Brinkmann, André
A1 - Gesing, Sandra
A1 - Grunzke, Richard
A1 - Jäkel, René
A1 - Kohlbacher, Oliver
A1 - Krüger, Jens
A1 - Dos Santos Vieira, Ines
JF - HealthGrid Applications and Technologies Meet Science Gateways for Life Sciences
PB - IOS Press
ER -

TY - CONF
T1 - Granular Security for a Science Gateway in Structural Bioinformatics
T2 - Proceedings of the International Workshop on Science Gateways for Life Sciences
(IWSG-Life 2011)
Y1 - 2011
A1 - Gesing, Sandra
A1 - Grunzke, Richard
A1 - Balaskó, Ákos
A1 - Birkenheuer, Georg
A1 - Blunk, Dirk
A1 - Breuers, Sebastian
A1 - Brinkmann, André
A1 - Fels, Gregor
A1 - Herres-Pawlis, Sonja
A1 - Kacsuk, Peter
A1 - Kozlovszky, Miklos
A1 - Krüger, Jens
A1 - Packschies, Lars
A1 - Schäfer, Patrick
A1 - Schuller, Bernd
A1 - Schuster, Johannes
A1 - Steinke, Thomas
A1 - Szikszay Fabri, Anna
A1 - Wewior, Martin
A1 - Müller-Pfefferkorn, Ralph
A1 - Kohlbacher, Oliver
JF - Proceedings of the International Workshop on Science Gateways for Life Sciences (IWSG-Life 2011)
PB - CEUR Workshop Proceedings
ER -

TY - JOUR
T1 - Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models
JF - Simulation Modelling Practice and Theory
Y1 - 2011
A1 - David A. Bacigalupo
A1 - van Hemert, Jano I.
A1 - Xiaoyu Chen
A1 - Asif Usmani
A1 - Adam P. Chester
A1 - Ligang He
A1 - Donna N. Dillenberger
A1 - Gary B. Wills
A1 - Lester Gilbert
A1 - Stephen A. Jarvis
KW - e-Science
AB - The automatic allocation of enterprise workload to resources can be enhanced by being able to make what–if response time predictions whilst different allocations are being considered. We experimentally investigate an historical and a layered queuing performance model and show how they can provide a good level of support for a dynamic-urgent cloud environment. Using this we define, implement and experimentally investigate the effectiveness of a prediction-based cloud workload and resource management algorithm. Based on these experimental analyses we: (i) comparatively evaluate the layered queuing and historical techniques; (ii) evaluate the effectiveness of the management algorithm in different operating scenarios; and (iii) provide guidance on using prediction-based workload and resource management.
VL - 19 ER - TY - CONF T1 - A Science Gateway for Molecular Simulations T2 - EGI User Forum 2011 Y1 - 2011 A1 - Gesing, Sandra A1 - Kacsuk, Peter A1 - Kozlovszky, Miklos A1 - Birkenheuer, Georg A1 - Blunk, Dirk A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Fels, Gregor A1 - Grunzke, Richard A1 - Herres-Pawlis, Sonja A1 - Krüger, Jens A1 - Packschies, Lars A1 - Müller-Pfefferkorn, Ralph A1 - Schäfer, Patrick A1 - Steinke, Thomas A1 - Szikszay Fabri, Anna A1 - Warzecha, Klaus A1 - Wewior, Martin A1 - Kohlbacher, Oliver JF - EGI User Forum 2011 SN - 978 90 816927 1 7 ER - TY - JOUR T1 - Special Issue: Portals for life sciences---Providing intuitive access to bioinformatic tools JF - Concurrency and Computation: Practice and Experience Y1 - 2011 A1 - Gesing, Sandra A1 - van Hemert, J. A1 - Kacsuk, P. A1 - Kohlbacher, O. AB - The topic ‘Portals for life sciences’ includes various research fields, on the one hand many different topics out of life sciences, e.g. mass spectrometry, on the other hand portal technologies and different aspects of computer science, such as usability of user interfaces and security of systems. The main aspect about portals is to simplify the user’s interaction with computational resources that are concerted to a supported application domain. PB - Wiley VL - 23 IS - 23 ER - TY - JOUR T1 - Comparing Clinical Decision Support Systems for Recruitment in Clinical Trials JF - Journal of Medical Informatics Y1 - 2010 A1 - Marc Cuggia A1 - Paolo Besana A1 - David Glasspool. 
ER - TY - Generic T1 - Federated Enactment of Workflow Patterns T2 - Lecture Notes in Computer Science Y1 - 2010 A1 - Yaikhom, Gagarine A1 - Liew, Chee A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Malcolm Atkinson A1 - Krause, Amy ED - D’Ambra, Pasqua ED - Guarracino, Mario ED - Talia, Domenico AB - In this paper we address two research questions concerning workflows: 1) how do we abstract and catalogue recurring workflow patterns?; and 2) how do we facilitate optimisation of the mapping from workflow patterns to actual resources at runtime? Our aim here is to explore techniques that are applicable to large-scale workflow compositions, where the resources could change dynamically during the lifetime of an application. We achieve this by introducing a registry-based mechanism where pattern abstractions are catalogued and stored. In conjunction with an enactment engine, which communicates with this registry, concrete computational implementations and resources are assigned to these patterns, conditional to the execution parameters. Using a data mining application from the life sciences, we demonstrate this new approach. JF - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6271 UR - http://dx.doi.org/10.1007/978-3-642-15277-1_31 N1 - 10.1007/978-3-642-15277-1_31 ER - TY - CONF T1 - Grid-Workflows in Molecular Science T2 - Software Engineering 2010, Grid Workflow Workshop Y1 - 2010 A1 - Birkenheuer, Georg A1 - Breuers, Sebastian A1 - Brinkmann, André A1 - Blunk, Dirk A1 - Fels, Gregor A1 - Gesing, Sandra A1 - Herres-Pawlis, Sonja A1 - Kohlbacher, Oliver A1 - Krüger, Jens A1 - Packschies, Lars JF - Software Engineering 2010, Grid Workflow Workshop PB - GI-Edition - Lecture Notes in Informatics (LNI) ER - TY - JOUR T1 - Integrating distributed data sources with OGSA-DAI DQP and Views JF - Philosophical Transactions A Y1 - 2010 A1 - Dobrzelecki, B. A1 - Krause, A. A1 - Hume, A. C. A1 - Grant, A. A1 - Antonioletti, M. A1 - Alemu, T. Y. 
A1 - Atkinson, M. A1 - Jackson, M. A1 - Theocharopoulos, E. AB - OGSA-DAI (Open Grid Services Architecture Data Access and Integration) is a framework for building distributed data access and integration systems. Until recently, it lacked the built-in functionality that would allow easy creation of federations of distributed data sources. The latest release of the OGSA-DAI framework introduced the OGSA-DAI DQP (Distributed Query Processing) resource. The new resource encapsulates a distributed query processor, that is able to orchestrate distributed data sources when answering declarative user queries. The query processor has many extensibility points, making it easy to customize. We have also introduced a new OGSA-DAI Views resource that provides a flexible method for defining views over relational data. The interoperability of the two new resources, together with the flexibility of the OGSA-DAI framework, allows the building of highly customized data integration solutions. VL - 368 ER - TY - CONF T1 - The MoSGrid Gaussian Portlet – Technologies for the Implementation of Portlets for Molecular Simulations T2 - Proceedings of the International Workshop on Science Gateways (IWSG10) Y1 - 2010 A1 - Wewior, Martin A1 - Packschies, Lars A1 - Blunk, Dirk A1 - Wickeroth, D. A1 - Warzecha, Klaus A1 - Herres-Pawlis, Sonja A1 - Gesing, Sandra A1 - Breuers, Sebastian A1 - Krüger, Jens A1 - Birkenheuer, Georg A1 - Lang, Ulrich ED - Barbera, Roberto ED - Andronico, Giuseppe ED - La Rocca, Giuseppe JF - Proceedings of the International Workshop on Science Gateways (IWSG10) PB - Consorzio COMETA ER - TY - CONF T1 - TOPP goes Rapid T2 - Cluster Computing and the Grid, IEEE International Symposium on Y1 - 2010 A1 - Gesing, Sandra A1 - van Hemert, Jano A1 - Jos Koetsier A1 - Bertsch, Andreas A1 - Kohlbacher, Oliver AB - Proteomics, the study of all the proteins contained in a particular sample, e.g., a cell, is a key technology in current biomedical research. 
The complexity and volume of proteomics data sets produced by mass spectrometric methods clearly suggests the use of grid-based high-performance computing for analysis. TOPP and OpenMS are open-source packages for proteomics data analysis; however, they do not provide support for Grid computing. In this work we present a portal interface for high-throughput data analysis with TOPP. The portal is based on Rapid, a tool for efficiently generating standardized portlets for a wide range of applications. The web-based interface allows the creation and editing of user-defined pipelines and their execution and monitoring on a Grid infrastructure. The portal also supports several file transfer protocols for data staging. It thus provides a simple and complete solution to high-throughput proteomics data analysis for inexperienced users through a convenient portal interface. JF - Cluster Computing and the Grid, IEEE International Symposium on PB - IEEE Computer Society CY - Los Alamitos, CA, USA SN - 978-0-7695-4039-9 ER - TY - CONF T1 - Workflow Interoperability in a Grid Portal for Molecular Simulations T2 - Proceedings of the International Workshop on Science Gateways (IWSG10) Y1 - 2010 A1 - Gesing, Sandra A1 - Marton, Istvan A1 - Birkenheuer, Georg A1 - Schuller, Bernd A1 - Grunzke, Richard A1 - Krüger, Jens A1 - Breuers, Sebastian A1 - Blunk, Dirk A1 - Fels, Gregor A1 - Packschies, Lars A1 - Brinkmann, André A1 - Kohlbacher, Oliver A1 - Kozlovszky, Miklos JF - Proceedings of the International Workshop on Science Gateways (IWSG10) PB - Consorzio COMETA ER - TY - CONF T1 - A model of social collaboration in Molecular Biology knowledge bases T2 - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) Y1 - 2009 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - Manual annotation of biological data cannot keep up with data production. 
Open annotation models using wikis have been proposed to address this problem. In this empirical study we analyse 36 years of knowledge collection by 738 authors in two Molecular Biology wikis (EcoliWiki and WikiPathways) and two knowledge bases (OMIM and Reactome). We first investigate authorship metrics (authors per entry and edits per author) which are power-law distributed in Wikipedia and we find they are heavy-tailed in these four systems too. We also find surprising similarities between the open (editing open to everyone) and the closed systems (expert curators only). Secondly, to discriminate between driving forces in the measured distributions, we simulate the curation process and find that knowledge overlap among authors can drive the number of authors per entry, while the time the users spend on the knowledge base can drive the number of contributions per author. JF - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) PB - European Social Simulation Association ER - TY - JOUR T1 - An Open Grid Services Architecture Primer JF - Computer Y1 - 2009 A1 - Grimshaw, Andrew A1 - Morgan, Mark A1 - Merrill, Duane A1 - Kishimoto, Hiro A1 - Savva, Andreas A1 - Snelling, David A1 - Smith, Chris A1 - Dave Berry PB - IEEE Computer Society Press CY - Los Alamitos, CA, USA VL - 42 ER - TY - CONF T1 - Portals for Life Sciences—a Brief Introduction T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - Kohlbacher, O. A1 - van Hemert, J. I. AB - The topic ‘Portals for Life Sciences’ includes various research fields, on the one hand many different topics out of life sciences, e.g. mass spectrometry, on the other hand portal technologies and different aspects of computer science, such as usability of user interfaces and security of systems. 
The main aspect about portals is to simplify the user’s interaction with computational resources which are concerted to a supported application domain. JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings UR - http://ceur-ws.org/Vol-513/paper01.pdf ER - TY - Generic T1 - Proceedings of the 1st International Workshop on Portals for Life Sciences T2 - IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - van Hemert, Jano I. JF - IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings CY - e-Science Institute, Edinburgh, UK UR - http://ceur-ws.org/Vol-513 ER - TY - CONF T1 - Rapid development of computational science portals T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Koetsier, J. A1 - van Hemert, J. I. ED - Gesing, S. ED - van Hemert, J. I. KW - portal JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings PB - e-Science Institute CY - Edinburgh UR - http://ceur-ws.org/Vol-513/paper05.pdf ER - TY - JOUR T1 - Simultaneous alignment of short reads against multiple genomes JF - Genome Biol Y1 - 2009 A1 - Schneeberger, Korbinian A1 - Hagmann, Jörg A1 - Ossowski, Stephan A1 - Warthmann, Norman A1 - Gesing, Sandra A1 - Kohlbacher, Oliver A1 - Weigel, Detlef VL - 10 UR - http://www.biomedsearch.com/nih/Simultaneous-alignment-short-reads-against/19761611.html ER - TY - JOUR T1 - A Strategy for Research and Innovation in the Century of Information JF - Prometheus Y1 - 2009 A1 - e-Science Directors’ Forum Strategy Working Group A1 - Atkinson, M. A1 - Britton, D. A1 - Coveney, P. A1 - De Roure, D A1 - Garnett, N. A1 - Geddes, N. A1 - Gurney, R. A1 - Haines, K. A1 - Hughes, L. A1 - Ingram, D. A1 - Jeffreys, P. A1 - Lyon, L. A1 - Osborne, I. A1 - Perrott, P. A1 - Procter, R. A1 - Rusbridge, C. 
AB - More data will be produced in the next five years than in the entire history of human kind, a digital deluge that marks the beginning of the Century of Information. Through a year‐long consultation with UK researchers, a coherent strategy has been developed, which will nurture Century‐of‐Information Research (CIR); it crystallises the ideas developed by the e‐Science Directors’ Forum Strategy Working Group. This paper is an abridged version of their latest report which can be found at: http://wikis.nesc.ac.uk/escienvoy/Century_of_Information_Research_Strategy which also records the consultation process and the affiliations of the authors. This document is derived from a paper presented at the Oxford e‐Research Conference 2008 and takes into account suggestions made in the ensuing panel discussion. The goals of the CIR Strategy are to facilitate the growth of UK research and innovation that is data and computationally intensive and to develop a new culture of ‘digital‐systems judgement’ that will equip research communities, businesses, government and society as a whole, with the skills essential to compete and prosper in the Century of Information. The CIR Strategy identifies a national requirement for a balanced programme of coordination, research, infrastructure, translational investment and education to empower UK researchers, industry, government and society. The Strategy is designed to deliver an environment which meets the needs of UK researchers so that they can respond agilely to challenges, can create knowledge and skills, and can lead new kinds of research. It is a call to action for those engaged in research, those providing data and computational facilities, those governing research and those shaping education policies. The ultimate aim is to help researchers strengthen the international competitiveness of the UK research base and increase its contribution to the economy. 
The objectives of the Strategy are to better enable UK researchers across all disciplines to contribute world‐leading fundamental research; to accelerate the translation of research into practice; and to develop improved capabilities, facilities and context for research and innovation. It envisages a culture that is better able to grasp the opportunities provided by the growing wealth of digital information. Computing has, of course, already become a fundamental tool in all research disciplines. The UK e‐Science programme (2001–06)—since emulated internationally—pioneered the invention and use of new research methods, and a new wave of innovations in digital‐information technologies which have enabled them. The Strategy argues that the UK must now harness and leverage its own, plus the now global, investment in digital‐information technology in order to spread the benefits as widely as possible in research, education, industry and government. Implementing the Strategy would deliver the computational infrastructure and its benefits as envisaged in the Science & Innovation Investment Framework 2004–2014 (July 2004), and in the reports developing those proposals. To achieve this, the Strategy proposes the following actions: 1. support the continuous innovation of digital‐information research methods; 2. provide easily used, pervasive and sustained e‐Infrastructure for all research; 3. enlarge the productive research community which exploits the new methods efficiently; 4. generate capacity, propagate knowledge and develop skills via new curricula; and 5. develop coordination mechanisms to improve the opportunities for interdisciplinary research and to make digital‐infrastructure provision more cost effective. To gain the best value for money strategic coordination is required across a broad spectrum of stakeholders. 
A coherent strategy is essential in order to establish and sustain the UK as an international leader of well‐curated national data assets and computational infrastructure, which is expertly used to shape policy, support decisions, empower researchers and to roll out the results to the wider benefit of society. The value of data as a foundation for wellbeing and a sustainable society must be appreciated; national resources must be more wisely directed to the collection, curation, discovery, widening access, analysis and exploitation of these data. Every researcher must be able to draw on skills, tools and computational resources to develop insights, test hypotheses and translate inventions into productive use, or to extract knowledge in support of governmental decision making. This foundation plus the skills developed will launch significant advances in research, in business, in professional practice and in government with many consequent benefits for UK citizens. The Strategy presented here addresses these complex and interlocking requirements. VL - 27 ER - TY - JOUR T1 - Distributed Computing Education, Part 4: Training Infrastructure JF - Distributed Systems Online Y1 - 2008 A1 - Fergusson, D. A1 - Barbera, R. A1 - Giorgio, E. A1 - Fargetta, M. A1 - Sipos, G. A1 - Romano, D. A1 - Atkinson, M. A1 - Vander Meer, E. AB - In the first article of this series (see http://doi.ieeecomputersociety.org/10.1109/MDSO.2008.16), we identified the need for teaching environments that provide infrastructure to support education and training in distributed computing. Training infrastructure, or t-infrastructure, is analogous to the teaching laboratory in biology and is a vital tool for educators and students. In practice, t-infrastructure includes the computing equipment, digital communications, software, data, and support staff necessary to teach a course. 
The International Summer Schools in Grid Computing (ISSGC) series and the first International Winter School on Grid Computing (IWSGC 08) used the Grid INFN Laboratory of Dissemination Activities (GILDA) infrastructure so students could gain hands-on experience with middleware. Here, we describe GILDA, related summer and winter school experiences, multimiddleware integration, t-infrastructure, and academic courses, concluding with an analysis and recommendations. PB - IEEE Computer Society VL - 9 UR - http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4752926 IS - 10 ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2008 A1 - Di Chio, Cecilia A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Di Chio, Cecilia ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. ER - TY - JOUR T1 - A Grid infrastructure for parallel and interactive applications JF - Computing and Informatics Y1 - 2008 A1 - Gomes, J. A1 - Borges, B. A1 - Montecelo, M. A1 - David, M. A1 - Silva, B. A1 - Dias, N. A1 - Martins, JP A1 - Fernandez, C. A1 - Garcia-Tarres, L. A1 - Veiga, C. A1 - Cordero, D. A1 - Lopez, J. A1 - J Marco A1 - Campos, I. A1 - Rodríguez, David A1 - Marco, R. A1 - Lopez, A. A1 - Orviz, P. A1 - Hammad, A. VL - 27 IS - 2 ER - TY - JOUR T1 - The interactive European Grid: Project objectives and achievements JF - Computing and Informatics Y1 - 2008 A1 - J Marco A1 - Campos, I. 
A1 - Coterillo, I. A1 - Diaz, I. A1 - Lopez, A. A1 - Marco, R. A1 - Martinez-Rivero, C. A1 - Orviz, P. A1 - Rodríguez, David A1 - Gomes, J. A1 - Borges, G. A1 - Montecelo, M. A1 - David, M. A1 - Silva, B. A1 - Dias, N. A1 - Martins, JP A1 - Fernandez, C. A1 - Garcia-Tarres, L. VL - 27 IS - 2 ER - TY - CONF T1 - OGSA-DAI: Middleware for Data Integration: Selected Applications T2 - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience Y1 - 2008 A1 - Grant, Alistair A1 - Antonioletti, Mario A1 - Hume, Alastair C. A1 - Krause, Amy A1 - Dobrzelecki, Bartosz A1 - Jackson, Michael J. A1 - Parsons, Mark A1 - Atkinson, Malcolm P. A1 - Theocharopoulos, Elias JF - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience PB - IEEE Computer Society CY - Washington, DC, USA SN - 978-0-7695-3535-7 ER - TY - CONF T1 - WikiSim: simulating knowledge collection and curation in structured wikis. T2 - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal Y1 - 2008 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - The aim of this work is to model quantitatively one of the main properties of wikis: how high quality knowledge can emerge from the individual work of independent volunteers. The approach chosen is to simulate knowledge collection and curation in wikis. The basic model represents the wiki as a set of true/false values, added and edited at each simulation round by software agents (users) following a fixed set of rules. The resulting WikiSim simulations already manage to reach distributions of edits and user contributions very close to those reported for Wikipedia. WikiSim can also span conditions not easily measurable in real-life wikis, such as the impact of various amounts of user mistakes. 
WikiSim could be extended to model wiki software features, such as discussion pages and watch lists, while monitoring the impact they have on user actions and consensus, and their effect on knowledge quality. The method could also be used to compare wikis with other curation scenarios based on centralised editing by experts. The future challenges for WikiSim will be to find appropriate ways to evaluate and validate the models and to keep them simple while still capturing relevant properties of wiki systems. JF - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal PB - ACM CY - New York, NY, USA ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2007 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Mario Giacobini ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. CY - Valencia, Spain ER - TY - CONF T1 - Interaction as a Grounding for Peer to Peer Knowledge Sharing T2 - Advances in Web Semantics Y1 - 2007 A1 - Robertson, D. A1 - Walton, C. A1 - Barker, A. A1 - Besana, P. A1 - Chen-Burger, Y. A1 - Hassan, F. A1 - Lambert, D. A1 - Li, G. A1 - McGinnis, J A1 - Osman, N. A1 - Bundy, A. A1 - McNeill, F. A1 - van Harmelen, F. A1 - Sierra, C. A1 - Giunchiglia, F. 
JF - Advances in Web Semantics PB - LNCS-IFIP VL - 1 ER - TY - JOUR T1 - Mining co-regulated gene profiles for the detection of functional associations in gene expression data JF - Bioinformatics Y1 - 2007 A1 - Gyenesei, Attila A1 - Wagner, Ulrich A1 - Barkow-Oesterreicher, Simon A1 - Stolte, Etzard A1 - Schlapbach, Ralph VL - 23 ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2006 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. CY - Budapest, Hungary ER - TY - CONF T1 - Improving Graph Colouring Algorithms and Heuristics Using a Novel Representation T2 - Springer Lecture Notes on Computer Science Y1 - 2006 A1 - Juhos, I. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. Raidl KW - constraint satisfaction KW - graph colouring AB - We introduce a novel representation for the graph colouring problem, called the Integer Merge Model, which aims to reduce the time complexity of an algorithm. Moreover, our model provides useful information for guiding heuristics as well as a compact description for algorithms. To verify the potential of the model, we use it in dsatur, in an evolutionary algorithm, and in the same evolutionary algorithm extended with heuristics. 
An empirical investigation is performed to show an increase in efficiency on two problem suites, a set of practical problem instances and a set of hard problem instances from the phase transition. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag ER - TY - CONF T1 - Neighborhood Searches for the Bounded Diameter Minimum Spanning Tree Problem Embedded in a VNS, EA, and ACO T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) Y1 - 2006 A1 - Gruber, M. A1 - van Hemert, J. I. A1 - Raidl, G. R. ED - Maarten Keijzer ED - et al KW - constraint satisfaction KW - evolutionary computation KW - variable neighbourhood search AB - We consider the Bounded Diameter Minimum Spanning Tree problem and describe four neighbourhood searches for it. They are used as local improvement strategies within a variable neighbourhood search (VNS), an evolutionary algorithm (EA) utilising a new encoding of solutions, and an ant colony optimisation (ACO). We compare the performance in terms of effectiveness between these three hybrid methods on a suite of popular benchmark instances, which contains instances too large to solve by current exact methods. Our results show that the EA and the ACO outperform the VNS on almost all used benchmark instances. Furthermore, the ACO yields most of the time better solutions than the EA in long-term runs, whereas the EA dominates when the computation time is strongly restricted. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) PB - ACM CY - Seattle, USA VL - 2 ER - TY - CONF T1 - The Digital Curation Centre: a vision for digital curation T2 - 2005 IEEE International Symposium on Mass Storage Systems and Technology Y1 - 2005 A1 - Rusbridge, C. A1 - P. Burnhill A1 - S. Ross A1 - P. Buneman A1 - D. Giaretta A1 - Lyon, L. A1 - Atkinson, M. 
AB - We describe the aims and aspirations for the Digital Curation Centre (DCC), the UK response to the realisation that digital information is both essential and fragile. We recognise the equivalence of preservation as "interoperability with the future", asserting that digital curation is concerned with "communication across time". We see the DCC as having relevance for present day data curation and for continuing data access for generations to come. We describe the structure and plans of the DCC, designed to support these aspirations and based on a view of world class research being developed into curation services, all of which are underpinned by outreach to the broadest community. JF - 2005 IEEE International Symposium on Mass Storage Systems and Technology PB - IEEE Computer Society CY - Sardinia, Italy SN - 0-7803-9228-0 ER - TY - Generic T1 - Experience with the international testbed in the crossgrid project T2 - Advances in Grid Computing-EGC 2005 Y1 - 2005 A1 - Gomes, J. A1 - David, M. A1 - Martins, J. A1 - Bernardo, L. A1 - A García A1 - Hardt, M. A1 - Kornmayer, H. A1 - Marco, Jesus A1 - Marco, Rafael A1 - Rodríguez, David A1 - Diaz, Irma A1 - Cano, Daniel A1 - Salt, J. A1 - Gonzalez, S. A1 - J Sánchez A1 - Fassi, F. A1 - Lara, V. A1 - Nyczyk, P. A1 - Lason, P. A1 - Ozieblo, A. A1 - Wolniewicz, P. A1 - Bluj, M. A1 - K Nawrocki A1 - A Padee A1 - W Wislicki ED - Peter M. A. Sloot, Alfons G. Hoekstra, Thierry Priol, Alexander Reinefeld ED - Marian Bubak JF - Advances in Grid Computing-EGC 2005 T3 - LNCS PB - Springer Berlin/Heidelberg CY - Amsterdam VL - 3470 ER - TY - CONF T1 - Heuristic Colour Assignment Strategies for Merge Models in Graph Colouring T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I. ED - G. Raidl ED - J. 
Gottlieb KW - constraint satisfaction KW - graph colouring AB - In this paper, we combine a powerful representation for graph colouring problems with different heuristic strategies for colour assignment. Our novel strategies employ heuristics that exploit information about the partial colouring in an aim to improve performance. An evolutionary algorithm is used to drive the search. We compare the different strategies to each other on several very hard benchmarks and on generated problem instances, and show where the novel strategies improve the efficiency. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Organization of the International Testbed of the CrossGrid Project T2 - Cracow Grid Workshop 2005 Y1 - 2005 A1 - Gomes, J. A1 - David, M. A1 - Martins, J. A1 - Bernardo, L. A1 - Garcia, A. A1 - Hardt, M. A1 - Kornmayer, H. A1 - Marco, Rafael A1 - Rodríguez, David A1 - Diaz, Irma A1 - Cano, Daniel A1 - Salt, J. A1 - Gonzalez, S. A1 - Sanchez, J. A1 - Fassi, F. A1 - Lara, V. A1 - Nyczyk, P. A1 - Lason, P. A1 - Ozieblo, A. A1 - Wolniewicz, P. A1 - Bluj, M. JF - Cracow Grid Workshop 2005 ER - TY - CONF T1 - Property analysis of symmetric travelling salesman problem instances acquired through evolution T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - van Hemert, J. I. ED - G. Raidl ED - J. Gottlieb KW - problem evolving KW - travelling salesman AB - We show how an evolutionary algorithm can successfully be used to evolve a set of difficult to solve symmetric travelling salesman problem instances for two variants of the Lin-Kernighan algorithm. Then we analyse the instances in those sets to guide us towards deferring general knowledge about the efficiency of the two variants in relation to structural properties of the symmetric travelling salesman problem. 
JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - JOUR T1 - Specifying use case behavior with interaction models JF - Journal of Object Technology Y1 - 2005 A1 - José Daniel Garcia A1 - Jesús Carretero A1 - José Maria Pérez A1 - Félix García Carballeira A1 - Rosa Filgueira VL - 4 ER - TY - CONF T1 - Binary Merge Model Representation of the Graph Colouring Problem T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. Raidl KW - constraint satisfaction KW - graph colouring AB - This paper describes a novel representation and ordering model that, aided by an evolutionary algorithm, is used in solving the graph k-colouring problem. Its strength lies in reducing the search space by breaking symmetry. An empirical comparison is made with two other algorithms on a standard suite of problem instances and on a suite of instances in the phase transition where it shows promising results. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - Generic T1 - Grid Services Supporting the Usage of Secure Federated, Distributed Biomedical Data T2 - All Hands Meeting 2004 Y1 - 2004 A1 - Richard Sinnott A1 - Malcolm Atkinson A1 - Micha Bayer A1 - Dave Berry A1 - Anna Dominiczak A1 - Magnus Ferrier A1 - David Gilbert A1 - Neil Hanlon A1 - Derek Houghton A1 - Hunt, Ela A1 - David White AB - The BRIDGES project is a UK e-Science project that provides grid based support for biomedical research into the genetics of hypertension – the Cardiovascular Functional Genomics Project (CFG). Its main goal is to provide an effective environment for CFG, and biomedical research in general, including access to integrated data, analysis and visualization, with appropriate authorisation and privacy, as well as grid based computational tools and resources. 
It also aims to provide an improved understanding of the requirements of academic biomedical research virtual organizations and to evaluate the utility of existing data federation tools. JF - All Hands Meeting 2004 CY - Nottingham, UK UR - http://www.allhands.org.uk/2004/proceedings/papers/87.pdf ER - TY - CONF T1 - A Study into Ant Colony Optimization, Evolutionary Computation and Constraint Programming on Binary Constraint Satisfaction Problems T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - van Hemert, J. I. A1 - Solnon, C. ED - J. Gottlieb ED - G. Raidl KW - ant colony optimisation KW - constraint programming KW - constraint satisfaction KW - evolutionary computation AB - We compare two heuristic approaches, evolutionary computation and ant colony optimisation, and a complete tree-search approach, constraint programming, for solving binary constraint satisfaction problems. We experimentally show that, if evolutionary computation is far from being able to compete with the two other approaches, ant colony optimisation nearly always succeeds in finding a solution, so that it can actually compete with constraint programming. The resampling ratio is used to provide insight into heuristic algorithms performances. Regarding efficiency, we show that if constraint programming is the fastest when instances have a low number of variables, ant colony optimisation becomes faster when increasing the number of variables. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - RPRT T1 - Computer Challenges to emerge from e-Science. Y1 - 2003 A1 - Atkinson, M. A1 - Crowcroft, J. A1 - Goble, C. A1 - Gurd, J. A1 - Rodden, T. A1 - Shadbolt, N. A1 - Sloman, M. A1 - Sommerville, I. A1 - Storey, T. AB - The UK e-Science programme has initiated significant developments that allow networked grid technology to be used to form virtual collaboratories. 
The e-Science vision of a globally connected community has broader application than science with the same fundamental technologies being used to support eCommerce and e-Government. The broadest vision of e-Science outlines a challenging research agenda for the computing community. New theories and models will be needed to provide a sound foundation for the tools used to specify, design, analyse and prove the properties of future grid technologies and applications. Fundamental research is needed in order to build a future e-Science infrastructure and to understand how to exploit the infrastructure to best effect. A future infrastructure needs to be dynamic, universally available and promote trust. Realising this infrastructure will need new theories, methods and techniques to be developed and deployed. Although often not directly visible these fundamental infrastructure advances will provide the foundation for future scientific advancement, wealth generation and governance. • We need to move from the current data focus to a semantic grid with facilities for the generation, support and traceability of knowledge. • We need to make the infrastructure more available and more trusted by developing trusted ubiquitous systems. • We need to reduce the cost of development by enabling the rapid customised assembly of services. • We need to reduce the cost and complexity of managing the infrastructure by realising autonomic computing systems. JF - EPSRC ER - TY - RPRT T1 - Grid Database Access and Integration: Requirements and Functionalities Y1 - 2003 A1 - Atkinson, M. P. A1 - Dialani, V. A1 - Guy, L. A1 - Narang, I. A1 - Paton, N. W. A1 - Pearson, D. A1 - Storey, T. A1 - Watson, P. AB - This document is intended to provide the context for developing Grid data service standard recommendations within the Global Grid Forum. It defines the generic requirements for accessing and integrating persistent structured and semi-structured data. 
In addition, it defines the generic functionalities which a Grid data service needs to provide in supporting discovery of and controlled access to data, in performing data manipulation operations, and in virtualising data resources. The document also defines the scope of Grid data service standard recommendations which are presented in a separate document. JF - Global Grid Forum ER - TY - JOUR T1 - The pervasiveness of evolution in GRUMPS software JF - Softw., Pract. Exper. Y1 - 2003 A1 - Evans, Huw A1 - Atkinson, Malcolm P. A1 - Brown, Margaret A1 - Cargill, Julie A1 - Crease, Murray A1 - Draper, Steve A1 - Gray, Philip D. A1 - Thomas, Richard VL - 33 ER - TY - CONF T1 - Comparing Classical Methods for Solving Binary Constraint Satisfaction Problems with State of the Art Evolutionary Computation T2 - Springer Lecture Notes on Computer Science Y1 - 2002 A1 - van Hemert, J. I. ED - S. Cagnoni ED - J. Gottlieb ED - E. Hart ED - M. Middendorf ED - G. Raidl KW - constraint satisfaction AB - Constraint Satisfaction Problems form a class of problems that are generally computationally difficult and have been addressed with many complete and heuristic algorithms. We present two complete algorithms, as well as two evolutionary algorithms, and compare them on randomly generated instances of binary constraint satisfaction problems. We find that the evolutionary algorithms are less effective than the classical techniques. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - JOUR T1 - A new multifractal network traffic model JF - Journal of Chaos, solitons & fractals Y1 - 2002 A1 - Liangxiu Han A1 - Zhiwei Ceng A1 - Chuanshan Gao PB - Elsevier Science VL - 13 IS - 7 ER - TY - CONF T1 - Use of Evolutionary Algorithms for Telescope Scheduling T2 - Integrated Modeling of Telescopes Y1 - 2002 A1 - Grim, R. A1 - Jansen, M. L. M. A1 - Baan, A. A1 - van Hemert, J. I. A1 - de Wolf, H. 
ED - Torben Anderson KW - constraint satisfaction KW - scheduling AB - LOFAR, a new radio telescope, will be designed to observe with up to 8 independent beams, thus allowing several simultaneous observations. Scheduling of multiple observations parallel in time, each having their own constraints, requires a more intelligent and flexible scheduling function than operated before. In support of the LOFAR radio telescope project, and in co-operation with Leiden University, Fokker Space has started a study to investigate the suitability of the use of evolutionary algorithms applied to complex scheduling problems. After a positive familiarisation phase, we now examine the potential use of evolutionary algorithms via a demonstration project. Results of the familiarisation phase, and the first results of the demonstration project are presented in this paper. JF - Integrated Modeling of Telescopes PB - The International Society for Optical Engineering (SPIE) VL - 4757 ER - TY - CONF T1 - An Engineering Approach to Evolutionary Art T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) Y1 - 2001 A1 - van Hemert, J. I. A1 - Jansen, M. L. M. ED - Lee Spector ED - Erik D. Goodman ED - Annie Wu ED - W. B. Langdon ED - Hans-Michael Voigt ED - Mitsuo Gen ED - Sandip Sen ED - Marco Dorigo ED - Shahram Pezeshk ED - Max H. Garzon ED - Edmund Burke KW - evolutionary art AB - We present a general system that evolves art on the Internet. The system runs on a server which enables it to collect information about its usage world wide; its core uses operators and representations from genetic programming. We show two types of art that can be evolved using this general system. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) PB - Morgan Kaufmann Publishers, San Francisco ER - TY - BOOK T1 - GRUMPS Summer Anthology, 2001 Y1 - 2001 A1 - Atkinson, M. A1 - Brown, M. A1 - Cargill, J. A1 - Crease, M. A1 - Draper, S. 
A1 - Evans, H. A1 - Gray, P. A1 - Mitchell, C. A1 - Ritchie, M. A1 - Thomas, R. AB - This is the first collection of papers from GRUMPS [http://grumps.dcs.gla.ac.uk]. The project only started up in February 2001, and this collection (frozen at 1 Sept 2001) shows that it got off to a productive start. Versions of some of these papers have been submitted to conferences and workshops: the website will have more information on publication status and history. GRUMPS decided to begin with a first study, partly to help the team coalesce. This involved installing two pieces of software in a first year computing science lab: one (the "UAR") to record a large volume of student actions at a low level with a view to mining them later, another (the "LSS") directly designed to assist tutor-student interaction. Some of the papers derive from that, although more are planned. Results from this first study can be found on the website. The project also has a link to UWA in Perth, Western Australia, where related software has already been developed and used as described in one of the papers. Another project strand concerns using handsets in lecture theatres to support interactivity there, as two other papers describe. As yet unrepresented in this collection, GRUMPS will also be entering the bioinformatics application area. The GRUMPS project operates on several levels. It is based in the field of Distributed Information Management (DIM), expecting to cover both mobile and static nodes, synchronous and detached clients, high and low volume data sources. The specific focus of the project (see the original proposal on the web site) is to address records of computational activity (where any such pre-existing usage might have extra record collection installed) and data experimentation, where the questions to be asked of the data emerge concurrently with data collection which will therefore be dynamically modifiable: a requirement that further pushes on the space of DIM. 
The level above concerns building and making usable tools for asking questions of the data, or rather of the activities that generate the data. Above that again is the application domain level: what the original computational activities serve, education and bioinformatics being two identified cases. The GRUMPS team is therefore multidisciplinary, from DIM architecture researchers to educational evaluators. The mix of papers reflects this. PB - Academic Press ER - TY - CONF T1 - A new multifractal traffic model based on the wavelet transform T2 - ISCA 14th International Conference on Parallel and Distributed Computing systems Y1 - 2001 A1 - Chuanshan Gao A1 - Liangxiu Han JF - ISCA 14th International Conference on Parallel and Distributed Computing systems CY - Texas, USA ER - TY - CHAP T1 - Persistence and Java — A Balancing Act T2 - Objects and Databases Y1 - 2001 A1 - Atkinson, M. ED - Klaus Dittrich ED - Giovanna Guerrini ED - Isabella Merlo ED - Marta Oliva ED - M. Elena Rodriguez AB - Large scale and long-lived application systems, enterprise applications, require persistence, that is provision of storage for many of their data structures. The Java™ programming language is a typical example of a strongly-typed, object-oriented programming language that is becoming popular for building enterprise applications. It therefore needs persistence. The present options for obtaining this persistence are reviewed. We conclude that the Orthogonal Persistence Hypothesis, OPH, is still persuasive. It states that the universal and automated provision of longevity or brevity for all data will significantly enhance developer productivity and improve applications. This position paper reports on the PJama project with particular reference to its test of the OPH. We review why orthogonal persistence has not been taken up widely, and why the OPH is still incompletely tested. 
This leads to a more general challenge of how to conduct experiments which reveal large-scale and long-term effects and some thoughts on how that challenge might be addressed by the software research community. JF - Objects and Databases T3 - Lecture Notes in Computer Science PB - Springer VL - 1944 UR - http://www.springerlink.com/content/8t7x3m1ehtdqk4bm/?p=7ece1338fff3480b83520df395784cc6&pi=0 ER - TY - CONF T1 - Measurement and analysis of IP network traffic T2 - In Proceedings of the 3rd International Asia-Pacific Web Conference Y1 - 2000 A1 - Cen, Z A1 - Gao, C A1 - Cong S A1 - Han, L JF - In Proceedings of the 3rd International Asia-Pacific Web Conference CY - Xi'an, China ER - TY - CONF T1 - Comparing genetic programming variants for data classification T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - E. Postma ED - M. Gyssens KW - classification KW - data mining KW - genetic programming AB - This article is a combined summary of two papers written by the authors. Binary data classification problems (with exactly two disjoint classes) form an important application area of machine learning techniques, in particular genetic programming (GP). In this study we compare a number of different variants of GP applied to such problems whereby we investigate the effect of two significant changes in a fixed GP setup in combination with two different evolutionary models. JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - Mondriaan Art by Evolution T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - van Hemert, J. I. A1 - Eiben, A. E. ED - E. Postma ED - M. 
Gyssens KW - evolutionary art AB - Here we show an application that generates images resembling art as it was produced by Mondriaan, a Dutch artist, well known for his minimalistic and pure abstract pieces of art. The current version generates images using a linear chromosome and a recursive function as a decoder. JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - Population dynamics and emerging features in AEGIS T2 - Proceedings of the Genetic and Evolutionary Computation Conference Y1 - 1999 A1 - Eiben, A. E. A1 - Elia, D. A1 - van Hemert, J. I. ED - W. Banzhaf ED - J. Daida ED - Eiben, A. E. ED - M. H. Garzon ED - V. Honavar ED - M. Jakiela ED - R. E. Smith KW - dynamic problems AB - We describe an empirical investigation within an artificial world, aegis, where a population of animals and plants is evolving. We compare different system setups in search of an `ideal' world that allows a constantly high number of inhabitants for a long period of time. We observe that high responsiveness at individual level (speed of movement) or population level (high fertility) are `ideal'. Furthermore, we investigate the emergence of the so-called mental features of animals determining their social, consumptional and aggressive behaviour. The tests show that being socially oriented is generally advantageous, while aggressive behaviour only emerges under specific circumstances. JF - Proceedings of the Genetic and Evolutionary Computation Conference PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CHAP T1 - SAW-ing EAs: adapting the fitness function for solving constrained problems T2 - New ideas in optimization Y1 - 1999 A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - D. Corne ED - M. Dorigo ED - F. Glover KW - constraint satisfaction AB - In this chapter we describe a problem-independent method for treating constraints in an evolutionary algorithm. 
Technically, this method amounts to changing the definition of the fitness function during a run of an EA, based on feedback from the search process. Obviously, redefining the fitness function means redefining the problem to be solved. In the short term this deceives the algorithm, making the fitness values deteriorate, but as experiments clearly indicate, in the long run it is beneficial. We illustrate the power of the method on different constraint satisfaction problems and point out other application areas of this technique. JF - New ideas in optimization PB - McGraw-Hill, London ER -