TY - CHAP T1 - Evolutionary Computation and Constraint Satisfaction Y1 - 2015 A1 - van Hemert, J. ED - Kacprzyk, J. ED - Pedrycz, W. KW - constraint satisfaction KW - evolutionary computation AB - In this chapter we focus on the combination of evolutionary computation techniques and constraint satisfaction problems. Constraint Programming (CP) is another approach to dealing with constraint satisfaction problems. In fact, it is an important prelude to the work covered here, as it advocates itself as an alternative approach to programming (Apt). The first step is to formulate a problem as a CSP such that techniques from CP, EC, combinations of the two (cf. Hybrid) or other approaches can be deployed to solve the problem. The formulation of a problem has an impact on its complexity in terms of the effort required to either find a solution or prove that no solution exists. It is therefore vital to spend time on getting this right. The main differences between CP and EC are as follows. CP defines search as iterative steps over a search tree, where nodes are partial solutions in which not all variables have been assigned values. The search maintains a partial solution that satisfies all constraints over the variables assigned so far. In EC, by contrast, solvers most often sample a space of candidate solutions in which all variables are assigned values. None of these candidate solutions will satisfy all constraints in the problem until a solution is found. Another major difference is that many constraint solvers from CP are sound, whereas EC solvers are not. A solver is sound if it always finds a solution whenever one exists. PB - Springer ER - TY - JOUR T1 - Precise montaging and metric quantification of retinal surface area from ultra-widefield fundus photography and fluorescein angiography JF - Ophthalmic Surg Lasers Imaging Retina Y1 - 2014 A1 - Croft, D.E. A1 - van Hemert, J. A1 - Wykoff, C.C. A1 - Clifton, D. A1 - Verhoek, M. A1 - Fleming, A. A1 - Brown, D.M. KW - medical KW - retinal imaging AB - BACKGROUND AND OBJECTIVE: Accurate quantification of retinal surface area from ultra-widefield (UWF) images is challenging due to warping produced when the retina is projected onto a two-dimensional plane for analysis. By accounting for this, the authors sought to precisely montage and accurately quantify retinal surface area in square millimeters. PATIENTS AND METHODS: Montages were created using Optos 200Tx (Optos, Dunfermline, U.K.) images taken at different gaze angles. A transformation projected the images to their correct location on a three-dimensional model. Area was quantified with spherical trigonometry. Warping, precision, and accuracy were assessed. RESULTS: Uncorrected, posterior pixels represented up to 79% greater surface area than peripheral pixels. Assessing precision, a standard region was quantified across 10 montages of the same eye (RSD: 0.7%; mean: 408.97 mm(2); range: 405.34-413.87 mm(2)). Assessing accuracy, 50 patients' disc areas were quantified (mean: 2.21 mm(2); SE: 0.06 mm(2)), and the results fell within the normative range. CONCLUSION: By accounting for warping inherent in UWF images, precise montaging and accurate quantification of retinal surface area in square millimeters were achieved. [Ophthalmic Surg Lasers Imaging Retina. 2014;45:312-317.]. VL - 45 ER - TY - JOUR T1 - Quantification of Ultra-Widefield Retinal Images JF - Retina Today Y1 - 2014 A1 - D.E. Croft A1 - C.C. Wykoff A1 - D.M. Brown A1 - van Hemert, J. A1 - M.
Verhoek KW - medical KW - retinal imaging AB - Advances in imaging periodically lead to dramatic changes in the diagnosis, management, and study of retinal disease. For example, the innovation and widespread application of fluorescein angiography and optical coherence tomography (OCT) have had tremendous impact on the management of retinal disorders.1,2 Recently, ultra-widefield (UWF) imaging has opened a new window into the retina, allowing the capture of greater than 80% of the fundus with a single shot.3 With montaging, much of the remaining retinal surface area can be captured.4,5 However, to maximize the potential of these new modalities, accurate quantification of the pathology they capture is critical. UR - http://www.bmctoday.net/retinatoday/pdfs/0514RT_imaging_Croft.pdf ER - TY - JOUR T1 - Automatic extraction of retinal features from colour retinal images for glaucoma diagnosis: A review JF - Computerized Medical Imaging and Graphics Y1 - 2013 A1 - Haleem, M.S. A1 - Han, L. A1 - van Hemert, J. A1 - Li, B. KW - retinal imaging AB - Glaucoma is a group of eye diseases that have common traits, such as high eye pressure, damage to the Optic Nerve Head and gradual vision loss. It affects peripheral vision and eventually leads to blindness if left untreated. The current common methods of pre-diagnosis of Glaucoma include measurement of Intra-Ocular Pressure (IOP) using a Tonometer, Pachymetry and Gonioscopy, which are performed manually by clinicians. These tests are usually followed by Optic Nerve Head (ONH) Appearance examination for the confirmed diagnosis of Glaucoma. The diagnoses require regular monitoring, which is costly and time consuming. The accuracy and reliability of diagnosis is limited by the domain knowledge of different ophthalmologists. Therefore, automatic diagnosis of Glaucoma has attracted considerable attention. This paper surveys the state of the art in automatic extraction of anatomical features from retinal images to assist early diagnosis of Glaucoma. We have conducted a critical evaluation of the existing automatic extraction methods based on features including Optic Cup to Disc Ratio (CDR), Retinal Nerve Fibre Layer (RNFL), Peripapillary Atrophy (PPA), Neuroretinal Rim Notching, Vasculature Shift, etc., which adds value to efficient feature extraction for Glaucoma diagnosis. VL - 37 SN - 0895-6111 UR - http://linkinghub.elsevier.com/retrieve/pii/S0895611113001468?showall=true ER - TY - CONF T1 - Automatic Extraction of the Optic Disc Boundary for Detecting Retinal Diseases T2 - 14th {IASTED} International Conference on Computer Graphics and Imaging (CGIM) Y1 - 2013 A1 - M.S. Haleem A1 - L. Han A1 - B. Li A1 - A. Nisbet A1 - van Hemert, J. A1 - M. Verhoek ED - L. Linsen ED - M. Kampel KW - retinal imaging AB - In this paper, we propose an algorithm based on an active shape model for the extraction of the Optic Disc boundary. The determination of the Optic Disc boundary is fundamental to the automation of retinal eye disease diagnosis because the Optic Disc Center is typically used as a reference point to locate other retinal structures, and any structural change in the Optic Disc, whether textural or geometrical, can be used to determine the occurrence of retinal diseases such as Glaucoma. The algorithm is based on determining a model for the Optic Disc boundary by learning patterns of variability from a training set of annotated Optic Discs. The model can be deformed so as to reflect the boundary of the Optic Disc in any feasible shape.
The algorithm provides some initial steps towards automation of the diagnostic process for retinal eye disease in order that more patients can be screened with consistent diagnoses. The overall accuracy of the algorithm was 92% on a set of 110 images. JF - 14th {IASTED} International Conference on Computer Graphics and Imaging (CGIM) PB - {ACTA} Press ER - TY - BOOK T1 - The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business T2 - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) Y1 - 2013 A1 - Atkinson, Malcolm P. A1 - Baxter, Robert M. A1 - Peter Brezany A1 - Oscar Corcho A1 - Michelle Galea A1 - Parsons, Mark A1 - Snelling, David A1 - van Hemert, Jano KW - Big Data KW - Data Intensive KW - data mining KW - Data Streaming KW - Databases KW - Dispel KW - Distributed Computing KW - Knowledge Discovery KW - Workflows AB - With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasising data-intensive thinking and interdisciplinary collaboration, The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book: * Outlines the concepts and rationale for implementing data-intensive computing in organisations * Covers from the ground up problem-solving strategies for data analysis in a data-rich world * Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL * Features in-depth case studies in customer relations, environmental hazards, seismology, and more * Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering * Includes sample program snippets throughout the text as well as additional materials on a companion website The DATA Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing. JF - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) PB - John Wiley & Sons Inc. SN - 978-1-118-39864-7 ER - TY - CHAP T1 - Data-Intensive Analysis T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho A1 - van Hemert, Jano ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - data mining KW - Data-Analysis Experts KW - Data-Intensive Analysis KW - Knowledge Discovery AB - Part II: "Data-intensive Knowledge Discovery", focuses on the needs of data-analysis experts. It illustrates the problem-solving strategies appropriate for a data-rich world, without delving into the details of underlying technologies. 
It should engage and inform data-analysis specialists, such as statisticians, data miners, image analysts, bio-informaticians or chemo-informaticians, and generate ideas pertinent to their application areas. Chapter 5: "Data-intensive Analysis", introduces a set of common problems that data-analysis experts often encounter, by means of a set of scenarios of increasing levels of complexity. The scenarios typify knowledge discovery challenges and the presented solutions provide practical methods; a starting point for readers addressing their own data challenges. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Data-Intensive Components and Usage Patterns T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Analysis KW - data mining KW - Data-Intensive Components KW - Registry KW - Workflow Libraries KW - Workflow Sharing AB - Chapter 7: "Data-intensive components and usage patterns", provides a systematic review of the components that are commonly used in knowledge discovery tasks as well as common patterns of component composition. That is, it introduces the processing elements from which knowledge discovery solutions are built and common composition patterns for delivering trustworthy information. It reflects on how these components and patterns are evolving in a data-intensive context. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - The Data-Intensive Survival Guide T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Analysis Experts KW - Data-Intensive Architecture KW - Data-intensive Computing KW - Data-Intensive Engineers KW - Datascopes KW - Dispel KW - Domain Experts KW - Intellectual Ramps KW - Knowledge Discovery KW - Workflows AB - Chapter 3: "The data-intensive survival guide", presents an overview of all of the elements of the proposed data-intensive strategy. Sufficient detail is presented for readers to understand the principles and practice that we recommend. It should also provide a good preparation for readers who choose to sample later chapters. It introduces three professional viewpoints: domain experts, data-analysis experts, and data-intensive engineers. Success depends on a balanced approach that develops the capacity of all three groups. A data-intensive architecture provides a flexible framework for that balanced approach. This enables the three groups to build and exploit data-intensive processes that incrementally step from data to results. A language is introduced to describe these incremental data processes from all three points of view. The chapter introduces ‘datascopes’ as the productized data-handling environments and ‘intellectual ramps’ as the ‘on ramps’ for the highways from data to knowledge. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - Data-Intensive Thinking with DISPEL T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Machines KW - Data-Intensive Thinking, Data-intensive Computing KW - Dispel KW - Distributed Computing KW - Knowledge Discovery AB - Chapter 4: "Data-intensive thinking with DISPEL", engages the reader with technical issues and solutions, by working through a sequence of examples, building up from a sketch of a solution to a large-scale data challenge. It uses the DISPEL language extensively, introducing its concepts and constructs. It shows how DISPEL may help designers, data-analysts, and engineers develop solutions to the requirements emerging in any data-intensive application domain. The reader is taken through simple steps initially, this then builds to conceptually complex steps that are necessary to cope with the realities of real data providers, real data, real distributed systems, and long-running processes. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc. ER - TY - CHAP T1 - Definition of the DISPEL Language T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Paul Martin A1 - Yaikhom, Gagarine ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Streaming KW - Data-intensive Computing KW - Dispel AB - Chapter 10: "Definition of the DISPEL language", describes the novel aspects of the DISPEL language: its constructs, capabilities, and anticipated programming style. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business T3 - {Parallel and Distributed Computing, series editor Albert Y. Zomaya} PB - John Wiley & Sons Inc. ER - TY - CHAP T1 - The Digital-Data Challenge T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson A1 - Parsons, Mark ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data KW - Data-intensive Computing, Knowledge Discovery KW - Digital Data KW - Digital-Data Revolution AB - Part I: Strategies for success in the digital-data revolution, provides an executive summary of the whole book to convince strategists, politicians, managers, and educators that our future data-intensive society requires new thinking, new behavior, new culture, and new distribution of investment and effort. This part will introduce the major concepts so that readers are equipped to discuss and steer their organization’s response to the opportunities and obligations brought by the growing wealth of data. It will help readers understand the changing context brought about by advances in digital devices, digital communication, and ubiquitous computing. Chapter 1: The digital-data challenge, will help readers to understand the challenges ahead in making good use of the data and introduce ideas that will lead to helpful strategies. A global digital-data revolution is catalyzing change in the ways in which we live, work, relax, govern, and organize. 
This is a significant change in society, as important as the invention of printing or the industrial revolution, but more challenging because it is happening globally at Internet speed. Becoming agile in adapting to this new world is essential. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - The Digital-Data Revolution T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data KW - Information KW - Knowledge KW - Knowledge Discovery KW - Social Impact of Digital Data KW - Wisdom, Data-intensive Computing AB - Chapter 2: "The digital-data revolution", reviews the relationships between data, information, knowledge, and wisdom. It analyses and quantifies the changes in technology and society that are delivering the data bonanza, and then reviews the consequential changes via representative examples in biology, Earth sciences, social sciences, leisure activity, and business. It exposes quantitative details and shows the complexity and diversity of the growing wealth of data, introducing some of its potential benefits and examples of the impediments to successfully realizing those benefits. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - DISPEL Development T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Adrian Mouat A1 - Snelling, David ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Diagnostics KW - Dispel KW - IDE KW - Libraries KW - Processing Elements AB - Chapter 11: "DISPEL development", describes the tools and libraries that a DISPEL developer might expect to use. The tools include those needed during process definition, those required to organize enactment, and diagnostic aids for developers of applications and platforms. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc. ER - TY - CHAP T1 - DISPEL Enactment T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Chee Sun Liew A1 - Krause, Amrey A1 - Snelling, David ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data Streaming KW - Data-Intensive Engineering KW - Dispel KW - Workflow Enactment AB - Chapter 12: "DISPEL enactment", describes the four stages of DISPEL enactment. It is targeted at the data-intensive engineers who implement enactment services. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Inc.
ER - TY - CHAP T1 - Foreword T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Tony Hey ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data KW - Data-intensive Computing, Knowledge Discovery JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Platforms for Data-Intensive Analysis T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Snelling, David ED - Malcolm Atkinson ED - Baxter, Robert M. ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Engineering KW - Data-Intensive Systems KW - Dispel KW - Distributed Systems AB - Part III: "Data-intensive engineering", is targeted at technical experts who will develop complex applications, new components, or data-intensive platforms. The techniques introduced may be applied very widely; for example, to any data-intensive distributed application, such as index generation, image processing, sequence comparison, text analysis, and sensor-stream monitoring. The challenges, methods, and implementation requirements are illustrated by making extensive use of DISPEL. Chapter 9: "Platforms for data-intensive analysis", gives a reprise of data-intensive architectures, examines the business case for investing in them, and introduces the stages of data-intensive workflow enactment. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Preface T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Malcolm Atkinson ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Big Data, Data-intensive Computing, Knowledge Discovery AB - Who should read the book and why. The structure and conventions used. Suggested reading paths for different categories of reader. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CHAP T1 - Problem Solving in Data-Intensive Knowledge Discovery T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho A1 - van Hemert, Jano ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Analysis Experts KW - Data-Intensive Analysis KW - Design Patterns for Knowledge Discovery KW - Knowledge Discovery AB - Chapter 6: "Problem solving in data-intensive knowledge discovery", on the basis of the previous scenarios, this chapter provides an overview of effective strategies in knowledge discovery, highlighting common problem-solving methods that apply in conventional contexts, and focusing on the similarities and differences of these methods. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. 
ER - TY - CHAP T1 - Sharing and Reuse in Knowledge Discovery T2 - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business Y1 - 2013 A1 - Oscar Corcho ED - Malcolm Atkinson ED - Rob Baxter ED - Peter Brezany ED - Oscar Corcho ED - Michelle Galea ED - Parsons, Mark ED - Snelling, David ED - van Hemert, Jano KW - Data-Intensive Analysis KW - Knowledge Discovery KW - Ontologies KW - Semantic Web KW - Sharing AB - Chapter 8: "Sharing and re-use in knowledge discovery", introduces more advanced knowledge discovery problems, and shows how improved component and pattern descriptions facilitate re-use. This supports the assembly of libraries of high level components well-adapted to classes of knowledge discovery methods or application domains. The descriptions are made more powerful by introducing notations from the semantic Web. JF - THE DATA BONANZA: Improving Knowledge Discovery for Science, Engineering and Business PB - John Wiley & Sons Ltd. ER - TY - CONF T1 - Towards automatic detection of abnormal retinal capillaries in ultra-widefield-of-view retinal angiographic exams T2 - Conf Proc IEEE Eng Med Biol Soc Y1 - 2013 A1 - Zutis, K. A1 - Trucco, E. A1 - Hubschman, J. P. A1 - Reed, D. A1 - Shah, S. A1 - van Hemert, J. KW - retinal imaging AB - Retinal capillary abnormalities include small, leaky, severely tortuous blood vessels that are associated with a variety of retinal pathologies. We present a prototype image-processing system for detecting abnormal retinal capillary regions in ultra-widefield-of-view (UWFOV) fluorescein angiography exams of the human retina. The algorithm takes as input an UWFOV FA frame and returns the candidate regions identified. An SVM classifier is trained on regions traced by expert ophthalmologists. Tests with a variety of feature sets indicate that edge features and allied properties differentiate best between normal and abnormal retinal capillary regions. Experiments with an initial set of images from patients showing branch retinal vein occlusion (BRVO) indicate promising area under the ROC curve of 0.950 and a weighted Cohen's Kappa value of 0.822. JF - Conf Proc IEEE Eng Med Biol Soc ER - TY - JOUR T1 - EnzML: multi-label prediction of enzyme classes using InterPro signatures. JF - BMC Bioinformatics Y1 - 2012 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - BACKGROUND: Manual annotation of enzymatic functions cannot keep up with automatic genome sequencing. In this work we explore the capacity of InterPro sequence signatures to automatically predict enzymatic function. RESULTS: We present EnzML, a multi-label classification method that can efficiently account also for proteins with multiple enzymatic functions: 50,000 in UniProt. EnzML was evaluated using a standard set of 300,747 proteins for which the manually curated Swiss-Prot and KEGG databases have agreeing Enzyme Commission (EC) annotations. EnzML achieved more than 98% subset accuracy (exact match of all correct Enzyme Commission classes of a protein) for the entire dataset and between 87 and 97% subset accuracy in reannotating eight entire proteomes: human, mouse, rat, mouse-ear cress, fruit fly, the S. pombe yeast, the E. coli bacterium and the M. jannaschii archaebacterium. To understand the role played by the dataset size, we compared the cross-evaluation results of smaller datasets, either constructed at random or from specific taxonomic domains such as archaea, bacteria, fungi, invertebrates, plants and vertebrates. 
The results were confirmed even when the redundancy in the dataset was reduced using UniRef100, UniRef90 or UniRef50 clusters. CONCLUSIONS: InterPro signatures are a compact and powerful attribute space for the prediction of enzymatic function. This representation makes multi-label machine learning feasible in reasonable time (30 minutes to train on 300,747 instances with 10,852 attributes and 2,201 class values) using the Mulan Binary Relevance Nearest Neighbours algorithm implementation (BR-kNN). VL - 13 ER - TY - JOUR T1 - Automatically Identifying and Annotating Mouse Embryo Gene Expression Patterns JF - Bioinformatics Y1 - 2011 A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Richard Baldock KW - classification KW - e-Science AB - Motivation: Deciphering the regulatory and developmental mechanisms for multicellular organisms requires detailed knowledge of gene interactions and gene expressions. The availability of large datasets with both spatial and ontological annotation of the spatio-temporal patterns of gene expression in the mouse embryo provides a powerful resource to discover the biological function of embryo organisation. Ontological annotation of gene expressions consists of labelling images with terms from the anatomy ontology for mouse development. If the spatial genes of an anatomical component are expressed in an image, the image is then tagged with a term of that anatomical component. The current annotation is done manually by domain experts, which is both time consuming and costly. In addition, the level of detail is variable and, inevitably, errors arise from the tedious nature of the task. In this paper, we present a new method to automatically identify and annotate gene expression patterns in the mouse embryo with anatomical terms. Results: The method takes images from in situ hybridisation studies and the ontology for the developing mouse embryo; it then combines machine learning and image processing techniques to produce classifiers that automatically identify and annotate gene expression patterns in these images. We evaluate our method on image data from the EURExpress-II study, where we use it to automatically classify nine anatomical terms: humerus, handplate, fibula, tibia, femur, ribs, petrous part, scapula and head mesenchyme. The accuracy of our method lies between 70% and 80% with few exceptions. Conclusions: We show that other known methods have lower classification performance than ours. We have investigated the images misclassified by our method and found several cases where the original annotation was not correct. This shows our method is robust against this kind of noise. Availability: The annotation result and the experimental dataset in the paper can be freely accessed at http://www2.docm.mmu.ac.uk/STAFF/L.Han/geneannotation/ Contact: l.han@mmu.ac.uk, j.vanhemert@ed.ac.uk and Richard.Baldock@hgu.mrc.ac.uk VL - 27 UR - http://bioinformatics.oxfordjournals.org/content/early/2011/02/25/bioinformatics.btr105.abstract ER - TY - JOUR T1 - Discovering the suitability of optimisation algorithms by learning from evolved instances JF - Annals of Mathematics and Artificial Intelligence Y1 - 2011 A1 - K. Smith-Miles A1 - {van Hemert}, J. I. KW - problem evolving VL - Online First UR - http://www.springerlink.com/content/6x83q3201gg71554/ ER - TY - JOUR T1 - Generating web-based user interfaces for computational science JF - Concurrency and Computation: Practice and Experience Y1 - 2011 A1 - van Hemert, J. A1 - Koetsier, J. A1 - Torterolo, L. A1 - Porro, I. A1 - Melato, M.
A1 - Barbera, R. AB - Scientific gateways in the form of web portals are becoming the popular approach to share knowledge and resources around a topic in a community of researchers. Unfortunately, the development of web portals is expensive and requires specialist skills. Commercial and more generic web portals have a much larger user base and can afford this kind of development. Here we present two solutions that address this problem in the area of portals for scientific computing; both take the same approach. The whole process of designing, delivering and maintaining a portal can be made more cost-effective by generating a portal from a description rather than programming in the traditional sense. We present four successful use cases to show how this process works and the results it can deliver. PB - Wiley VL - 23 ER - TY - JOUR T1 - A Generic Parallel Processing Model for Facilitating Data Mining and Integration JF - Parallel Computing Y1 - 2011 A1 - Liangxiu Han A1 - Chee Sun Liew A1 - van Hemert, Jano A1 - Malcolm Atkinson KW - Data Mining and Data Integration (DMI) KW - Life Sciences KW - OGSA-DAI KW - Parallelism KW - Pipeline Streaming KW - workflow AB - To facilitate Data Mining and Integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be in memory, buffered via disks or inter-computer data-flows. This makes it possible to build arbitrary DAGs with pipelining and both data and task parallelism, which provides room for performance enhancement. We have applied this approach to a real DMI case in the Life Sciences and implemented a prototype. To demonstrate feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that a linear speedup has been achieved with the increase of the number of distributed computing nodes in this case study. PB - Elsevier VL - 37 IS - 3 ER - TY - JOUR T1 - Managing dynamic enterprise and urgent workloads on clouds using layered queuing and historical performance models JF - Simulation Modelling Practice and Theory Y1 - 2011 A1 - David A. Bacigalupo A1 - van Hemert, Jano I. A1 - Xiaoyu Chen A1 - Asif Usmani A1 - Adam P. Chester A1 - Ligang He A1 - Donna N. Dillenberger A1 - Gary B. Wills A1 - Lester Gilbert A1 - Stephen A. Jarvis KW - e-Science AB - The automatic allocation of enterprise workload to resources can be enhanced by being able to make what-if response time predictions whilst different allocations are being considered. We experimentally investigate an historical and a layered queuing performance model and show how they can provide a good level of support for a dynamic-urgent cloud environment. Using this we define, implement and experimentally investigate the effectiveness of a prediction-based cloud workload and resource management algorithm. Based on these experimental analyses we: (i) comparatively evaluate the layered queuing and historical techniques; (ii) evaluate the effectiveness of the management algorithm in different operating scenarios; and (iii) provide guidance on using prediction-based workload and resource management.
VL - 19 ER - TY - JOUR T1 - Performance database: capturing data for optimizing distributed streaming workflows JF - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences Y1 - 2011 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. A1 - Radoslaw Ostrowski A1 - Murray Cole A1 - van Hemert, Jano I. A1 - Liangxiu Han KW - measurement framework KW - performance data KW - streaming workflows AB - The performance database (PDB) stores performance-related data gathered during workflow enactment. We argue that by carefully understanding and manipulating this data, we can improve efficiency when enacting workflows. This paper describes the rationale behind the PDB, and proposes a systematic way to implement it. The prototype is built as part of the Advanced Data Mining and Integration Research for Europe project. We use workflows from real-world experiments to demonstrate the usage of PDB. VL - 369 IS - 1949 ER - TY - CONF T1 - RapidBrain: Developing a Portal for Brain Research Imaging T2 - All Hands Meeting 2011, York Y1 - 2011 A1 - Kenton D'Mellow A1 - Rodríguez, David A1 - Carpenter, Trevor A1 - Jos Koetsier A1 - Dominic Job A1 - van Hemert, Jano A1 - Wardlaw, Joanna A1 - Fan Zhu AB - Brain imaging researchers execute complex multistep workflows in their computational analysis. Those workflows often include applications that have very different user interfaces and sometimes use different data formats. A good example is the brain perfusion quantification workflow used at the BRIC (Brain Research Imaging Centre) in Edinburgh. Rapid provides an easy method for creating portlets for computational jobs, and at the same time it is extensible. We have exploited this extensibility with additions that stretch the functionality beyond the original limits. These changes can be used by other projects to create their own portals, but it should be noted that the development of such portals involves a greater effort than that required in the regular use of Rapid for creating portlets. In our case it has been used to provide a user-friendly interface for perfusion analysis that covers from volume JF - All Hands Meeting 2011, York CY - York ER - TY - JOUR T1 - Special Issue: Portals for life sciences---Providing intuitive access to bioinformatic tools JF - Concurrency and Computation: Practice and Experience Y1 - 2011 A1 - Gesing, Sandra A1 - van Hemert, J. A1 - Kacsuk, P. A1 - Kohlbacher, O. AB - The topic ‘Portals for life sciences’ includes various research fields, on the one hand many different topics out of life sciences, e.g. mass spectrometry, on the other hand portal technologies and different aspects of computer science, such as usability of user interfaces and security of systems. The main aspect about portals is to simplify the user’s interaction with computational resources that are concerted to a supported application domain. PB - Wiley VL - 23 IS - 23 ER - TY - JOUR T1 - A user-friendly web portal for T-Coffee on supercomputers JF - BMC Bioinformatics Y1 - 2011 A1 - J. Rius A1 - F. Cores A1 - F. Solsona A1 - van Hemert, J. I. A1 - Koetsier, J. A1 - C. Notredame KW - e-Science KW - portal KW - rapid AB - Background Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of large-scale sequence alignments.
It can be run on distributed memory clusters allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed with Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. Conclusions The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution. VL - 12 UR - http://www.biomedcentral.com/1471-2105/12/150 ER - TY - JOUR T1 - Validation and mismatch repair of workflows through typed data streams JF - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences Y1 - 2011 A1 - Yaikhom, Gagarine A1 - Malcolm Atkinson A1 - van Hemert, Jano A1 - Oscar Corcho A1 - Krause, Amy AB - The type system of a language guarantees that all of the operations on a set of data comply with the rules and conditions set by the language. While language typing is a fundamental requirement for any programming language, the typing of data that flow between processing elements within a workflow is currently being treated as optional. In this paper, we introduce a three-level type system for typing workflow data streams. These types are part of the Data-Intensive Systems Process Engineering Language (DISPEL), which empowers users with the ability to validate the connections inside a workflow composition, and apply appropriate data type conversions when necessary. Furthermore, this system enables the enactment engine to carry out type-directed workflow optimizations. VL - 369 IS - 1949 ER - TY - CONF T1 - Accelerating Data-Intensive Applications: a Cloud Computing Approach to Image Pattern Recognition Tasks T2 - The Fourth International Conference on Advanced Engineering Computing and Applications in Sciences Y1 - 2010 A1 - Han, L. A1 - Saengngam, T. A1 - van Hemert, J. JF - The Fourth International Conference on Advanced Engineering Computing and Applications in Sciences ER - TY - JOUR T1 - Correcting for intra-experiment variation in Illumina BeadChip data is necessary to generate robust gene-expression profiles JF - BMC Genomics Y1 - 2010 A1 - R. R. Kitchen A1 - V. S. Sabine A1 - A. H. Sims A1 - E. J. Macaskill A1 - L. Renshaw A1 - J. S. Thomas A1 - van Hemert, J. I. A1 - J. M. Dixon A1 - J. M. S. Bartlett AB - Background Microarray technology is a popular means of producing whole genome transcriptional profiles; however, high cost and scarcity of mRNA have led many studies to be conducted based on the analysis of single samples. We exploit the design of the Illumina platform, specifically multiple arrays on each chip, to evaluate intra-experiment technical variation using repeated hybridisations of universal human reference RNA (UHRR) and duplicate hybridisations of primary breast tumour samples from a clinical study. Results A clear batch-specific bias was detected in the measured expressions of both the UHRR and clinical samples. This bias was found to persist following standard microarray normalisation techniques.
However, when mean-centering or empirical Bayes batch-correction methods (ComBat) were applied to the data, inter-batch variation in the UHRR and clinical samples was greatly reduced. Correlation between replicate UHRR samples improved by two orders of magnitude following batch-correction using ComBat (ranging from 0.9833-0.9991 to 0.9997-0.9999) and increased the consistency of the gene-lists from the duplicate clinical samples, from 11.6% in quantile normalised data to 66.4% in batch-corrected data. The use of UHRR as an inter-batch calibrator provided a small additional benefit when used in conjunction with ComBat, further increasing the agreement between the two gene-lists, up to 74.1%. Conclusion In the interests of practicality and cost, these results suggest that single samples can generate reliable data, but only after careful compensation for technical bias in the experiment. We recommend that investigators appreciate the propensity for such variation in the design stages of a microarray experiment and that the use of suitable correction methods become routine during the statistical analysis of the data. VL - 11 UR - http://www.biomedcentral.com/1471-2164/11/134 IS - 134 ER - TY - RPRT T1 - Data-Intensive Research Workshop (15-19 March 2010) Report Y1 - 2010 A1 - Malcolm Atkinson A1 - Roure, David De A1 - van Hemert, Jano A1 - Shantenu Jha A1 - Ruth McNally A1 - Robert Mann A1 - Stratis Viglas A1 - Chris Williams KW - Data-intensive Computing KW - Data-Intensive Machines KW - Machine Learning KW - Scientific Databases AB - We met at the National e-Science Institute in Edinburgh on 15-19 March 2010 to develop our understanding of DIR. Approximately 100 participants (see Appendix A) worked together to develop their own understanding, and we are offering this report as the first step in communicating that to a wider community. We present this in terms of our developing/emerging understanding of "What is DIR?" and "Why is it important?". We then review the status of the field, report what the workshop achieved and what remains as open questions. JF - National e-Science Centre PB - Data-Intensive Research Group, School of Informatics, University of Edinburgh CY - Edinburgh ER - TY - Generic T1 - Federated Enactment of Workflow Patterns T2 - Lecture Notes in Computer Science Y1 - 2010 A1 - Yaikhom, Gagarine A1 - Liew, Chee A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Malcolm Atkinson A1 - Krause, Amy ED - D’Ambra, Pasqua ED - Guarracino, Mario ED - Talia, Domenico AB - In this paper we address two research questions concerning workflows: 1) how do we abstract and catalogue recurring workflow patterns?; and 2) how do we facilitate optimisation of the mapping from workflow patterns to actual resources at runtime? Our aim here is to explore techniques that are applicable to large-scale workflow compositions, where the resources could change dynamically during the lifetime of an application. We achieve this by introducing a registry-based mechanism where pattern abstractions are catalogued and stored. In conjunction with an enactment engine, which communicates with this registry, concrete computational implementations and resources are assigned to these patterns, conditional to the execution parameters. Using a data mining application from the life sciences, we demonstrate this new approach.
JF - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6271 UR - http://dx.doi.org/10.1007/978-3-642-15277-1_31 N1 - 10.1007/978-3-642-15277-1_31 ER - TY - CHAP T1 - Molecular Orbital Calculations of Inorganic Compounds T2 - Inorganic Experiments Y1 - 2010 A1 - C. A. Morrison A1 - N. Robertson A1 - Turner, A. A1 - van Hemert, J. A1 - Koetsier, J. ED - J. Derek Woollins JF - Inorganic Experiments PB - Wiley-VCH SN - 978-3527292530 ER - TY - JOUR T1 - An open source toolkit for medical imaging de-identification JF - European Radiology Y1 - 2010 A1 - Rodríguez, David A1 - Carpenter, Trevor K. A1 - van Hemert, Jano I. A1 - Wardlaw, Joanna M. KW - Anonymisation KW - Data Protection Act (DPA) KW - De-identification KW - Digital Imaging and Communications in Medicine (DICOM) KW - Privacy policies KW - Pseudonymisation KW - Toolkit AB - Objective Medical imaging acquired for clinical purposes can have several legitimate secondary uses in research projects and teaching libraries. No commonly accepted solution for anonymising these images exists because the amount of personal data that should be preserved varies case by case. Our objective is to provide a flexible mechanism for anonymising Digital Imaging and Communications in Medicine (DICOM) data that meets the requirements for deployment in multicentre trials. Methods We reviewed our current de-identification practices and defined the relevant use cases to extract the requirements for the de-identification process. We then used these requirements in the design and implementation of the toolkit. Finally, we tested the toolkit taking as a reference those requirements, including a multicentre deployment. Results The toolkit successfully anonymised DICOM data from various sources. Furthermore, it was shown that it could forward anonymous data to remote destinations, remove burned-in annotations, and add tracking information to the header. The toolkit also implements the DICOM standard confidentiality mechanism. Conclusion A DICOM de-identification toolkit that facilitates the enforcement of privacy policies was developed. It is highly extensible, provides the necessary flexibility to account for different de-identification requirements and has a low adoption barrier for new users. VL - 20 UR - http://www.springerlink.com/content/j20844338623m167/ IS - 8 ER - TY - CONF T1 - Resource management of enterprise cloud systems using layered queuing and historical performance models T2 - IEEE International Symposium on Parallel Distributed Processing Y1 - 2010 A1 - Bacigalupo, D. A. A1 - van Hemert, J. A1 - Usmani, A. A1 - Dillenberger, D. N. A1 - Wills, G. B. A1 - Jarvis, S. A. KW - e-Science AB - The automatic allocation of enterprise workload to resources can be enhanced by being able to make `what-if' response time predictions, whilst different allocations are being considered. It is important to quantitatively compare the effectiveness of different prediction techniques for use in cloud infrastructures. To help make the comparison of relevance to a wide range of possible cloud environments it is useful to consider the following. 1.) urgent cloud customers such as the emergency services that can demand cloud resources at short notice (e.g. for our FireGrid emergency response software). 2.) dynamic enterprise systems, that must rapidly adapt to frequent changes in workload, system configuration and/or available cloud servers. 3.) 
The use of the predictions in a coordinated manner by both the cloud infrastructure and cloud customer management systems. 4.) A broad range of criteria for evaluating each technique. However, there have been no previous comparisons meeting these requirements. This paper, meeting the above requirements, quantitatively compares the layered queuing and ("HYDRA") historical techniques - including our initial thoughts on how they could be combined. Supporting results and experiments include the following: i.) defining, investigating and hence providing guidelines on the use of a historical and layered queuing model; ii.) using these guidelines showing that both techniques can make low overhead and typically over 70% accurate predictions, for new server architectures for which only a small number of benchmarks have been run; and iii.) defining and investigating tuning a prediction-based cloud workload and resource management algorithm. JF - IEEE International Symposium on Parallel Distributed Processing ER - TY - CONF T1 - TOPP goes Rapid T2 - Cluster Computing and the Grid, IEEE International Symposium on Y1 - 2010 A1 - Gesing, Sandra A1 - van Hemert, Jano A1 - Jos Koetsier A1 - Bertsch, Andreas A1 - Kohlbacher, Oliver AB - Proteomics, the study of all the proteins contained in a particular sample, e.g., a cell, is a key technology in current biomedical research. The complexity and volume of proteomics data sets produced by mass spectrometric methods clearly suggests the use of grid-based high-performance computing for analysis. TOPP and OpenMS are open-source packages for proteomics data analysis; however, they do not provide support for Grid computing. In this work we present a portal interface for high-throughput data analysis with TOPP. The portal is based on Rapid, a tool for efficiently generating standardized portlets for a wide range of applications. The web-based interface allows the creation and editing of user-defined pipelines and their execution and monitoring on a Grid infrastructure. The portal also supports several file transfer protocols for data staging. It thus provides a simple and complete solution to high-throughput proteomics data analysis for inexperienced users through a convenient portal interface. JF - Cluster Computing and the Grid, IEEE International Symposium on PB - IEEE Computer Society CY - Los Alamitos, CA, USA SN - 978-0-7695-4039-9 ER - TY - CONF T1 - Towards Optimising Distributed Data Streaming Graphs using Parallel Streams T2 - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing Y1 - 2010 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. A1 - van Hemert, Jano A1 - Liangxiu Han KW - Data-intensive Computing KW - Distributed Computing KW - Optimisation KW - Parallel Stream KW - Scientific Workflows AB - Modern scientific collaborations have opened up the opportunity of solving complex problems that involve multidisciplinary expertise and large-scale computational experiments. These experiments usually involve large amounts of data that are located in distributed data repositories running various software systems, and managed by different organisations. A common strategy to make the experiments more manageable is executing the processing steps as a workflow. In this paper, we look into the implementation of fine-grained data-flow between computational elements in a scientific workflow as streams.
We model the distributed computation as a directed acyclic graph where the nodes represent the processing elements that incrementally implement specific subtasks. The processing elements are connected in a pipelined streaming manner, which allows task executions to overlap. We further optimise the execution by splitting pipelines across processes and by introducing extra parallel streams. We identify performance metrics and design a measurement tool to evaluate each enactment. We conducted experiments to evaluate our optimisation strategies with a real world problem in the Life Sciences—EURExpress-II. The paper presents our distributed data-handling model, the optimisation and instrumentation strategies and the evaluation experiments. We demonstrate linear speed up and argue that this use of data-streaming to enable both overlapped pipeline and parallelised enactment is a generally applicable optimisation strategy. JF - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing PB - ACM CY - Chicago, Illinois UR - http://www.cct.lsu.edu/~kosar/didc10/index.php ER - TY - CONF T1 - Understanding TSP Difficulty by Learning from Evolved Instances T2 - Lecture Notes in Computer Science Y1 - 2010 A1 - Smith-Miles, Kate A1 - van Hemert, Jano A1 - Lim, Xin ED - Blum, Christian ED - Battiti, Roberto AB - Whether the goal is performance prediction, or insights into the relationships between algorithm performance and instance characteristics, a comprehensive set of meta-data from which relationships can be learned is needed. This paper provides a methodology to determine if the meta-data is sufficient, and demonstrates the critical role played by instance generation methods. Instances of the Travelling Salesman Problem (TSP) are evolved using an evolutionary algorithm to produce distinct classes of instances that are intentionally easy or hard for certain algorithms. A comprehensive set of features is used to characterise instances of the TSP, and the impact of these features on difficulty for each algorithm is analysed. Finally, performance predictions are achieved with high accuracy on unseen instances for predicting search effort as well as identifying the algorithm likely to perform best. JF - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6073 UR - http://dx.doi.org/10.1007/978-3-642-13800-3_29 N1 - 10.1007/978-3-642-13800-3_29 ER - TY - RPRT T1 - ADMIRE D1.5 – Report defining an iteration of the model and language: PM3 and DL3 Y1 - 2009 A1 - Peter Brezany A1 - Ivan Janciak A1 - Alexander Woehrer A1 - Carlos Buil Aranda A1 - Malcolm Atkinson A1 - van Hemert, Jano AB - This document is the third deliverable to report on the progress of the model, language and ontology research conducted within Workpackage 1 of the ADMIRE project. Significant progress has been made on each of the above areas. The new results that we achieved are recorded against the targets defined for project month 18 and are reported in four sections of this document. PB - ADMIRE project UR - http://www.admire-project.eu/docs/ADMIRE-D1.5-model-language-ontology.pdf ER - TY - CONF T1 - Advanced Data Mining and Integration Research for Europe T2 - All Hands Meeting 2009 Y1 - 2009 A1 - Atkinson, M. A1 - Brezany, P. A1 - Corcho, O. A1 - Han, L. A1 - van Hemert, J. A1 - Hluchy, L. A1 - Hume, A. A1 - Janciak, I. A1 - Krause, A. A1 - Snelling, D. A1 - Wöhrer, A. AB - There is a rapidly growing wealth of data [1].
The number of sources of data is increasing, while, at the same time, the diversity, complexity and scale of these data resources are also increasing dramatically. This cornucopia of data offers much potential; a combinatorial explosion of opportunities for knowledge discovery, improved decisions and better policies. Today, most of these opportunities are not realised because composing data from multiple sources and extracting information is too difficult. Every business, organisation and government faces problems that can only be addressed successfully if we improve our techniques for exploiting the data we gather. JF - All Hands Meeting 2009 CY - Oxford ER - TY - CONF T1 - Automating Gene Expression Annotation for Mouse Embryo T2 - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) Y1 - 2009 A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Richard Baldock A1 - Atkinson, Malcolm P. ED - Ronghuai Huang ED - Qiang Yang ED - Jian Pei ED - et al JF - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) PB - Springer VL - LNAI 5678 ER - TY - JOUR T1 - The Circulate Architecture: Avoiding Workflow Bottlenecks Caused By Centralised Orchestration JF - Cluster Computing Y1 - 2009 A1 - Barker, A. A1 - Weissman, J. A1 - van Hemert, J. I. KW - grid computing KW - workflow VL - 12 UR - http://www.springerlink.com/content/080q5857711w2054/?p=824749739c6a432ea95a0c3b59f4025f&pi=1 ER - TY - CONF T1 - A Distributed Architecture for Data Mining and Integration T2 - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing Y1 - 2009 A1 - Atkinson, Malcolm P. A1 - van Hemert, Jano A1 - Liangxiu Han A1 - Ally Hume A1 - Chee Sun Liew AB - This paper presents the rationale for a new architecture to support a significant increase in the scale of data integration and data mining. It proposes the composition into one framework of (1) data mining and (2) data access and integration. We name the combined activity “DMI”. It supports enactment of DMI processes across heterogeneous and distributed data resources and data mining services. It posits that a useful division can be made between the facilities established to support the definition of DMI processes and the computational infrastructure provided to enact DMI processes. Communication between those two divisions is restricted to requests submitted to gateway services in a canonical DMI language. Larger-scale processes are enabled by incremental refinement of DMI-process definitions, often by recomposition of lower-level definitions. Autonomous types and descriptions will support detection of inconsistencies and semi-automatic insertion of adaptations. These architectural ideas are being evaluated in a feasibility study that involves an application scenario and representatives of the community. JF - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing PB - ACM ER - TY - RPRT T1 - An e-Infrastructure for Collaborative Research in Human Embryo Development Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A. A1 - Atkinson, Malcolm P. AB - Within the context of the EU Design Study Developmental Gene Expression Map, we identify a set of challenges when facilitating collaborative research on early human embryo development.
These challenges bring forth requirements, for which we have identified solutions and technology. We summarise our solutions and demonstrate how they integrate to form an e-infrastructure to support collaborative research in this area of developmental biology. UR - http://arxiv.org/pdf/0901.2310v1 ER - TY - CONF T1 - An E-infrastructure to Support Collaborative Embryo Research T2 - Cluster Computing and the Grid Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A. A1 - Atkinson, Malcolm P. JF - Cluster Computing and the Grid PB - IEEE Computer Society SN - 978-0-7695-3622-4 ER - TY - CHAP T1 - Exploiting Fruitful Regions in Dynamic Routing using Evolutionary Computation T2 - Studies in Computational Intelligence Y1 - 2009 A1 - van Hemert, J. I. A1 - la Poutré, J. A. ED - Pereira Baptista, F. ED - Tavares, J. JF - Studies in Computational Intelligence PB - Springer VL - 161 SN - 978-3-540-85151-6 N1 - Awaiting publication (due October 2008) ER - TY - JOUR T1 - Giving Computational Science a Friendly Face JF - Zero-In Y1 - 2009 A1 - van Hemert, J. I. A1 - Koetsier, J. AB - Today, most researchers from any discipline will successfully use web-based e-commerce systems to book flights to attend their conferences. But when these same researchers are confronted with compute-intensive problems, they cannot expect elaborate web-based systems to enable their domain-specific tasks. VL - 1 UR - http://www.beliefproject.org/zero-in/zero-in-third-edition/zero-in-issue-3 IS - 3 ER - TY - CONF T1 - A model of social collaboration in Molecular Biology knowledge bases T2 - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) Y1 - 2009 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - Manual annotation of biological data cannot keep up with data production. Open annotation models using wikis have been proposed to address this problem. In this empirical study we analyse 36 years of knowledge collection by 738 authors in two Molecular Biology wikis (EcoliWiki and WikiPathways) and two knowledge bases (OMIM and Reactome). We first investigate authorship metrics (authors per entry and edits per author) which are power-law distributed in Wikipedia and we find they are heavy-tailed in these four systems too. We also find surprising similarities between the open (editing open to everyone) and the closed systems (expert curators only). Secondly, to discriminate between driving forces in the measured distributions, we simulate the curation process and find that knowledge overlap among authors can drive the number of authors per entry, while the time the users spend on the knowledge base can drive the number of contributions per author. JF - Proceedings of the 6th Conference of the European Social Simulation Association (ESSA'09) PB - European Social Simulation Association ER - TY - CONF T1 - Portals for Life Sciences—a Brief Introduction T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - Kohlbacher, O. A1 - van Hemert, J. I. AB - The topic "Portals for Life Sciences" includes various research fields: on the one hand, many different topics from the life sciences, e.g. mass spectrometry; on the other hand, portal technologies and different aspects of computer science, such as the usability of user interfaces and the security of systems.
The main aim of portals is to simplify the user’s interaction with computational resources which are concerted to a supported application domain. JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings UR - http://ceur-ws.org/Vol-513/paper01.pdf ER - TY - Generic T1 - Proceedings of the 1st International Workshop on Portals for Life Sciences T2 - IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Gesing, Sandra A1 - van Hemert, Jano I. JF - IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings CY - e-Science Institute, Edinburgh, UK UR - http://ceur-ws.org/Vol-513 ER - TY - CONF T1 - Rapid chemistry portals through engaging researchers T2 - Fifth IEEE International Conference on e-Science Y1 - 2009 A1 - Koetsier, J. A1 - Turner, A. A1 - Richardson, P. A1 - van Hemert, J. I. ED - Trefethen, A. ED - De Roure, D. AB - In this study, we apply a methodology for rapid development of portlets for scientific computing to the domain of computational chemistry. We report results in terms of the portals delivered, the changes made to our methodology and the experience gained in terms of interaction with domain-specialists. Our major contributions are: several web portals for teaching and research in computational chemistry; a successful transition to having our development tool used by the domain specialist rather than by us, the developers; and an updated version of our methodology and technology for rapid development of portlets for computational science, which is free for anyone to pick up and use. JF - Fifth IEEE International Conference on e-Science CY - Oxford, UK ER - TY - CONF T1 - Rapid development of computational science portals T2 - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences Y1 - 2009 A1 - Koetsier, J. A1 - van Hemert, J. I. ED - Gesing, S. ED - van Hemert, J. I. KW - portal JF - Proceedings of the IWPLS09 International Workshop on Portals for Life Sciences T3 - CEUR Workshop Proceedings PB - e-Science Institute CY - Edinburgh UR - http://ceur-ws.org/Vol-513/paper05.pdf ER - TY - JOUR T1 - Towards a Virtual Fly Brain JF - Philosophical Transactions A Y1 - 2009 A1 - Armstrong, J. D. A1 - van Hemert, J. I. KW - e-Science AB - Models of the brain that simulate sensory input, behavioural output and information processing in a biologically plausible manner pose significant challenges to both Computer Science and Biology. Here we investigated strategies that could be used to create a model of the insect brain, specifically that of Drosophila melanogaster which is very widely used in laboratory research. The scale of the problem is an order of magnitude above the most complex of the current simulation projects and it is further constrained by the relative sparsity of available electrophysiological recordings from the fly nervous system. However, fly brain research at the anatomical and behavioural level offers some interesting opportunities that could be exploited to create a functional simulation. We propose to exploit these strengths of Drosophila CNS research to focus on a functional model that maps biologically plausible network architecture onto phenotypic data from neuronal inhibition and stimulation studies, leaving aside biophysical modelling of individual neuronal activity for future models until more data is available.
VL - 367 UR - http://rsta.royalsocietypublishing.org/content/367/1896/2387.abstract ER - TY - CONF T1 - Using architectural simulation models to aid the design of data intensive application T2 - The Third International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2009) Y1 - 2009 A1 - Javier Fernández A1 - Liangxiu Han A1 - Alberto Nuñez A1 - Jesus Carretero A1 - van Hemert, Jano JF - The Third International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2009) PB - IEEE Computer Society CY - Sliema, Malta ER - TY - JOUR T1 - Using the DCC Lifecycle Model to Curate a Gene Expression Database: A Case Study JF - International Journal of Digital Curation Y1 - 2009 A1 - O’Donoghue, J. A1 - van Hemert, J. I. AB - Developmental Gene Expression Map (DGEMap) is an EU-funded Design Study, which will accelerate an integrated European approach to gene expression in early human development. As part of this design study, we have had to address the challenges and issues raised by the long-term curation of such a resource. As this project is primarily one of data creators, learning about curation, we have been looking at some of the models and tools that are already available in the digital curation field in order to inform our thinking on how we should proceed with curating DGEMap. This has led us to uncover a wide range of resources for data creators and curators alike. Here we will discuss the future curation of DGEMap as a case study. We believe our experience could be instructive to other projects looking to improve the curation and management of their data. PB - UKOLN VL - 4 UR - http://www.ijdc.net/index.php/ijdc/article/view/134 IS - 3 ER - TY - CHAP T1 - Contraction-Based Heuristics to Improve the Efficiency of Algorithms Solving the Graph Colouring Problem T2 - Studies in Computational Intelligence Y1 - 2008 A1 - Juhos, I. A1 - van Hemert, J. I. ED - Cotta, C. ED - van Hemert, J. I. KW - constraint satisfaction KW - evolutionary computation KW - graph colouring JF - Studies in Computational Intelligence PB - Springer ER - TY - CONF T1 - Eliminating the Middle Man: Peer-to-Peer Dataflow T2 - HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing Y1 - 2008 A1 - Barker, Adam A1 - Weissman, Jon B. A1 - van Hemert, Jano KW - grid computing KW - workflow JF - HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing PB - ACM ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2008 A1 - Di Chio, Cecilia A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Di Chio, Cecilia ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students. 
ER - TY - Generic T1 - Evolutionary Computation in Combinatorial Optimization, 8th European Conference T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - van Hemert, Jano A1 - Cotta, Carlos ED - van Hemert, Jano ED - Cotta, Carlos KW - evolutionary computation AB - Metaheuristics have been shown to be effective for difficult combinatorial optimization problems appearing in various industrial, economical, and scientific domains. Prominent examples of metaheuristics are evolutionary algorithms, tabu search, simulated annealing, scatter search, memetic algorithms, variable neighborhood search, iterated local search, greedy randomized adaptive search procedures, ant colony optimization and estimation of distribution algorithms. Problems solved successfully include scheduling, timetabling, network design, transportation and distribution, vehicle routing, the travelling salesman problem, packing and cutting, satisfiability and general mixed integer programming. EvoCOP began in 2001 and has been held annually since then. It is the first event specifically dedicated to the application of evolutionary computation and related methods to combinatorial optimization problems. Originally held as a workshop, EvoCOP became a conference in 2004. The events gave researchers an excellent opportunity to present their latest research and to discuss current developments and applications. Following the general trend of hybrid metaheuristics and diminishing boundaries between the different classes of metaheuristics, EvoCOP has broadened its scope over the last years and invited submissions on any kind of metaheuristic for combinatorial optimization. JF - Lecture Notes in Computer Science PB - Springer VL - LNCS 4972 ER - TY - CONF T1 - Graph Colouring Heuristics Guided by Higher Order Graph Properties T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - Juhos, István A1 - van Hemert, Jano ED - van Hemert, Jano ED - Cotta, Carlos KW - evolutionary computation KW - graph colouring AB - Graph vertex colouring can be defined in such a way that colour assignments are replaced by vertex contractions. We present various hyper-graph representations for the graph colouring problem, all based on the approach where vertices are merged into groups. In this paper, we show this provides a uniform and compact way to define algorithms, both of a complete and a heuristic nature. Moreover, the representation provides information useful to guide algorithms during their search. In this paper we focus on the quality of solutions obtained by graph colouring heuristics that make use of higher order properties derived during the search. An evolutionary algorithm is used to search permutations of possible merge orderings. JF - Lecture Notes in Computer Science PB - Springer VL - 4972 ER - TY - CONF T1 - Matching Spatial Regions with Combinations of Interacting Gene Expression Patterns T2 - Communications in Computer and Information Science Y1 - 2008 A1 - van Hemert, J. I. A1 - Baldock, R. A. ED - M. Elloumi ED - et al KW - biomedical KW - data mining KW - DGEMap KW - e-Science AB - The Edinburgh Mouse Atlas aims to capture in-situ gene expression patterns in a common spatial framework. In this study, we construct a grammar to define spatial regions by combinations of these patterns. Combinations are formed by applying operators to curated gene expression patterns from the atlas, thereby resembling gene interactions in a spatial context.
The space of combinations is searched using an evolutionary algorithm with the objective of finding the best match to a given target pattern. We evaluate the method by testing its robustness and the statistical significance of the results it finds. JF - Communications in Computer and Information Science PB - Springer Verlag ER - TY - CONF T1 - Orchestrating Data-Centric Workflows T2 - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) Y1 - 2008 A1 - Barker, Adam A1 - Weissman, Jon B. A1 - van Hemert, Jano KW - grid computing KW - workflow JF - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) PB - IEEE Computer Society ER - TY - BOOK T1 - Recent Advances in Evolutionary Computation for Combinatorial Optimization T2 - Studies in Computational Intelligence Y1 - 2008 A1 - Cotta, Carlos A1 - van Hemert, Jano AB - Combinatorial optimisation is a ubiquitous discipline whose usefulness spans vast application domains. The intrinsic complexity of most combinatorial optimisation problems makes classical methods unaffordable in many cases. Acquiring practical solutions to these problems requires the use of metaheuristic approaches that trade completeness for pragmatic effectiveness. Such approaches are able to provide optimal or quasi-optimal solutions to a plethora of difficult combinatorial optimisation problems. The application of metaheuristics to combinatorial optimisation is an active field in which new theoretical developments, new algorithmic models, and new application areas are continuously emerging. This volume presents recent advances in the area of metaheuristic combinatorial optimisation, with a special focus on evolutionary computation methods. Moreover, it addresses local search methods and hybrid approaches. In this sense, the book includes cutting-edge theoretical, methodological, algorithmic and applied developments in the field, from respected experts and with a sound perspective. JF - Studies in Computational Intelligence PB - Springer VL - 153 SN - 978-3-540-70806-3 UR - http://www.springer.com/engineering/book/978-3-540-70806-3 ER - TY - CONF T1 - Scientific Workflow: A Survey and Research Directions T2 - Lecture Notes in Computer Science Y1 - 2008 A1 - Barker, Adam A1 - van Hemert, Jano KW - e-Science KW - workflow AB - Workflow technologies are emerging as the dominant approach to coordinate groups of distributed services. However, with a space filled with competing specifications, standards and frameworks from multiple domains, choosing the right tool for the job is not always a straightforward task. Researchers are often unaware of the range of technology that already exists and focus on implementing yet another proprietary workflow system. As an antidote to this common problem, this paper presents a concise survey of existing workflow technology from the business and scientific domain and makes a number of key suggestions towards the future development of scientific workflow systems. JF - Lecture Notes in Computer Science PB - Springer VL - 4967 UR - http://dx.doi.org/10.1007/978-3-540-68111-3_78 ER - TY - CONF T1 - WikiSim: simulating knowledge collection and curation in structured wikis.
T2 - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal Y1 - 2008 A1 - De Ferrari, Luna A1 - Stuart Aitken A1 - van Hemert, Jano A1 - Goryanin, Igor AB - The aim of this work is to model quantitatively one of the main properties of wikis: how high-quality knowledge can emerge from the individual work of independent volunteers. The approach chosen is to simulate knowledge collection and curation in wikis. The basic model represents the wiki as a set of true/false values, added and edited at each simulation round by software agents (users) following a fixed set of rules. The resulting WikiSim simulations already manage to reach distributions of edits and user contributions very close to those reported for Wikipedia. WikiSim can also span conditions not easily measurable in real-life wikis, such as the impact of various amounts of user mistakes. WikiSim could be extended to model wiki software features, such as discussion pages and watch lists, while monitoring the impact they have on user actions and consensus, and their effect on knowledge quality. The method could also be used to compare wikis with other curation scenarios based on centralised editing by experts. The future challenges for WikiSim will be to find appropriate ways to evaluate and validate the models and to keep them simple while still capturing relevant properties of wiki systems. JF - Proceedings of the 2008 International Symposium on Wikis in Porto, Portugal PB - ACM CY - New York, NY, USA ER - TY - CONF T1 - Data Integration in eHealth: A Domain/Disease Specific Roadmap T2 - Studies in Health Technology and Informatics Y1 - 2007 A1 - Ure, J. A1 - Proctor, R. A1 - Martone, M. A1 - Porteous, D. A1 - Lloyd, S. A1 - Lawrie, S. A1 - Job, D. A1 - Baldock, R. A1 - Philp, A. A1 - Liewald, D. A1 - Rakebrand, F. A1 - Blaikie, A. A1 - McKay, C. A1 - Anderson, S. A1 - Ainsworth, J. A1 - van Hemert, J. A1 - Blanquer, I. A1 - Sinno ED - N. Jacq ED - Y. Legré ED - H. Muller ED - I. Blanquer ED - V. Breton ED - D. Hausser ED - V. Hernández ED - T. Solomonides ED - M. Hofman-Apitius KW - e-Science AB - The paper documents a series of data integration workshops held in 2006 at the UK National e-Science Centre, summarizing a range of the problem/solution scenarios in multi-site and multi-scale data integration with six HealthGrid projects using schizophrenia as a domain-specific test case. It outlines emerging strategies, recommendations and objectives for collaboration on shared ontology-building and harmonization of data for multi-site trials in this domain. JF - Studies in Health Technology and Informatics PB - IOPress VL - 126 SN - 978-1-58603-738-3 ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2007 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Mario Giacobini ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students.
CY - Valencia, Spain ER - TY - Generic T1 - Evolutionary Computation in Combinatorial Optimization, 7th European Conference T2 - Lecture Notes in Computer Science Y1 - 2007 A1 - Cotta, Carlos A1 - van Hemert, Jano ED - Carlos Cotta ED - van Hemert, Jano KW - evolutionary computation AB - Metaheuristics have often been shown to be effective for difficult combinatorial optimization problems appearing in various industrial, economical, and scientific domains. Prominent examples of metaheuristics are evolutionary algorithms, simulated annealing, tabu search, scatter search, memetic algorithms, variable neighborhood search, iterated local search, greedy randomized adaptive search procedures, estimation of distribution algorithms, and ant colony optimization. Successfully solved problems include scheduling, timetabling, network design, transportation and distribution, vehicle routing, the traveling salesman problem, satisfiability, packing and cutting, and general mixed integer programming. EvoCOP began in 2001 and has been held annually since then. It was the first event specifically dedicated to the application of evolutionary computation and related methods to combinatorial optimization problems. Originally held as a workshop, EvoCOP became a conference in 2004. The events gave researchers an excellent opportunity to present their latest research and to discuss current developments and applications as well as providing for improved interaction between members of this scientific community. Following the general trend of hybrid metaheuristics and diminishing boundaries between the different classes of metaheuristics, EvoCOP has broadened its scope over the last years and invited submissions on any kind of metaheuristic for combinatorial optimization. JF - Lecture Notes in Computer Science PB - Springer VL - LNCS 4446 UR - http://springerlink.metapress.com/content/105633/ ER - TY - CONF T1 - Mining spatial gene expression data for association rules T2 - Lecture Notes in Bioinformatics Y1 - 2007 A1 - van Hemert, J. I. A1 - Baldock, R. A. ED - S. Hochreiter ED - R. Wagner KW - biomedical KW - data mining KW - DGEMap KW - e-Science AB - We analyse data from the Edinburgh Mouse Atlas Gene-Expression Database (EMAGE) which is a high-quality data source for spatio-temporal gene expression patterns. Using a novel process whereby generated patterns are used to probe spatially-mapped gene expression domains, we are able to get unbiased results as opposed to using annotations based on predefined anatomy regions. We describe two processes to form association rules based on spatial configurations, one that associates spatial regions, the other associates genes. JF - Lecture Notes in Bioinformatics PB - Springer Verlag UR - http://dx.doi.org/10.1007/978-3-540-71233-6_6 ER - TY - Generic T1 - European Graduate Student Workshop on Evolutionary Computation Y1 - 2006 A1 - Giacobini, Mario A1 - van Hemert, Jano ED - Giacobini, Mario ED - van Hemert, Jano KW - evolutionary computation AB - Evolutionary computation involves the study of problem-solving and optimization techniques inspired by principles of evolution and genetics. As any other scientific field, its success relies on the continuity provided by new researchers joining the field to help it progress. One of the most important sources for new researchers is the next generation of PhD students that are actively studying a topic relevant to this field. It is from this main observation the idea arose of providing a platform exclusively for PhD students.
CY - Budapest, Hungary ER - TY - JOUR T1 - Evolving combinatorial problem instances that are difficult to solve JF - Evolutionary Computation Y1 - 2006 A1 - van Hemert, J. I. KW - constraint programming KW - constraint satisfaction KW - evolutionary computation KW - problem evolving KW - satisfiability KW - travelling salesman AB - In this paper we demonstrate how evolutionary computation can be used to acquire difficult-to-solve combinatorial problem instances, thereby stress-testing the corresponding algorithms used to solve these instances. The technique is applied in three important domains of combinatorial optimisation: binary constraint satisfaction, Boolean satisfiability, and the travelling salesman problem. Problem instances acquired through this technique are more difficult than ones found in popular benchmarks. We analyse these evolved instances with the aim of explaining their difficulty in terms of structural properties, thereby exposing the weaknesses of corresponding algorithms. VL - 14 UR - http://www.mitpressjournals.org/toc/evco/14/4 ER - TY - CONF T1 - Improving Graph Colouring Algorithms and Heuristics Using a Novel Representation T2 - Springer Lecture Notes on Computer Science Y1 - 2006 A1 - Juhos, I. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. Raidl KW - constraint satisfaction KW - graph colouring AB - We introduce a novel representation for the graph colouring problem, called the Integer Merge Model, which aims to reduce the time complexity of an algorithm. Moreover, our model provides useful information for guiding heuristics as well as a compact description for algorithms. To verify the potential of the model, we use it in DSATUR, in an evolutionary algorithm, and in the same evolutionary algorithm extended with heuristics. An empirical investigation is performed to show an increase in efficiency on two problem suites: a set of practical problem instances and a set of hard problem instances from the phase transition. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag ER - TY - JOUR T1 - Increasing the efficiency of graph colouring algorithms with a representation based on vector operations JF - Journal of Software Y1 - 2006 A1 - Juhos, I. A1 - van Hemert, J. I. KW - graph colouring AB - We introduce a novel representation for the graph colouring problem, called the Integer Merge Model, which aims to reduce the time complexity of graph colouring algorithms. Moreover, this model provides useful information to aid in the creation of heuristics that can make the colouring process even faster. It also serves as a compact definition for the description of graph colouring algorithms. To verify the potential of the model, we use it in the complete algorithm DSATUR, and in two versions of an incomplete approximation algorithm: an evolutionary algorithm and the same evolutionary algorithm extended with guiding heuristics. Both theoretical and empirical results are provided to show an increase in the efficiency of solving graph colouring problems. Two problem suites were used for the empirical evidence: a set of practical problem instances and a set of hard problem instances from the phase transition. VL - 1 ER - TY - CONF T1 - Neighborhood Searches for the Bounded Diameter Minimum Spanning Tree Problem Embedded in a VNS, EA, and ACO T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) Y1 - 2006 A1 - Gruber, M. A1 - van Hemert, J. I. A1 - Raidl, G. R.
ED - Maarten Keijzer ED - et al KW - constraint satisfaction KW - evolutionary computation KW - variable neighbourhood search AB - We consider the Bounded Diameter Minimum Spanning Tree problem and describe four neighbourhood searches for it. They are used as local improvement strategies within a variable neighbourhood search (VNS), an evolutionary algorithm (EA) utilising a new encoding of solutions, and an ant colony optimisation (ACO). We compare the performance in terms of effectiveness between these three hybrid methods on a suite of popular benchmark instances, which contains instances too large to solve by current exact methods. Our results show that the EA and the ACO outperform the VNS on almost all used benchmark instances. Furthermore, the ACO yields better solutions than the EA most of the time in long-term runs, whereas the EA dominates when the computation time is strongly restricted. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006) PB - ACM CY - Seattle, USA VL - 2 ER - TY - CONF T1 - Complexity Transitions in Evolutionary Algorithms: Evaluating the impact of the initial population T2 - Proceedings of the Congress on Evolutionary Computation Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. A1 - Parent, J. KW - constraint satisfaction KW - transition models AB - This paper proposes an evolutionary approach for the composition of solutions in an incremental way. The approach is based on the metaphor of transitions in complexity discussed in the context of evolutionary biology. Partially defined solutions interact and evolve into aggregations until a full solution for the problem at hand is found. The impact of the initial population on the outcome and the dynamics of the process is evaluated using the domain of binary constraint satisfaction problems. JF - Proceedings of the Congress on Evolutionary Computation PB - IEEE Press ER - TY - CONF T1 - Evolutionary Transitions as a Metaphor for Evolutionary Optimization T2 - LNAI 3630 Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. ED - M. Capcarrere ED - A. A. Freitas ED - P. J. Bentley ED - C. G. Johnson ED - J. Timmis KW - constraint satisfaction KW - transition models AB - This paper proposes a computational model for solving optimisation problems that mimics the principle of evolutionary transitions in individual complexity. More specifically, it incorporates mechanisms for the emergence of increasingly complex individuals from the interaction of simpler ones. The biological principles for transition are outlined and mapped onto an evolutionary computation context. The class of binary constraint satisfaction problems is used to illustrate the transition mechanism. JF - LNAI 3630 PB - Springer-Verlag SN - 3-540-28848-1 ER - TY - Generic T1 - Genetic Programming, Proceedings of the 8th European Conference T2 - Lecture Notes in Computer Science Y1 - 2005 A1 - Keijzer, M. A1 - Tettamanzi, A. A1 - Collet, P. A1 - van Hemert, J. A1 - Tomassini, M. ED - M. Keijzer ED - A. Tettamanzi ED - P. Collet ED - van Hemert, J. ED - M. Tomassini KW - evolutionary computation JF - Lecture Notes in Computer Science PB - Springer VL - 3447 SN - 3-540-25436-6 UR - http://www.springeronline.com/sgw/cda/frontpage/0,11855,3-40100-22-45347265-0,00.html?changeHeader=true ER - TY - CONF T1 - Heuristic Colour Assignment Strategies for Merge Models in Graph Colouring T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I.
ED - G. Raidl ED - J. Gottlieb KW - constraint satisfaction KW - graph colouring AB - In this paper, we combine a powerful representation for graph colouring problems with different heuristic strategies for colour assignment. Our novel strategies employ heuristics that exploit information about the partial colouring with the aim of improving performance. An evolutionary algorithm is used to drive the search. We compare the different strategies to each other on several very hard benchmarks and on generated problem instances, and show where the novel strategies improve the efficiency. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Property analysis of symmetric travelling salesman problem instances acquired through evolution T2 - Springer Lecture Notes on Computer Science Y1 - 2005 A1 - van Hemert, J. I. ED - G. Raidl ED - J. Gottlieb KW - problem evolving KW - travelling salesman AB - We show how an evolutionary algorithm can successfully be used to evolve a set of difficult-to-solve symmetric travelling salesman problem instances for two variants of the Lin-Kernighan algorithm. Then we analyse the instances in those sets to guide us towards inferring general knowledge about the efficiency of the two variants in relation to structural properties of the symmetric travelling salesman problem. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Transition Models as an incremental approach for problem solving in Evolutionary Algorithms T2 - Proceedings of the Genetic and Evolutionary Computation Conference Y1 - 2005 A1 - Defaweux, A. A1 - Lenaerts, T. A1 - van Hemert, J. I. A1 - Parent, J. ED - H.-G. Beyer ED - et al KW - constraint satisfaction KW - transition models AB - This paper proposes an incremental approach for building solutions using evolutionary computation. It presents a simple evolutionary model called a Transition model. It lets building units of a solution interact and then uses an evolutionary process to merge these units toward a full solution for the problem at hand. The paper provides a preliminary study on the evolutionary dynamics of this model as well as an empirical comparison with other evolutionary techniques on binary constraint satisfaction. JF - Proceedings of the Genetic and Evolutionary Computation Conference PB - ACM Press ER - TY - CONF T1 - Binary Merge Model Representation of the Graph Colouring Problem T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - Juhos, I. A1 - Tóth, A. A1 - van Hemert, J. I. ED - J. Gottlieb ED - G. Raidl KW - constraint satisfaction KW - graph colouring AB - This paper describes a novel representation and ordering model that, aided by an evolutionary algorithm, is used in solving the graph k-colouring problem. Its strength lies in reducing the search space by breaking symmetry. An empirical comparison is made with two other algorithms on a standard suite of problem instances and on a suite of instances in the phase transition where it shows promising results. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - CONF T1 - Dynamic Routing Problems with Fruitful Regions: Models and Evolutionary Computation T2 - LNCS Y1 - 2004 A1 - van Hemert, J. I. A1 - la Poutré, J. A. ED - Xin Yao ED - Edmund Burke ED - Jose A. Lozano ED - Jim Smith ED - Juan J. Merelo-Guervós ED - John A.
Bullinaria ED - Jonathan Rowe ED - Peter Tiňo ED - Ata Kabán ED - Hans-Paul Schwefel KW - dynamic problems KW - evolutionary computation KW - vehicle routing AB - We introduce the concept of fruitful regions in a dynamic routing context: regions that have a high potential of generating loads to be transported. The objective is to maximise the number of loads transported, while keeping to capacity and time constraints. Loads arrive while the problem is being solved, which makes it a real-time routing problem. The solver is a self-adaptive evolutionary algorithm that ensures feasible solutions at all times. We investigate under what conditions the exploration of fruitful regions improves the effectiveness of the evolutionary algorithm. JF - LNCS PB - Springer-Verlag CY - Birmingham, UK VL - 3242 SN - 3-540-23092-0 ER - TY - CONF T1 - Phase transition properties of clustered travelling salesman problem instances generated with evolutionary computation T2 - LNCS Y1 - 2004 A1 - van Hemert, J. I. A1 - Urquhart, N. B. ED - Xin Yao ED - Edmund Burke ED - Jose A. Lozano ED - Jim Smith ED - Juan J. Merelo-Guervós ED - John A. Bullinaria ED - Jonathan Rowe ED - Peter Tiňo ED - Ata Kabán ED - Hans-Paul Schwefel KW - evolutionary computation KW - problem evolving KW - travelling salesman AB - This paper introduces a generator that creates problem instances for the Euclidean symmetric travelling salesman problem. To fit real-world problems, we look at maps consisting of clustered nodes. Uniform random sampling methods do not result in maps where the nodes are spread out to form identifiable clusters. To improve upon this, we propose an evolutionary algorithm that uses the layout of nodes on a map as its genotype. By optimising the spread until a set of constraints is satisfied, we are able to produce better clustered maps, in a more robust way. When varying the number of clusters in these maps and when solving the Euclidean symmetric travelling salesman problem using Chained Lin-Kernighan, we observe a phase transition in the form of an easy-hard-easy pattern. JF - LNCS PB - Springer-Verlag CY - Birmingham, UK VL - 3242 SN - 3-540-23092-0 UR - http://www.vanhemert.co.uk/files/clustered-phase-transition-tsp.tar.gz ER - TY - JOUR T1 - Robust parameter settings for variation operators by measuring the resampling ratio: A study on binary constraint satisfaction problems JF - Journal of Heuristics Y1 - 2004 A1 - van Hemert, J. I. A1 - Bäck, T. KW - constraint satisfaction KW - evolutionary computation KW - resampling ratio AB - In this article, we try to provide insight into the consequences of mutation and crossover rates when solving binary constraint satisfaction problems. This insight is based on a measurement of the space searched by an evolutionary algorithm. From data empirically acquired we describe the relation between the success ratio and the searched space. This is achieved using the resampling ratio, which is a measure of the number of points revisited by a search algorithm. This relation is based on combinations of parameter settings for the variation operators. We then show that the resampling ratio is useful for identifying the quality of parameter settings, and provide a range that corresponds to robust parameter settings. VL - 10 ER - TY - CONF T1 - A Study into Ant Colony Optimization, Evolutionary Computation and Constraint Programming on Binary Constraint Satisfaction Problems T2 - Springer Lecture Notes on Computer Science Y1 - 2004 A1 - van Hemert, J. I. A1 - Solnon, C. ED - J.
Gottlieb ED - G. Raidl KW - ant colony optimisation KW - constraint programming KW - constraint satisfaction KW - evolutionary computation AB - We compare two heuristic approaches, evolutionary computation and ant colony optimisation, and a complete tree-search approach, constraint programming, for solving binary constraint satisfaction problems. We experimentally show that, while evolutionary computation is far from being able to compete with the other two approaches, ant colony optimisation nearly always succeeds in finding a solution, so that it can actually compete with constraint programming. The resampling ratio is used to provide insight into the performance of the heuristic algorithms. Regarding efficiency, we show that while constraint programming is the fastest when instances have a low number of variables, ant colony optimisation becomes faster as the number of variables increases. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-21367-8 ER - TY - JOUR T1 - Comparing Evolutionary Algorithms on Binary Constraint Satisfaction Problems JF - IEEE Transactions on Evolutionary Computation Y1 - 2003 A1 - Craenen, B. G. W. A1 - Eiben, A. E. A1 - van Hemert, J. I. KW - constraint satisfaction AB - Constraint handling is not straightforward in evolutionary algorithms (EA) since the usual search operators, mutation and recombination, are 'blind' to constraints. Nevertheless, the issue is highly relevant, for many challenging problems involve constraints. Over the last decade, numerous EAs for solving constraint satisfaction problems (CSP) have been introduced and studied on various problems. The diversity of approaches and the variety of problems used to study the resulting algorithms prevent a fair and accurate comparison of these algorithms. This paper aligns related work by presenting a concise overview and an extensive performance comparison of all these EAs on a systematically generated test suite of random binary CSPs. The random problem instance generator is based on a theoretical model that fixes deficiencies of models and respective generators that have been formerly used in the Evolutionary Computing (EC) field. VL - 7 UR - http://ieeexplore.ieee.org/xpl/abs_free.jsp?isNumber=27734&prod=JNL&arnumber=1237162&arSt=+424&ared=+444&arAuthor=+Craenen%2C+B.G.W.%3B++Eiben%2C+A.E.%3B++van+Hemert%2C+J.I.&arNumber=1237162&a_id0=1237161&a_id1=1237162&a_id2=1237163&a_id3=1237164&a_id4=12 ER - TY - CONF T1 - Evolving binary constraint satisfaction problem instances that are difficult to solve T2 - Proceedings of the IEEE 2003 Congress on Evolutionary Computation Y1 - 2003 A1 - van Hemert, J. I. KW - constraint satisfaction KW - problem evolving AB - We present a study on the difficulty of solving binary constraint satisfaction problems where an evolutionary algorithm is used to explore the space of problem instances. By directly altering the structure of problem instances and by evaluating the effort it takes to solve them using a complete algorithm, we show that the evolutionary algorithm is able to detect problem instances that are harder to solve than those produced with conventional methods. Results from the search of the evolutionary algorithm confirm conjectures about where the most difficult-to-solve problem instances can be found with respect to the tightness.
JF - Proceedings of the IEEE 2003 Congress on Evolutionary Computation PB - IEEE Press SN - 0-7803-7804-0 ER - TY - CONF T1 - A new permutation model for solving the graph k-coloring problem T2 - Kalmár Workshop on Logic and Computer Science Y1 - 2003 A1 - Juhos, I. A1 - Tóth, A. A1 - Tezuka, M. A1 - Tann, P. A1 - van Hemert, J. I. KW - constraint satisfaction KW - graph colouring AB - This paper describes a novel representation and ordering model that, aided by an evolutionary algorithm, is used in solving the graph k-coloring problem. A comparison is made between the new representation and an improved version of the traditional graph coloring technique DSATUR on an extensive list of graph k-coloring problem instances with different properties. The results show that our model outperforms the improved DSATUR on most of the problem instances. JF - Kalmár Workshop on Logic and Computer Science ER - TY - CONF T1 - Comparing Classical Methods for Solving Binary Constraint Satisfaction Problems with State of the Art Evolutionary Computation T2 - Springer Lecture Notes on Computer Science Y1 - 2002 A1 - van Hemert, J. I. ED - S. Cagnoni ED - J. Gottlieb ED - E. Hart ED - M. Middendorf ED - G. Raidl KW - constraint satisfaction AB - Constraint Satisfaction Problems form a class of problems that are generally computationally difficult and have been addressed with many complete and heuristic algorithms. We present two complete algorithms, as well as two evolutionary algorithms, and compare them on randomly generated instances of binary constraint satisfaction problems. We find that the evolutionary algorithms are less effective than the classical techniques. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER - TY - CONF T1 - Measuring the Searched Space to Guide Efficiency: The Principle and Evidence on Constraint Satisfaction T2 - Springer Lecture Notes on Computer Science Y1 - 2002 A1 - van Hemert, J. I. A1 - Bäck, T. ED - J. J. Merelo ED - A. Panagiotis ED - H.-G. Beyer ED - José-Luis Fernández-Villacañas ED - Hans-Paul Schwefel KW - constraint satisfaction KW - resampling ratio AB - In this paper we present a new tool to measure the efficiency of evolutionary algorithms by storing the whole searched space of a run, a process whereby we gain insight into the number of distinct points in the state space an algorithm has visited as opposed to the number of function evaluations done within the run. This investigation demonstrates a certain inefficiency of the classical mutation operator with mutation-rate 1/l, where l is the dimension of the state space. Furthermore, we present a model for predicting this inefficiency and verify it empirically using the new tool on binary constraint satisfaction problems. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-44139-5 ER - TY - CONF T1 - Use of Evolutionary Algorithms for Telescope Scheduling T2 - Integrated Modeling of Telescopes Y1 - 2002 A1 - Grim, R. A1 - Jansen, M. L. M. A1 - Baan, A. A1 - van Hemert, J. I. A1 - de Wolf, H. ED - Torben Anderson KW - constraint satisfaction KW - scheduling AB - LOFAR, a new radio telescope, will be designed to observe with up to 8 independent beams, thus allowing several simultaneous observations. Scheduling of multiple observations in parallel, each having its own constraints, requires a more intelligent and flexible scheduling function than operated before.
In support of the LOFAR radio telescope project, and in co-operation with Leiden University, Fokker Space has started a study to investigate the suitability of the use of evolutionary algorithms applied to complex scheduling problems. After a positive familiarisation phase, we now examine the potential use of evolutionary algorithms via a demonstration project. Results of the familiarisation phase and the first results of the demonstration project are presented in this paper. JF - Integrated Modeling of Telescopes PB - The International Society for Optical Engineering (SPIE) VL - 4757 ER - TY - CONF T1 - Adaptive Genetic Programming Applied to New and Existing Simple Regression Problems T2 - Springer Lecture Notes on Computer Science Y1 - 2001 A1 - Eggermont, J. A1 - van Hemert, J. I. ED - J. Miller ED - Tomassini, M. ED - P. L. Lanzi ED - C. Ryan ED - A. G. B. Tettamanzi ED - W. B. Langdon KW - data mining AB - In this paper we continue our study on adaptive genetic programming. We use Stepwise Adaptation of Weights to boost performance of a genetic programming algorithm on simple symbolic regression problems. We measure the performance of a standard GP and two variants of SAW extensions on two different symbolic regression problems from the literature. Also, we propose a model for randomly generating polynomials which we then use to further test all three GP variants. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 9-783540-418993 ER - TY - CONF T1 - An Engineering Approach to Evolutionary Art T2 - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) Y1 - 2001 A1 - van Hemert, J. I. A1 - Jansen, M. L. M. ED - Lee Spector ED - Erik D. Goodman ED - Annie Wu ED - W. B. Langdon ED - Hans-Michael Voigt ED - Mitsuo Gen ED - Sandip Sen ED - Marco Dorigo ED - Shahram Pezeshk ED - Max H. Garzon ED - Edmund Burke KW - evolutionary art AB - We present a general system that evolves art on the Internet. The system runs on a server which enables it to collect information about its usage worldwide; its core uses operators and representations from genetic programming. We show two types of art that can be evolved using this general system. JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001) PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CONF T1 - Evolutionary Computation in Constraint Satisfaction and Machine Learning — An abstract of my PhD. T2 - Proceedings of the Brussels Evolutionary Algorithms Day (BEAD-2001) Y1 - 2001 A1 - van Hemert, J. I. ED - Anne Defaweux ED - Bernard Manderick ED - Tom Lenearts ED - Johan Parent ED - Piet van Remortel KW - constraint satisfaction KW - data mining JF - Proceedings of the Brussels Evolutionary Algorithms Day (BEAD-2001) PB - Vrije Universiteit Brussel (VUB) ER - TY - CONF T1 - A "Futurist" approach to dynamic environments T2 - Proceedings of the Workshops at the Genetic and Evolutionary Computation Conference, Dynamic Optimization Problems Y1 - 2001 A1 - van Hemert, J. I. A1 - Van Hoyweghen, C. A1 - Lukschandl, E. A1 - Verbeeck, K. ED - J. Branke ED - Th. Bäck KW - dynamic problems AB - The optimization of dynamic environments has proved to be a difficult area for Evolutionary Algorithms. As standard haploid populations find it difficult to track a moving target, different schemes have been suggested to improve the situation. We study a novel approach by making use of a meta learner which tries to predict the next state of the environment, i.e.
the next value of the goal the individuals have to achieve, by making use of the accumulated knowledge from past performance. JF - Proceedings of the Workshops at the Genetic and Evolutionary Computation Conference, Dynamic Optimization Problems PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CONF T1 - Constraint Satisfaction Problems and Evolutionary Algorithms: A Reality Check T2 - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) Y1 - 2000 A1 - van Hemert, J. I. ED - van den Bosch, A. ED - H. Weigand KW - constraint satisfaction AB - Constraint satisfaction has been the subject of many studies. Different areas of research have tried to solve all kinds of constraint problems. Here we will look at a general model for constraint satisfaction problems in the form of binary constraint satisfaction. The problems generated from this model are studied in the research area of constraint programming and in the research area of evolutionary computation. This paper provides an empirical comparison of two techniques from each area. Basically, this is a check on how well both areas are doing. It turns out that, although evolutionary algorithms are doing well, classic approaches are still more successful. JF - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - JOUR T1 - De Creatieve Computer JF - AIgg Kennisgeving Y1 - 2000 A1 - van Hemert, J. I. KW - evolutionary art AB - Here we show an application that generates images resembling art as it was produced by Mondriaan, a Dutch artist, well known for his minimalistic and pure abstract pieces of art. The current version generates images using a linear chromosome and a recursive function as a decoder. PB - Artificiële Intelligentie gebruikers groep VL - 13 N1 - invited article (in Dutch) ER - TY - CONF T1 - Stepwise Adaptation of Weights for Symbolic Regression with Genetic Programming T2 - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) Y1 - 2000 A1 - Eggermont, J. A1 - van Hemert, J. I. ED - van den Bosch, A. ED - H. Weigand KW - data mining KW - genetic programming AB - In this paper we continue our study on the Stepwise Adaptation of Weights (SAW) technique. Previous studies on constraint satisfaction and data classification have indicated that SAW is a promising technique to boost the performance of evolutionary algorithms. Here we use SAW to boost performance of a genetic programming algorithm on simple symbolic regression problems. We measure the performance of a standard GP and two variants of SAW extensions on two different symbolic regression problems. JF - Proceedings of the Twelfth Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'00) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - Adapting the Fitness Function in GP for Data Mining T2 - Springer Lecture Notes on Computer Science Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - R. Poli ED - P. Nordin ED - W. B. Langdon ED - T. C. Fogarty KW - data mining KW - genetic programming AB - In this paper we describe how the Stepwise Adaptation of Weights (SAW) technique can be applied in genetic programming. The SAW-ing mechanism was originally developed for and successfully used in EAs for constraint satisfaction problems.
Here we identify the very basic underlying ideas behind SAW-ing and point out how it can be used for different types of problems. In particular, SAW-ing is well suited for data mining tasks where the fitness of a candidate solution is composed of 'local scores' on data records. We evaluate the power of the SAW-ing mechanism on a number of benchmark classification data sets. The results indicate that extending the GP with the SAW-ing feature increases its performance when different types of misclassifications are not weighted differently, but leads to worse results when they are. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-65899-8 ER - TY - CONF T1 - Comparing genetic programming variants for data classification T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - E. Postma ED - M. Gyssens KW - classification KW - data mining KW - genetic programming AB - This article is a combined summary of two papers written by the authors. Binary data classification problems (with exactly two disjoint classes) form an important application area of machine learning techniques, in particular genetic programming (GP). In this study we compare a number of different variants of GP applied to such problems whereby we investigate the effect of two significant changes in a fixed GP setup in combination with two different evolutionary models. JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - A comparison of genetic programming variants for data classification T2 - Springer Lecture Notes on Computer Science Y1 - 1999 A1 - Eggermont, J. A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - D. J. Hand ED - J. N. Kok ED - M. R. Berthold KW - classification KW - data mining KW - genetic programming AB - In this paper we report the results of a comparative study on different variations of genetic programming applied on binary data classification problems. The first genetic programming variant is weighting data records for calculating the classification error and modifying the weights during the run. Hereby the algorithm is defining its own fitness function in an on-line fashion giving higher weights to 'hard' records. Another novel feature we study is the atomic representation, where 'Booleanization' of data is not performed at the root, but at the leaves of the trees and only Boolean functions are used in the trees' body. As a third aspect we look at generational and steady-state models in combination with both features. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin SN - 3-540-66332-0 ER - TY - CONF T1 - Mondriaan Art by Evolution T2 - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) Y1 - 1999 A1 - van Hemert, J. I. A1 - Eiben, A. E. ED - E. Postma ED - M. Gyssens KW - evolutionary art AB - Here we show an application that generates images resembling art as it was produced by Mondriaan, a Dutch artist, well known for his minimalistic and pure abstract pieces of art. The current version generates images using a linear chromosome and a recursive function as a decoder.
JF - Proceedings of the Eleventh Belgium/Netherlands Conference on Artificial Intelligence (BNAIC'99) PB - BNVKI, Dutch and the Belgian AI Association ER - TY - CONF T1 - Population dynamics and emerging features in AEGIS T2 - Proceedings of the Genetic and Evolutionary Computation Conference Y1 - 1999 A1 - Eiben, A. E. A1 - Elia, D. A1 - van Hemert, J. I. ED - W. Banzhaf ED - J. Daida ED - Eiben, A. E. ED - M. H. Garzon ED - V. Honavar ED - M. Jakiela ED - R. E. Smith KW - dynamic problems AB - We describe an empirical investigation within an artificial world, AEGIS, where a population of animals and plants is evolving. We compare different system setups in search of an 'ideal' world that allows a constantly high number of inhabitants for a long period of time. We observe that high responsiveness at individual level (speed of movement) or population level (high fertility) is 'ideal'. Furthermore, we investigate the emergence of the so-called mental features of animals determining their social, consumptional and aggressive behaviour. The tests show that being socially oriented is generally advantageous, while aggressive behaviour only emerges under specific circumstances. JF - Proceedings of the Genetic and Evolutionary Computation Conference PB - Morgan Kaufmann Publishers, San Francisco ER - TY - CHAP T1 - SAW-ing EAs: adapting the fitness function for solving constrained problems T2 - New ideas in optimization Y1 - 1999 A1 - Eiben, A. E. A1 - van Hemert, J. I. ED - D. Corne ED - M. Dorigo ED - F. Glover KW - constraint satisfaction AB - In this chapter we describe a problem-independent method for treating constraints in an evolutionary algorithm. Technically, this method amounts to changing the definition of the fitness function during a run of an EA, based on feedback from the search process. Obviously, redefining the fitness function means redefining the problem to be solved. In the short term this deceives the algorithm, making the fitness values deteriorate, but as experiments clearly indicate, in the long run it is beneficial. We illustrate the power of the method on different constraint satisfaction problems and point out other application areas of this technique. JF - New ideas in optimization PB - McGraw-Hill, London ER - TY - CONF T1 - Extended abstract: Solving Binary Constraint Satisfaction Problems using Evolutionary Algorithms with an Adaptive Fitness Function T2 - Proceedings of the Xth Netherlands/Belgium Conference on Artificial Intelligence (NAIC'98) Y1 - 1998 A1 - Eiben, A. E. A1 - van Hemert, J. I. A1 - Marchiori, E. A1 - Steenbeek, A. G. ED - la Poutré, J. A. ED - van den Herik, J. KW - constraint satisfaction JF - Proceedings of the Xth Netherlands/Belgium Conference on Artificial Intelligence (NAIC'98) PB - BNVKI, Dutch and the Belgian AI Association N1 - Abstract of \cite{EHMS98} ER - TY - JOUR T1 - Graph Coloring with Adaptive Evolutionary Algorithms JF - Journal of Heuristics Y1 - 1998 A1 - Eiben, A. E. A1 - van der Hauw, J. K. A1 - van Hemert, J. I. KW - constraint satisfaction KW - graph colouring AB - This paper presents the results of an experimental investigation on solving graph coloring problems with Evolutionary Algorithms (EA). After testing different algorithm variants we conclude that the best option is an asexual EA using order-based representation and an adaptation mechanism that periodically changes the fitness function during the evolution.
This adaptive EA is general, using no domain-specific knowledge, except, of course, for the decoder (fitness function). We compare this adaptive EA to a powerful traditional graph coloring technique, DSatur, and the Grouping GA on a wide range of problem instances with different size, topology and edge density. The results show that the adaptive EA is superior to the Grouping GA and outperforms DSatur on the hardest problem instances. Furthermore, it scales up better with the problem size than the other two algorithms and indicates a linear computational complexity. PB - Kluwer Academic Publishers VL - 4 ER - TY - CONF T1 - Solving Binary Constraint Satisfaction Problems using Evolutionary Algorithms with an Adaptive Fitness Function T2 - Springer Lecture Notes on Computer Science Y1 - 1998 A1 - Eiben, A. E. A1 - van Hemert, J. I. A1 - Marchiori, E. A1 - Steenbeek, A. G. ED - Eiben, A. E. ED - Th. Bäck ED - M. Schoenauer ED - H.-P. Schwefel KW - constraint satisfaction AB - This paper presents a comparative study of Evolutionary Algorithms (EAs) for Constraint Satisfaction Problems (CSPs). We focus on EAs where fitness is based on penalization of constraint violations and the penalties are adapted during the execution. Three different EAs based on this approach are implemented. For highly connected constraint networks, the results provide further empirical support for the theoretical prediction of the phase transition in binary CSPs. JF - Springer Lecture Notes on Computer Science PB - Springer-Verlag, Berlin ER -