TY - CONF T1 - C2MS: Dynamic Monitoring and Management of Cloud Infrastructures T2 - IEEE CloudCom Y1 - 2013 A1 - Gary McGilvary A1 - Josep Rius A1 - Íñigo Goiri A1 - Francesc Solsona A1 - Barker, Adam A1 - Atkinson, Malcolm P. AB - Server clustering is a common design principle employed by many organisations that require high availability, scalability and easier management of their infrastructure. Servers are typically clustered according to the service they provide, for example by the application(s) installed, the role of the server or server accessibility. In order to optimize performance, manage load and maintain availability, servers may migrate from one cluster group to another, making it difficult for server monitoring tools to continuously monitor these dynamically changing groups. Server monitoring tools are usually statically configured, and any change of group membership requires manual reconfiguration, an unreasonable task to undertake on large-scale cloud infrastructures. In this paper we present the Cloudlet Control and Management System (C2MS), a system for monitoring and controlling dynamic groups of physical or virtual servers within cloud infrastructures. The C2MS extends Ganglia - an open source scalable system performance monitoring tool - by allowing system administrators to define, monitor and modify server groups without the need for server reconfiguration. In turn, administrators can easily monitor group and individual server metrics on large-scale dynamic cloud infrastructures where roles of servers may change frequently. Furthermore, we complement group monitoring with a control element allowing administrator-specified actions to be performed over servers within service groups, and we introduce further customized monitoring metrics. This paper outlines the design, implementation and evaluation of the C2MS.
JF - IEEE CloudCom CY - Bristol, UK ER - TY - BOOK T1 - The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business T2 - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) Y1 - 2013 A1 - Atkinson, Malcolm P. A1 - Baxter, Robert M. A1 - Peter Brezany A1 - Oscar Corcho A1 - Michelle Galea A1 - Parsons, Mark A1 - Snelling, David A1 - van Hemert, Jano KW - Big Data KW - Data Intensive KW - data mining KW - Data Streaming KW - Databases KW - Dispel KW - Distributed Computing KW - Knowledge Discovery KW - Workflows AB - With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections. Emphasising data-intensive thinking and interdisciplinary collaboration, The DATA Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. 
Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book: * Outlines the concepts and rationale for implementing data-intensive computing in organisations * Covers from the ground up problem-solving strategies for data analysis in a data-rich world * Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL * Features in-depth case studies in customer relations, environmental hazards, seismology, and more * Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering * Includes sample program snippets throughout the text as well as additional materials on a companion website The DATA Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing. JF - Wiley Series on Parallel and Distributed Computing (Editor: Albert Y. Zomaya) PB - John Wiley & Sons Inc. SN - 978-1-118-39864-7 ER - TY - JOUR T1 - Lesion Area Detection Using Source Image Correlation Coefficient for CT Perfusion Imaging JF - IEEE Journal of Biomedical and Health Informatics Y1 - 2013 A1 - Fan Zhu A1 - Rodríguez, David A1 - Carpenter, Trevor K. A1 - Atkinson, Malcolm P. A1 - Wardlaw, Joanna M. KW - CT, Pattern Recognition, Perfusion Source Images, Segmentation AB - Computer tomography (CT) perfusion imaging is widely used to calculate brain hemodynamic quantities such as Cerebral Blood Flow (CBF), Cerebral Blood Volume (CBV) and Mean Transit Time (MTT) that aid the diagnosis of acute stroke. Since perfusion source images contain more information than hemodynamic maps, good utilisation of the source images can lead to better understanding than the hemodynamic maps alone.
Correlation-coefficient tests are used in our approach to measure the similarity between healthy tissue time-concentration curves and unknown curves. This information is then used to differentiate penumbra and dead tissues from healthy tissues. The goal of the segmentation is to fully utilize information in the perfusion source images. Our method directly identifies suspected abnormal areas from perfusion source images and then delivers a suggested segmentation of healthy, penumbra and dead tissue. This approach is designed to handle CT perfusion images, but it can also be used to detect lesion areas in MR perfusion images. VL - 17 IS - 5 ER - TY - JOUR T1 - Data-Intensive Architecture for Scientific Knowledge Discovery JF - Distributed and Parallel Databases Y1 - 2012 A1 - Atkinson, Malcolm P. A1 - Chee Sun Liew A1 - Michelle Galea A1 - Paul Martin A1 - Krause, Amrey A1 - Adrian Mouat A1 - Oscar Corcho A1 - Snelling, David KW - Knowledge discovery, workflow management system AB - This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, imaging processing and seismology. 
VL - 30 UR - http://dx.doi.org/10.1007/s10619-012-7105-3 IS - 5 ER - TY - JOUR T1 - Performance database: capturing data for optimizing distributed streaming workflows JF - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences Y1 - 2011 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. A1 - Radoslaw Ostrowski A1 - Murray Cole A1 - van Hemert, Jano I. A1 - Liangxiu Han KW - measurement framework KW - performance data KW - streaming workflows AB - The performance database (PDB) stores performance-related data gathered during workflow enactment. We argue that by carefully understanding and manipulating this data, we can improve efficiency when enacting workflows. This paper describes the rationale behind the PDB, and proposes a systematic way to implement it. The prototype is built as part of the Advanced Data Mining and Integration Research for Europe project. We use workflows from real-world experiments to demonstrate the usage of PDB. VL - 369 IS - 1949 ER - TY - CONF T1 - Towards Optimising Distributed Data Streaming Graphs using Parallel Streams T2 - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing Y1 - 2010 A1 - Chee Sun Liew A1 - Atkinson, Malcolm P. A1 - van Hemert, Jano A1 - Liangxiu Han KW - Data-intensive Computing KW - Distributed Computing KW - Optimisation KW - Parallel Stream KW - Scientific Workflows AB - Modern scientific collaborations have opened up the opportunity of solving complex problems that involve multi-disciplinary expertise and large-scale computational experiments. These experiments usually involve large amounts of data that are located in distributed data repositories running various software systems, and managed by different organisations. A common strategy to make the experiments more manageable is executing the processing steps as a workflow.
In this paper, we look into the implementation of fine-grained data-flow between computational elements in a scientific workflow as streams. We model the distributed computation as a directed acyclic graph where the nodes represent the processing elements that incrementally implement specific subtasks. The processing elements are connected in a pipelined streaming manner, which allows task executions to overlap. We further optimise the execution by splitting pipelines across processes and by introducing extra parallel streams. We identify performance metrics and design a measurement tool to evaluate each enactment. We conducted experiments to evaluate our optimisation strategies with a real world problem in the Life Sciences (EURExpress-II). The paper presents our distributed data-handling model, the optimisation and instrumentation strategies and the evaluation experiments. We demonstrate linear speed up and argue that this use of data-streaming to enable both overlapped pipeline and parallelised enactment is a generally applicable optimisation strategy. JF - Data Intensive Distributed Computing (DIDC'10), in conjunction with the 19th International Symposium on High Performance Distributed Computing PB - ACM CY - Chicago, Illinois UR - http://www.cct.lsu.edu/~kosar/didc10/index.php ER - TY - CONF T1 - Automating Gene Expression Annotation for Mouse Embryo T2 - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) Y1 - 2009 A1 - Liangxiu Han A1 - van Hemert, Jano A1 - Richard Baldock A1 - Atkinson, Malcolm P.
ED - Ronghuai Huang ED - Qiang Yang ED - Jian Pei ED - et al JF - Lecture Notes in Computer Science (Advanced Data Mining and Applications, 5th International Conference) PB - Springer VL - LNAI 5678 ER - TY - CONF T1 - A Distributed Architecture for Data Mining and Integration T2 - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing Y1 - 2009 A1 - Atkinson, Malcolm P. A1 - van Hemert, Jano A1 - Liangxiu Han A1 - Ally Hume A1 - Chee Sun Liew AB - This paper presents the rationale for a new architecture to support a significant increase in the scale of data integration and data mining. It proposes the composition into one framework of (1) data mining and (2) data access and integration. We name the combined activity “DMI”. It supports enactment of DMI processes across heterogeneous and distributed data resources and data mining services. It posits that a useful division can be made between the facilities established to support the definition of DMI processes and the computational infrastructure provided to enact DMI processes. Communication between those two divisions is restricted to requests submitted to gateway services in a canonical DMI language. Larger-scale processes are enabled by incremental refinement of DMI-process definitions, often by recomposition of lower-level definitions. Autonomous types and descriptions will support detection of inconsistencies and semi-automatic insertion of adaptations. These architectural ideas are being evaluated in a feasibility study that involves an application scenario and representatives of the community. JF - Data-Aware Distributed Computing (DADC'09), in conjunction with the 18th International Symposium on High Performance Distributed Computing PB - ACM ER - TY - RPRT T1 - An e-Infrastructure for Collaborative Research in Human Embryo Development Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A.
A1 - Atkinson, Malcolm P. AB - Within the context of the EU Design Study Developmental Gene Expression Map, we identify a set of challenges when facilitating collaborative research on early human embryo development. These challenges bring forth requirements, for which we have identified solutions and technology. We summarise our solutions and demonstrate how they integrate to form an e-infrastructure to support collaborative research in this area of developmental biology. UR - http://arxiv.org/pdf/0901.2310v1 ER - TY - CONF T1 - An E-infrastructure to Support Collaborative Embryo Research T2 - Cluster Computing and the Grid Y1 - 2009 A1 - Barker, Adam A1 - van Hemert, Jano I. A1 - Baldock, Richard A. A1 - Atkinson, Malcolm P. JF - Cluster Computing and the Grid PB - IEEE Computer Society SN - 978-0-7695-3622-4 ER - TY - CONF T1 - OGSA-DAI: Middleware for Data Integration: Selected Applications T2 - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience Y1 - 2008 A1 - Grant, Alistair A1 - Antonioletti, Mario A1 - Hume, Alastair C. A1 - Krause, Amy A1 - Dobrzelecki, Bartosz A1 - Jackson, Michael J. A1 - Parsons, Mark A1 - Atkinson, Malcolm P. A1 - Theocharopoulos, Elias JF - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience PB - IEEE Computer Society CY - Washington, DC, USA SN - 978-0-7695-3535-7 ER - TY - CONF T1 - EGEE: building a pan-European grid training organisation T2 - ACSW Frontiers Y1 - 2006 A1 - Berlich, Rüdiger A1 - Hardt, Marcus A1 - Kunze, Marcel A1 - Atkinson, Malcolm P. A1 - Fergusson, David JF - ACSW Frontiers ER - TY - JOUR T1 - The design and implementation of Grid database services in OGSA-DAI JF - Concurrency - Practice and Experience Y1 - 2005 A1 - Antonioletti, Mario A1 - Atkinson, Malcolm P. A1 - Baxter, Robert M. A1 - Borley, Andrew A1 - Hong, Neil P. Chue A1 - Collins, Brian A1 - Hardman, Neil A1 - Hume, Alastair C.
A1 - Knox, Alan A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Magowan, James A1 - Pato VL - 17 ER - TY - JOUR T1 - Web Service Grids: an evolutionary approach JF - Concurrency - Practice and Experience Y1 - 2005 A1 - Atkinson, Malcolm P. A1 - Roure, David De A1 - Dunlop, Alistair N. A1 - Fox, Geoffrey A1 - Henderson, Peter A1 - Hey, Anthony J. G. A1 - Paton, Norman W. A1 - Newhouse, Steven A1 - Parastatidis, Savas A1 - Trefethen, Anne E. A1 - Watson, Paul A1 - Webber, Jim VL - 17 ER - TY - CONF T1 - Grid-Based Metadata Services T2 - SSDBM Y1 - 2004 A1 - Deelman, Ewa A1 - Singh, Gurmeet Singh A1 - Atkinson, Malcolm P. A1 - Chervenak, Ann L. A1 - Hong, Neil P. Chue A1 - Kesselman, Carl A1 - Patil, Sonal A1 - Pearlman, Laura A1 - Su, Mei-Hui JF - SSDBM ER - TY - CONF T1 - Databases and the Grid: Who Challenges Whom? T2 - BNCOD Y1 - 2003 A1 - Atkinson, Malcolm P. JF - BNCOD ER - TY - JOUR T1 - The pervasiveness of evolution in GRUMPS software JF - Softw., Pract. Exper. Y1 - 2003 A1 - Evans, Huw A1 - Atkinson, Malcolm P. A1 - Brown, Margaret A1 - Cargill, Julie A1 - Crease, Murray A1 - Draper, Steve A1 - Gray, Philip D. A1 - Thomas, Richard VL - 33 ER - TY - JOUR T1 - Database indexing for large DNA and protein sequence collections JF - VLDB J. Y1 - 2002 A1 - Hunt, Ela A1 - Atkinson, Malcolm P. A1 - Irving, Robert W. VL - 11 ER - TY - CONF T1 - A Database Index to Large Biological Sequences T2 - VLDB Y1 - 2001 A1 - Hunt, Ela A1 - Atkinson, Malcolm P. A1 - Irving, Robert W. JF - VLDB ER - TY - JOUR T1 - An efficient object promotion algorithm for persistent object systems JF - Softw., Pract. Exper. Y1 - 2001 A1 - Printezis, Tony A1 - Atkinson, Malcolm P. VL - 31 ER - TY - JOUR T1 - Guest editorial JF - VLDB J. Y1 - 2000 A1 - Atkinson, Malcolm P. VL - 9 ER - TY - CONF T1 - Persistence and Java - A Balancing Act T2 - Objects and Databases Y1 - 2000 A1 - Atkinson, Malcolm P. 
JF - Objects and Databases ER - TY - CONF T1 - Scalable and Recoverable Implementation of Object Evolution for the PJama1 Platform T2 - POS Y1 - 2000 A1 - Atkinson, Malcolm P. A1 - Dmitriev, Misha A1 - Hamilton, Craig A1 - Printezis, Tony JF - POS ER - TY - CONF T1 - Defining and Handling Transient Fields in PJama T2 - DBPL Y1 - 1999 A1 - Printezis, Tony A1 - Atkinson, Malcolm P. A1 - Jordan, Mick J. JF - DBPL ER - TY - CONF T1 - Evolutionary Data Conversion in the PJama Persistent Language T2 - ECOOP Workshop on Object-Oriented Databases Y1 - 1999 A1 - Dmitriev, Misha A1 - Atkinson, Malcolm P. JF - ECOOP Workshop on Object-Oriented Databases ER - TY - CONF T1 - Evolutionary Data Conversion in the PJama Persistent Language T2 - ECOOP Workshops Y1 - 1999 A1 - Dmitriev, Misha A1 - Atkinson, Malcolm P. JF - ECOOP Workshops ER - TY - CONF T1 - Issues Raised by Three Years of Developing PJama: An Orthogonally Persistent Platform for Java T2 - ICDT Y1 - 1999 A1 - Atkinson, Malcolm P. A1 - Jordan, Mick J. JF - ICDT ER - TY - Generic T1 - VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK Y1 - 1999 A1 - Atkinson, Malcolm P. A1 - Maria E. Orlowska A1 - Patrick Valduriez A1 - Stanley B. Zdonik A1 - Michael L. Brodie ED - Atkinson, Malcolm P. ED - Maria E. Orlowska ED - Patrick Valduriez ED - Stanley B. Zdonik ED - Michael L. Brodie PB - Morgan Kaufmann SN - 1-55860-615-7 ER -