TY - JOUR T1 - OMERO: flexible, model-driven data management for experimental biology JF - NATURE METHODS Y1 - 2012 A1 - Chris Allan A1 - Jean-Marie Burel A1 - Josh Moore A1 - Colin Blackburn A1 - Melissa Linkert A1 - Scott Loynton A1 - Donald MacDonald A1 - et al. AB - Data-intensive research depends on tools that manage multidimensional, heterogeneous datasets. We built OME Remote Objects (OMERO), a software platform that enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light-microscopy, high-content-screening, electron-microscopy and even non-image-genotype data. OMERO is open-source software, available at http://openmicroscopy.org/. PB - Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved. VL - 9 SN - 1548-7091 UR - http://dx.doi.org/10.1038/nmeth.1896 IS - 3 ER - TY - BOOK T1 - Optimisation of the enactment of fine-grained distributed data-intensive workflows Y1 - 2012 A1 - Chee Sun Liew AB - The emergence of data-intensive science as the fourth science paradigm has posed a data deluge challenge for enacting scientific workflows. The scientific community is facing an imminent flood of data from the next generation of experiments and simulations, besides dealing with the heterogeneity and complexity of data, applications and execution environments. New scientific workflows involve execution on distributed and heterogeneous computing resources across organisational and geographical boundaries, processing gigabytes of live data streams and petabytes of archived and simulation data, in various formats and from multiple sources. 
Managing the enactment of such workflows not only requires larger storage space and faster machines, but the capability to support scalability and diversity of the users, applications, data, computing resources and the enactment technologies. We argue that the enactment process can be made efficient using optimisation techniques in an appropriate architecture. This architecture should support the creation of diversified applications and their enactment on diversified execution environments, with a standard interface, i.e. a workflow language. The workflow language should be both human readable and suitable for communication between the enactment environments. The data-streaming model central to this architecture provides a scalable approach to large-scale data exploitation. Data-flow between computational elements in the scientific workflow is implemented as streams. To cope with the exploratory nature of scientific workflows, the architecture should support fast workflow prototyping, and the re-use of workflows and workflow components. Above all, the enactment process should be easily repeated and automated. In this thesis, we present a candidate data-intensive architecture that includes an intermediate workflow language, named DISPEL. We create a new fine-grained measurement framework to capture performance-related data during enactments, and design a performance database to organise them systematically. We propose a new enactment strategy to demonstrate that optimisation of data-streaming workflows can be automated by exploiting performance data gathered during previous enactments. 
PB - The University of Edinburgh CY - Edinburgh ER - TY - CONF T1 - Optimum Platform Selection and Configuration for Computational Jobs T2 - All Hands Meeting 2011 Y1 - 2011 A1 - Gary McGilvary A1 - Malcolm Atkinson A1 - Barker, Adam A1 - Ashley Lloyd AB - The performance of many scientific applications which execute on a variety of High Performance Computing (HPC), local cluster environments and cloud services could be enhanced, and their costs reduced, if the platform was carefully selected on a per-application basis and the application itself was optimally configured for a given platform. With a wide variety of computing platforms on offer, each possessing different properties, all too frequently platform decisions are made on an ad-hoc basis with limited ‘black-box’ information. The limitless number of possible application configurations also makes it difficult for an individual who wants to achieve cost-effective results with the maximum performance available. Such individuals may include biomedical researchers analysing microarray data, software developers running aviation simulations or bankers performing risk assessments. However, in each case, it is likely that many do not have the required knowledge to select the optimum platform and setup for their application; doing so would require extensive knowledge of their applications and of the various platforms. In this paper we describe a framework that aims to resolve such issues by (i) reducing the detail required in the decision making process by placing this information within a selection framework, thereby (ii) maximising an application’s performance gain and/or reducing costs. We present a set of preliminary results where we compare the performance of running the Simple Parallel R INTerface (SPRINT) over a variety of platforms. SPRINT is a framework providing parallel functions of the statistical package R, allowing post-genomic data to be easily analysed on HPC resources [1]. 
We run SPRINT on Amazon’s Elastic Compute Cloud (EC2) to compare the performance with the results obtained from HECToR, the UK’s National Supercomputing Service, and the Edinburgh Compute and Data Facilities (ECDF) cluster. JF - All Hands Meeting 2011 CY - York ER - TY - JOUR T1 - An open source toolkit for medical imaging de-identification JF - European Radiology Y1 - 2010 A1 - Rodríguez, David A1 - Carpenter, Trevor K. A1 - van Hemert, Jano I. A1 - Wardlaw, Joanna M. KW - Anonymisation KW - Data Protection Act (DPA) KW - De-identification KW - Digital Imaging and Communications in Medicine (DICOM) KW - Privacy policies KW - Pseudonymisation KW - Toolkit AB - Objective: Medical imaging acquired for clinical purposes can have several legitimate secondary uses in research projects and teaching libraries. No commonly accepted solution for anonymising these images exists because the amount of personal data that should be preserved varies case by case. Our objective is to provide a flexible mechanism for anonymising Digital Imaging and Communications in Medicine (DICOM) data that meets the requirements for deployment in multicentre trials. Methods: We reviewed our current de-identification practices and defined the relevant use cases to extract the requirements for the de-identification process. We then used these requirements in the design and implementation of the toolkit. Finally, we tested the toolkit taking as a reference those requirements, including a multicentre deployment. Results: The toolkit successfully anonymised DICOM data from various sources. Furthermore, it was shown that it could forward anonymous data to remote destinations, remove burned-in annotations, and add tracking information to the header. The toolkit also implements the DICOM standard confidentiality mechanism. Conclusion: A DICOM de-identification toolkit that facilitates the enforcement of privacy policies was developed. 
It is highly extensible, provides the necessary flexibility to account for different de-identification requirements and has a low adoption barrier for new users. VL - 20 UR - http://www.springerlink.com/content/j20844338623m167/ IS - 8 ER - TY - JOUR T1 - An Open Grid Services Architecture Primer JF - Computer Y1 - 2009 A1 - Grimshaw, Andrew A1 - Morgan, Mark A1 - Merrill, Duane A1 - Kishimoto, Hiro A1 - Savva, Andreas A1 - Snelling, David A1 - Smith, Chris A1 - Dave Berry PB - IEEE Computer Society Press CY - Los Alamitos, CA, USA VL - 42 ER - TY - CONF T1 - OGSA-DAI: Middleware for Data Integration: Selected Applications T2 - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience Y1 - 2008 A1 - Grant, Alistair A1 - Antonioletti, Mario A1 - Hume, Alastair C. A1 - Krause, Amy A1 - Dobrzelecki, Bartosz A1 - Jackson, Michael J. A1 - Parsons, Mark A1 - Atkinson, Malcolm P. A1 - Theocharopoulos, Elias JF - ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience PB - IEEE Computer Society CY - Washington, DC, USA SN - 978-0-7695-3535-7 ER - TY - CONF T1 - Orchestrating Data-Centric Workflows T2 - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) Y1 - 2008 A1 - Barker, Adam A1 - Weissman, Jon B. A1 - van Hemert, Jano KW - grid computing KW - workflow JF - The 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid) PB - IEEE Computer Society ER - TY - JOUR T1 - OBO Explorer: An Editor for Open Biomedical Ontologies in OWL JF - Bioinformatics Y1 - 2007 A1 - Stuart Aitken A1 - Yin Chen A1 - Jonathan Bard AB - To clarify the semantics, and take advantage of tools and algorithms developed for the Semantic Web, a mapping from the Open Biomedical Ontologies (OBO) format to the Web Ontology Language (OWL) has been established. We present an ontology editor that allows end users to work directly with this OWL representation of OBO format ontologies. 
PB - Oxford Journals UR - http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btm593? ER - TY - CONF T1 - OGSA-DAI 3.0 - The What's and Whys T2 - UK e-Science All Hands Meeting Y1 - 2007 A1 - Antonioletti, M. A1 - Hong, N. P. Chue A1 - Hume, A. C. A1 - Jackson, M. A1 - Karasavvas, K. A1 - Krause, A. A1 - Schopf, J. M. A1 - Atkinson, M. P. A1 - Dobrzelecki, B. A1 - Illingworth, M. A1 - McDonnell, N. A1 - Parsons, M. A1 - Theocharopoulous, E. JF - UK e-Science All Hands Meeting ER - TY - CONF T1 - Optimization and evaluation of parallel I/O in BIPS3D parallel irregular application T2 - IPDPS Y1 - 2007 A1 - Rosa Filgueira A1 - David E. Singh A1 - Florin Isaila A1 - Jesús Carretero A1 - Antonio Garcia Loureiro JF - IPDPS ER - TY - CONF T1 - OGSA-DAI Status and Benchmarks T2 - All Hands Meeting 2005 Y1 - 2005 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Andrew Borle A1 - Hong, Neil P. Chue A1 - Patrick Dantressangle A1 - Hume, Alastair C. A1 - Mike Jackson A1 - Krause, Amy A1 - Laws, Simon A1 - Parsons, Mark A1 - Paton, Norman W. A1 - Jennifer M. Schopf A1 - Tom Sugden A1 - Watson, Paul AB - This paper presents a status report on some of the highlights that have taken place within the OGSA-DAI project since the last AHM. A description of Release 6.0 functionality and details of the forthcoming release, due in September 2005, are given. Future directions for this project are discussed. This paper also describes initial results of work being done to systematically benchmark recent OGSA-DAI releases. The OGSA-DAI software distribution, and more information about the project, is available from the project website at www.ogsadai.org.uk. JF - All Hands Meeting 2005 CY - Nottingham, UK ER - TY - CONF T1 - Organization of the International Testbed of the CrossGrid Project T2 - Cracow Grid Workshop 2005 Y1 - 2005 A1 - Gomes, J. A1 - David, M. A1 - Martins, J. A1 - Bernardo, L. A1 - Garcia, A. A1 - Hardt, M. A1 - Kornmayer, H. 
A1 - Marco, Rafael A1 - Rodríguez, David A1 - Diaz, Irma A1 - Cano, Daniel A1 - Salt, J. A1 - Gonzalez, S. A1 - Sanchez, J. A1 - Fassi, F. A1 - Lara, V. A1 - Nyczyk, P. A1 - Lason, P. A1 - Ozieblo, A. A1 - Wolniewicz, P. A1 - Bluj, M. JF - Cracow Grid Workshop 2005 ER - TY - CONF T1 - OGSA-DAI Status Report and Future Directions T2 - All Hands Meeting 2004 Y1 - 2004 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Borley, Andrew A1 - Hong, Neil P. Chue A1 - Collins, Brian A1 - Jonathan Davies A1 - Desmond Fitzgerald A1 - Hardman, Neil A1 - Hume, Alastair C. A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Paton, Norman W. A1 - Tom Sugden A1 - Watson, Paul A1 - Mar AB - Data Access and Integration (DAI) of data resources, such as relational and XML databases, within a Grid context. Project members also participate in the development of DAI standards through the GGF DAIS WG. The standards that emerge through this effort will be adopted by OGSA-DAI once they have stabilised. The OGSA-DAI developers are also engaging with a growing user community to gather their data and functionality requirements. Several large projects are already using OGSA-DAI to provide their DAI capabilities. This paper presents a status report on OGSA-DAI activities since the last AHM and announces future directions. The OGSA-DAI software distribution and more information about the project is available from the project website at http://www.ogsadai.org.uk/. JF - All Hands Meeting 2004 CY - Nottingham, UK ER - TY - CONF T1 - OGSA-DAI: Two Years On T2 - GGF10 Y1 - 2004 A1 - Antonioletti, Mario A1 - Malcolm Atkinson A1 - Rob Baxter A1 - Borley, Andrew A1 - Neil Chue Hong A1 - Collins, Brian A1 - Jonathan Davies A1 - Hardman, Neil A1 - George Hicken A1 - Ally Hume A1 - Mike Jackson A1 - Krause, Amrey A1 - Laws, Simon A1 - Magowan, James A1 - Jeremy Nowell A1 - Paton, Norman W. 
A1 - Dave Pearson A1 - To AB - The OGSA-DAI project has been producing Grid-enabled middleware for almost two years now, providing data access and integration capabilities to data resources, such as databases, within an OGSA context. In these two years, OGSA-DAI has been tracking rapidly evolving standards, managing changes in software dependencies, contributing to the standardisation process and liaising with a growing user community together with their associated data requirements. This process has imparted important lessons and raised a number of issues that need to be addressed if a middleware product is to be widely adopted. This paper examines the experiences of OGSA-DAI in implementing proposed standards, the likely impact that the still-evolving standards landscape will have on future implementations and how these affect uptake of the software. The paper also examines the gathering of requirements from and engagement with the Grid community, the difficulties of defining a process for the management and publishing of metadata, and whether relevant standards can be implemented in an efficient manner. The OGSA-DAI software distribution and more details about the project are available from the project Web site at http://www.ogsadai.org.uk/. JF - GGF10 CY - Berlin, Germany ER -