Data-intensive research is evidently transforming the computing landscape. We face the challenge of handling the deluge of data generated by the sensors and modern instruments now widely used in all domains. The number of data sources is increasing while, at the same time, the diversity, complexity and scale of these data resources are growing dramatically. To survive the data tsunami, we need to improve our apparatus for exploring and exploiting the growing wealth of data.
We report the rationale for a new architecture, under development in the Advanced Data Mining and Integration Research for Europe (ADMIRE) project, that supports a significant increase in the scale of data integration and data mining. The architecture combines (1) data mining and (2) data access and integration in one framework. We name the combined activity “DMI”. The architecture supports the enactment of DMI processes across heterogeneous and distributed data resources and data mining services. It is intended to make all stages of DMI process development and enactment easier and more economical.
In this talk, I will discuss the ADMIRE architecture, our distributed computation model, the optimisation and instrumentation strategies, and the evaluation experiments on a real-world problem in the life sciences.