The aim of the project is to explore how to deal with I/O-bound processing by implementing technology-specific components in a provided system. The goal is to distribute data and processing so that each CPU processes data stored locally, minimising data transfer. The assumption is that I/O is the major bottleneck in processing, so computation could be done with less powerful (greener and cheaper) CPUs rather than with a powerful CPU that wastes energy waiting for data.

Different technologies for storing and processing the data can be explored. More than one student can work on this challenge, each exploring a different technology. In particular:

1. Distributed image storage and processing in array-based databases: exploiting Rasdaman, SciDB or MonetDB.
2. Distributed image storage and processing with Hadoop and Sector/Sphere, implementations of MapReduce.

The influence of the storage medium (HDD and SSD) on performance should be analysed as well.

The students will use a new cluster of over 120 nodes, designed for I/O-bound problems: each node has a lightweight Atom CPU and 6 TB of storage, distributed between HDDs and SSDs.
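The data-locality idea described above can be illustrated with a toy sketch in the map/reduce style. It is not tied to Hadoop, Sector/Sphere or any of the array databases. The cluster layout, the tile contents and all function names (`make_cluster`, `map_tile`, `process_locally`, `process_centrally`) are hypothetical, chosen only to compare how many values cross the network when tiles are processed where they are stored versus shipped to one central node.

```python
# Toy model of data-local versus centralised processing of image tiles.
# Everything here is an illustrative assumption, not part of the provided system.

def make_cluster(num_nodes, tiles_per_node, tile_size):
    """Build a hypothetical cluster: node id -> list of tiles (lists of pixel values)."""
    return {
        node: [
            [(node + t + i) % 256 for i in range(tile_size)]
            for t in range(tiles_per_node)
        ]
        for node in range(num_nodes)
    }

def map_tile(tile):
    """Map step: reduce one tile to a small partial result (sum, pixel count)."""
    return (sum(tile), len(tile))

def reduce_partials(partials):
    """Reduce step: combine partial results into the global mean pixel value."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

def process_locally(cluster):
    """Each node maps its own tiles; only one (sum, count) pair per node is sent."""
    partials = []
    values_transferred = 0
    for tiles in cluster.values():
        node_partials = [map_tile(tile) for tile in tiles]
        node_sum = sum(s for s, _ in node_partials)
        node_count = sum(c for _, c in node_partials)
        partials.append((node_sum, node_count))
        values_transferred += 2  # the pair (node_sum, node_count)
    return reduce_partials(partials), values_transferred

def process_centrally(cluster):
    """All raw tiles are shipped to a single node before any computation."""
    partials = []
    values_transferred = 0
    for tiles in cluster.values():
        for tile in tiles:
            values_transferred += len(tile)  # every pixel crosses the network
            partials.append(map_tile(tile))
    return reduce_partials(partials), values_transferred

if __name__ == "__main__":
    cluster = make_cluster(num_nodes=4, tiles_per_node=3, tile_size=1000)
    local_mean, local_moved = process_locally(cluster)
    central_mean, central_moved = process_centrally(cluster)
    print("local:", local_mean, "values moved:", local_moved)
    print("central:", central_mean, "values moved:", central_moved)
```

Both strategies compute the same mean, but in this model the local variant moves two values per node while the central variant moves every pixel. That difference in transferred volume is what the project proposes to exploit with low-power CPUs placed next to the storage.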