Scientific laboratories produce large amounts of data, often stored as files in hierarchical folders. File systems do not scale well to large numbers of files; in particular, retrieving data becomes difficult whenever the query criteria do not match the criteria used to organise the storage hierarchy.
Different solutions have been proposed. The simplest approach is to keep relying on the file system, while storing file paths and metadata in a standard DBMS so that files can be located by metadata queries. HDF5, instead, packs datasets and their metadata together in large container files and provides a specialised API to access them.
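As an illustration of the first approach, the following minimal sketch uses Python's standard sqlite3 module to catalogue data files together with queryable metadata; the table layout, column names, and file paths are hypothetical choices of ours, not taken from any existing system.

    import sqlite3

    # Hypothetical catalogue: file paths plus queryable metadata in a DBMS.
    conn = sqlite3.connect("catalogue.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS datafiles (
            path     TEXT PRIMARY KEY,   -- location on the file system
            station  TEXT,               -- recording station code
            channel  TEXT,               -- sensor channel
            day      TEXT                -- acquisition date (ISO 8601)
        )""")
    conn.execute(
        "INSERT OR REPLACE INTO datafiles VALUES (?, ?, ?, ?)",
        ("/archive/2011/NET/STA/BHZ/day.074.mseed", "STA", "BHZ", "2011-03-15"),
    )
    conn.commit()

    # Query by metadata rather than by folder structure.
    for (path,) in conn.execute(
            "SELECT path FROM datafiles WHERE station = ? AND day = ?",
            ("STA", "2011-03-15")):
        print(path)

The point of the sketch is that the retrieval criterion (station and day) is independent of how the files happen to be laid out on disk.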
Approaches like Hadoop store the data in dedicated distributed file systems and are well suited to batch MapReduce processing, but perform poorly when random access is needed.
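For instance, a MapReduce job over seismic records can be written as two stream processors for Hadoop Streaming; the sketch below assumes an input format of our own devising (one tab-separated station code and sample value per line) and computes the peak amplitude per station. The batch, scan-oriented nature of this model is precisely what penalises access to individual records.

    #!/usr/bin/env python3
    # Hadoop Streaming sketch: peak amplitude per station.
    # Assumed input format (our assumption): "station<TAB>sample" per line.
    import sys

    def mapper():
        # Emit (station, sample) pairs, one per input line.
        for line in sys.stdin:
            station, sample = line.rstrip("\n").split("\t")
            print(f"{station}\t{sample}")

    def reducer():
        # Streaming sorts by key; keep the running maximum per station.
        current, peak = None, float("-inf")
        for line in sys.stdin:
            station, sample = line.rstrip("\n").split("\t")
            if station != current:
                if current is not None:
                    print(f"{current}\t{peak}")
                current, peak = station, float("-inf")
            peak = max(peak, abs(float(sample)))
        if current is not None:
            print(f"{current}\t{peak}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()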
We are interested in alternative storage systems that provide both easy access to the data according to different criteria and local processing capabilities (as Hadoop does), while maintaining acceptable random-access performance.
Solutions like MonetDB/SciQL, Rasdaman, SciDB, Hadoop+HBase, and Sector/Sphere should be explored, using seismological data as a test bed.
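For Hadoop+HBase in particular, random access by key could look like the following sketch, written with the third-party happybase client; the gateway host, table name, column family, and row-key scheme are hypothetical assumptions of ours.

    import happybase  # third-party Thrift client for HBase

    # Connect through the HBase Thrift gateway (host/port are assumptions).
    connection = happybase.Connection("hbase-gateway", port=9090)
    table = connection.table("waveforms")  # hypothetical table name

    # Row key designed around the query criterion: station plus day.
    key = b"STA:2011-03-15"
    table.put(key, {b"d:samples": b"...binary waveform block..."})

    # Direct random access by key, instead of a batch MapReduce scan.
    row = table.row(key)
    print(row[b"d:samples"])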