Extracting, Transforming and Archiving Scientific Data [r-libre/232]

Lemire, Daniel, & Vellino, Andre (2011). Extracting, Transforming and Archiving Scientific Data. In Proceedings of the Fourth Workshop on Very Large Digital Libraries. DELOS Association for Digital Libraries.

File(s) available for this item:
PDF - 1108.4041v2.pdf
Item Type: Papers in Conference Proceedings
Refereed: Yes
Status: Published
Abstract: It is becoming common to archive research datasets that are not only large but also numerous. In addition, their corresponding metadata and the software required to analyse or display them need to be archived. Yet the manual curation of research data can be difficult and expensive, particularly in very large digital repositories, hence the importance of models and tools for automating digital curation tasks. The automation of these tasks faces three major challenges: (1) research data and data sources are highly heterogeneous, (2) future research needs are difficult to anticipate, (3) data is hard to index. To address these problems, we propose the Extract, Transform and Archive (ETA) model for managing and mechanizing the curation of research data. Specifically, we propose a scalable strategy for addressing the research-data problem, ranging from the extraction of legacy data to its long-term storage. We review some existing solutions and propose novel avenues of research.
Depositor: Lemire, Daniel
Owner / Manager: Daniel Lemire
Deposited: 03 Sep 2014 18:53
Last Modified: 16 Jul 2015 00:46