Towards migration-free “just-in-case” data archival for future cloud data lakes using synthetic DNA

Marinelli, Eugenio; Yan, Yiqing; Magnone, Virginie; Dumargne, Marie-Charlotte; Barbry, Pascal; Heinis, Thomas; Appuswamy, Raja
VLDB 2023, 49th International Conference on Very Large Data Bases, 28 August-1 September 2023, Vancouver, Canada / Vol. 16, No. 8

Given the growing adoption of AI, cloud data lakes are facing the need to support cost-effective “just-in-case” data archival over long time periods to meet regulatory compliance requirements. Unfortunately, current media technologies suffer from fundamental issues that will soon, if not already, make cost-effective data archival infeasible. In this paper, we present a vision for redesigning the archival tier of cloud data lakes based on a novel, obsolescence-free storage medium: synthetic DNA. In doing so, we make two contributions: (i) we highlight the challenges in using DNA for data archival and list several open research problems, and (ii) we outline OligoArchive-DSM (OA-DSM), an end-to-end DNA storage pipeline that we are developing to demonstrate the feasibility of our vision.

Type:
Conference
City:
Vancouver
Date:
2023-08-28
Department:
Data Science
Eurecom Ref:
7319
Copyright:
© ACM, 2023. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in VLDB 2023, 49th International Conference on Very Large Data Bases, 28 August-1 September 2023, Vancouver, Canada / Vol. 16, No. 8 http://dx.doi.org/10.14778/3594512.3594522

PERMALINK : https://www.eurecom.fr/publication/7319