MQ-Coder inspired arithmetic coder for synthetic DNA data storage
Authors:
Xavier Pic,
Melpomeni Dimopoulou,
Eva Gil San Antonio,
Marc Antonini
Abstract:
In recent years, the ever-growing demand for data storage, particularly for "cold" (i.e. rarely accessed) data, has motivated research into alternative storage systems. Because of their biochemical characteristics, synthetic DNA molecules are now considered serious candidates for this new kind of storage. This paper introduces a novel arithmetic coder for DNA data storage and presents results for a lossy JPEG 2000-based image compression method, adapted for DNA data storage, that uses this coder.
The DNA coding algorithms presented here are designed to compress images efficiently, encode them into a quaternary code, and finally store them in synthetic DNA molecules. This work also aims to make the compression models better fit the problems encountered when storing data in DNA, namely that DNA writing, storage, and reading are error-prone processes.
The main takeaway of this work is our arithmetic coder and its integration into a high-performing image codec.
Submitted 22 June, 2023;
originally announced June 2023.
A JPEG-based image coding solution for data storage on DNA
Authors:
Melpomeni Dimopoulou,
Eva Gil San Antonio,
Marc Antonini
Abstract:
Storing digital data efficiently is becoming increasingly challenging because data generation is growing exponentially and existing storage resources cannot keep pace. Furthermore, infrequently accessed data can be safely stored for no longer than 10-20 years because of the short lifespan of conventional storage devices. Recent studies have therefore shown DNA to be a very promising candidate for the long-term storage of digital data. Several pioneering works have proposed methods for encoding images into a quaternary DNA representation, first compressing the image with the classical JPEG standard to reduce the high cost of DNA synthesis. However, this type of compression is not optimized with respect to the quaternary DNA code and results in an open-loop workflow. In our previous work, we introduced the first closed-loop solution for generating a constrained fixed-length quaternary code, which allows the synthesis cost to be controlled thanks to a source allocation algorithm. In this paper, we extend our studies by proposing a variable-length encoding solution based on the JPEG standard.
Submitted 17 March, 2021;
originally announced March 2021.