DNA storage is the process of encoding binary data into synthesized strands of DNA (deoxyribonucleic acid) and decoding it back out again. In nature, DNA molecules contain the genetic blueprints for living cells and organisms.
To store a binary digital file as DNA, the individual bits (binary digits) are converted from 1s and 0s to the letters A, C, G, and T. These letters represent the four nucleotide bases found in DNA: adenine, cytosine, guanine, and thymine. The physical storage medium is a synthesized DNA molecule containing these four bases in a sequence corresponding to the order of the bits in the digital file. To recover the data, the DNA molecule is sequenced, and the resulting string of A, C, G, and T is decoded back into the original sequence of 1s and 0s.
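The conversion described above can be illustrated with a minimal Python sketch. It assumes a simple mapping of two bits per base (00 to A, 01 to C, 10 to G, 11 to T), which is chosen here purely for illustration and is not the scheme used by any particular research group. The function names `encode` and `decode` are likewise hypothetical. Practical encoding schemes typically also add redundancy for error correction and avoid long runs of the same letter, neither of which this sketch attempts.

```python
# Illustrative assumption: two bits per base. Real schemes differ.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes to a string of A, C, G, and T (four bases per byte)."""
    # Write every byte as eight binary digits, then map each pair to a base.
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Convert a string of A, C, G, and T back to the original bytes."""
    # Map each base back to its two bits, then regroup the bits into bytes.
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # the letter sequence that would be synthesized
    print(decode(strand))  # the bytes recovered after sequencing
```

Under this assumed mapping, each byte of the file becomes four letters, so the two-byte input `b"Hi"` yields an eight-letter sequence, and decoding that sequence returns the original bytes.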
Researchers at the European Molecular Biology Laboratory (EMBL) have encoded audio, image, and text files into a synthesized DNA molecule about the size of a dust grain, and then successfully read the information from the DNA to recover the files, claiming 99.99 percent accuracy.
An obvious advantage of DNA storage, should it ever become practical for everyday use, would be its ability to store massive quantities of data in media having small physical volume. Dr. Sriram Kosuri, a scientist at Harvard, believes that all the digital information currently existing in the world could reside in four grams of synthesized DNA.
A less obvious, but perhaps more significant, advantage of DNA storage is its longevity. Because DNA molecules can survive for thousands of years, a digital archive encoded in this form could be recovered by people for many generations to come. This longevity might resolve the troubling prospect of our digital age being lost to history because of the relative impermanence of optical, magnetic, and electronic media.
The principal disadvantages of DNA storage for practical use today are its slow encoding speed and high cost. The speed issue limits the technology's promise to archiving purposes in the near term, although the speed may eventually improve to the point where DNA storage can function effectively for general backup applications and perhaps even primary storage. As for the cost, Dr. Nick Goldman of the EMBL suggests that by the mid-2020s, expenses could come down to the point where the technology becomes commercially viable on a large scale.