What I Learned at the Library of Congress Conference
Last week I attended the annual conference on digital preservation at the Library of Congress in Washington, D.C. The theme for this year’s conference was “Designing Storage Architectures for Digital Collections.” Bernard Barton, the CIO of the Library of Congress, gave the opening remarks. Needless to say, there is a lot more digital media than paper media these days, though I was surprised to learn the LoC still collects hundreds of daily newspapers and magazines. They even have a lab where they treat paper to remove the oxidants that could turn the papers to dust in a few years. (Seems a little weird when you consider almost all these papers have online editions.)
So digital storage was the big issue. With thousands of TV shows, movies, and other media to archive, the LoC can chew through petabytes pretty quickly, though they are limited by an inadequate budget. So how to store data safely and at lower cost was really the main theme of the day. Both Seagate and Western Digital talked about new hard-drive technologies that are just around the corner, including microwave-assisted magnetic recording (MAMR) and heat-assisted magnetic recording (HAMR) drives, technologies that Wasabi is clearly rooting for. Western Digital sees the possibility of 40TB disk drives within the next four to five years.
HAMR looks very promising. Current disk technology is hitting the wall in terms of data density. At room temperature it takes a lot of energy to flip the magnetic bits on disk, which limits how small you can make an individual bit. The magnetic field from the head creates a large smear of magnetism across the platter. If you heat the magnetic particles on the platter, however, it takes a lot less energy to flip the magnetic bits, so you can make them much smaller. The idea is to use a laser mounted on the drive head to heat a tiny spot on the platter, then write a bit, then let it cool. And all that has to happen in less than a nanosecond. Very nifty stuff. Coming soon, in theory.
Many of the attendees called themselves “archivists.” These people have a VERY long view of data preservation, on the order of hundreds of years. Many of the presenters were fixated on developing media that can last for hundreds of years, but I think that’s barking up the wrong tree. Even if you can make a recording medium today that will last hundreds of years, a hundred years from now the devices needed to read that medium will be on a trash heap somewhere.
I think the key to reliable long-term archival storage is to store many copies of the data and to check each copy regularly, fixing any errors that randomly creep in. Rather than trying to find a medium that will last 100 years, use whatever technology is available at the time, such as standard spinning magnetic disks, and be prepared to migrate the data to new media as they become available. If you gave Wasabi some data to store for 100 years, we could keep it at a single location with 11 nines of durability. Even better, we could replicate it to two data centers on different sides of the country to protect against natural disasters. Hard drives will wear out after five or six years, but it’s our responsibility to migrate customer data to new drives and decommission the old ones. We also verify the integrity of each copy every 90 days.
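To make the "check each copy and fix errors" idea concrete, here is a minimal sketch in Python. It is an illustration only and not a description of how Wasabi's systems actually work. It assumes a SHA-256 digest was recorded when the data was first stored, and the names `digest`, `scrub`, `dc-east`, and `dc-west` are all hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def scrub(replicas: Dict[str, bytes], expected: str) -> List[str]:
    """Verify every replica against the digest recorded at ingest time.

    `replicas` maps a (hypothetical) location name to the bytes stored there.
    Any replica whose digest does not match is overwritten with an intact copy.
    Returns the names of the replicas that were repaired.
    """
    intact = [name for name, blob in replicas.items() if digest(blob) == expected]
    if not intact:
        # Every copy is damaged; nothing trustworthy is left to repair from.
        raise RuntimeError("no intact replica left to repair from")

    good_copy = replicas[intact[0]]
    damaged = [name for name in replicas if name not in intact]
    for name in damaged:
        replicas[name] = good_copy
    return damaged


if __name__ == "__main__":
    original = b"archive payload"
    recorded = digest(original)  # assumed to be stored alongside the data

    # Simulate one flipped character in the second copy.
    copies = {"dc-east": original, "dc-west": b"archive pbyload"}
    print(scrub(copies, recorded))  # ['dc-west']
```

Run on a schedule (every 90 days, say), a loop like this catches silent corruption while at least one good copy still exists, which is the whole argument for many verified copies over one long-lived medium.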
Until now, archivists have preserved digital data using “cold storage” solutions: tapes or very slow clouds like Amazon Glacier. In general, for the past several decades, “hot” storage was fast and expensive, and “cold” storage was slow and cheap. I think that distinction is going away and there won’t be much need for “cold” storage going forward. With today’s technology and the promised economies of new technologies like HAMR, hot storage won’t cost any more than cold storage. So why would you need cold storage at all? Cold storage devices like robotic LTO tape libraries and “cold” cloud storage products like Glacier will have very limited use.
Storage will become a commodity like electricity: one-size-fits-all, cheap, fast, super-secure, and available everywhere you have an internet connection.