Data Immutability Done Right
immutable (adjective): unchanging over time or unable to be changed; fixed, permanent, carved in stone
I had the recent pleasure of visiting the headquarters for a new $1.1 billion telescope under construction in Chile. Nobody actually “looks through” these giant telescopes. Rather, they are controlled by astronomers remotely from around the world, and the images are transmitted back to the U.S. where they are stored for study. Since these images are the sole output of a $1.1 billion investment, it’s not unreasonable for project management to refer to their stored data as their “billion-dollar dataset.”
When your data is worth a billion dollars, you don’t want someone deleting it, accidentally or otherwise. And you certainly don’t want some malicious boffin to be able to modify it and create fake images. Over the centuries, people have devised all sorts of ingenious ways to protect valuable data from destruction, modification, or theft. The Greeks and Romans literally carved it into stone. By the mid-twentieth century, documents meant to last were printed on special acid-free archival paper and stored away in dark, air-conditioned vaults. Extremely important documents remain under constant guard. Even the bank safe deposit box, in which two people (the banker and the content owner) turn their keys simultaneously to gain access, is a simple but effective example of “data” protection.
In the digital world, the de facto standard has been WORM (write once, read many) tapes. These tapes are physically marked so that the machines that record on them cannot overwrite or erase information that is already there. And like paper, they can be stored under lock and key in secured facilities.
As the world moves away from extremely slow, offline media like tape to live, fast-access media like disk storage, we need to think about data immutability and implement solutions so that:
- Data doesn’t deteriorate over time
- Hackers can’t penetrate your firewalls and destroy your data or hold it for ransom
- Rogue employees can’t delete or alter your data, even when they have the proper credentials
- Your data will survive even if the data center where it’s stored does not
Amazon Web Services (AWS) offers immutable buckets. But during a recent presentation, an AWS product manager explained that only the systems admin could destroy or alter the data. In my opinion, this isn’t good enough. Credentials can be stolen, and a single bad actor with that kind of access can wreak havoc on businesses and organizations.
The Rules of Data Immutability Done Right
So, how should you implement real data immutability? I have two rules: 1) No one person should be able to destroy data that is in an immutable bucket, and 2) Nobody should be able to touch a production system anonymously.
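To make the two rules concrete, here is a minimal in-memory sketch in Python. It is not Wasabi's implementation or any vendor's API; the class `ImmutableBucket`, its methods, and the user names are all hypothetical, chosen only to illustrate the idea. Objects are write-once, every call must name a user (Rule 2), and a delete only happens once two distinct users have approved it (Rule 1).

```python
from dataclasses import dataclass, field


class ImmutabilityError(Exception):
    """Raised when an operation would violate the bucket's rules."""


@dataclass
class ImmutableBucket:
    """Toy bucket (hypothetical): write-once objects, multi-person deletes."""
    required_approvers: int = 2
    _objects: dict = field(default_factory=dict)    # key -> bytes
    _approvals: dict = field(default_factory=dict)  # key -> set of user IDs
    audit_log: list = field(default_factory=list)   # (user, action, key)

    def _record(self, user: str, action: str, key: str) -> None:
        # Rule 2: every attempt is tied to a named user and logged,
        # even if the operation is rejected afterwards.
        if not user:
            raise ImmutabilityError("anonymous access is not allowed")
        self.audit_log.append((user, action, key))

    def put(self, user: str, key: str, data: bytes) -> None:
        self._record(user, "put", key)
        if key in self._objects:
            raise ImmutabilityError(f"{key!r} exists and cannot be overwritten")
        self._objects[key] = bytes(data)

    def get(self, user: str, key: str) -> bytes:
        self._record(user, "get", key)
        return self._objects[key]

    def approve_delete(self, user: str, key: str) -> bool:
        """Register one approval; delete only when enough distinct users agree."""
        self._record(user, "approve_delete", key)
        if key not in self._objects:
            raise KeyError(key)
        approvers = self._approvals.setdefault(key, set())
        approvers.add(user)  # a set, so one user approving twice counts once
        if len(approvers) < self.required_approvers:
            return False
        del self._objects[key]
        del self._approvals[key]
        return True
```

In this sketch, a single user calling `approve_delete` repeatedly never removes anything, because approvals are stored as a set of distinct user IDs. A real system would also need authenticated identities and tamper-resistant logs, which are outside the scope of this illustration.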
Rule #1 is similar to the safe deposit box example, or the launch procedure for ICBMs, where two people must turn their keys at the same time. There should always be checks and balances on what any one individual is allowed to do. With any data stored on disk, there are four ways that “immutable” data can be lost:
- A malicious programmer on staff could change the code to allow intrusion into a production system
- Someone in the data center could physically remove or destroy the disks
- Random disk failures could result in loss if there isn’t sufficient redundancy
- Data could suffer from “bit rot” and deteriorate if it is not checked and refreshed on a regular basis
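The last two items, redundancy and regular checking, work together. The Python sketch below shows one simple way a periodic "scrub" could detect bit rot and repair it from a healthy copy. It assumes full replicas and a SHA-256 checksum recorded at write time; the function names `checksum` and `scrub` are hypothetical, and production systems typically use more space-efficient schemes such as erasure coding.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest recorded when the object is first written."""
    return hashlib.sha256(data).hexdigest()


def scrub(replicas: list[bytes], expected: str) -> list[bytes]:
    """Verify every replica against the stored checksum and repair bad ones.

    Returns a new list in which each damaged replica has been replaced by
    a copy of an intact one. Raises ValueError if no intact replica is left.
    """
    good = next((r for r in replicas if checksum(r) == expected), None)
    if good is None:
        raise ValueError("no intact replica left; data is unrecoverable")
    return [r if checksum(r) == expected else good for r in replicas]
```

Running a pass like this on a schedule matters because silent corruption accumulates: the longer the gap between checks, the greater the chance that every copy of an object has been damaged before anyone notices.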
Here’s how we address each of these issues at Wasabi:
First, no Wasabi programmer can change code on a production system without an elaborate series of code reviews and thorough testing. These procedures involve many people, any one of whom is highly qualified to spot malicious code. Second, our data centers are secured with all the usual fingerprint ID systems, man traps, and the like. There are thousands of disks, and a data center technician would have no access to the databases that indicate what data is stored on what disk. (A programmer might be able to figure it out, but programmers, or anyone else for that matter, are not allowed to enter the data center unaccompanied.) Third, we have extreme redundancy: 11 nines of durability. If you gave Wasabi one million objects to store, statistically we would lose one object every 659,000 years. And finally, we read every object every 90 days and automatically correct any random errors.
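For readers who want to sanity-check a durability figure, the arithmetic can be sketched in a few lines of Python. The model here is an assumption, not something the article states: "N nines" is read as each object independently having a 10^-N chance of being lost in a given year. The function name `years_per_lost_object` is hypothetical.

```python
def years_per_lost_object(nines: int, object_count: int) -> float:
    """Expected years between single-object losses.

    Assumes (hypothetically) that "N nines of durability" means each
    object has an independent annual loss probability of 10**-N.
    """
    annual_loss_probability = 10.0 ** -nines
    expected_losses_per_year = annual_loss_probability * object_count
    return 1.0 / expected_losses_per_year


# One million objects at 11 nines under the simple model above.
print(round(years_per_lost_object(11, 1_000_000)))
```

Under this simple reading the result comes out near 100,000 years, which is the same order of magnitude as the 659,000 years quoted above but not identical. The quoted figure presumably rests on a more detailed durability model or a more precise durability number than a flat 11 nines.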
With Wasabi immutable buckets, no one can delete or alter your data, not even a systems administrator.
While the network security guys do their best to keep out the intruders, immutability done right will protect your data from being lost or destroyed no matter who hacks or fumbles their way in.