What Does 11 Nines of Durability Really Mean?

By David Friend
President, CEO & Co-founder

May 28, 2019

Wasabi has been publicly available for more than two years, and while many things have changed (one key growth stat: it took 5 months to ingest our first petabyte, and we now ingest roughly a petabyte of data every 5 days, with the pace accelerating every day), we continue to answer many of the same questions that we did on Day One.

Data storage is serious business. And modern data storage is clearly not well understood compared to the legacy storage systems so many organizations continue to operate (I’m looking at you, NAS and SAN implementations).

Defining Reliability, Durability, Availability and Uptime

Reliability and durability are used interchangeably in storage, and both focus on ensuring that stored data does not suffer from bit rot, degradation or other corruption. In short, the data you’ve stored, whether 5 seconds ago or 15 years ago, is still exactly the same data, with no modifications and nothing missing. Highly reliable storage is extremely important for long-term storage needs, which means the system needs to proactively watch for and correct any issues on a regular basis (we call that “Active Integrity Checking”).

Availability (often called uptime), focuses on ensuring the storage system is operational and can deliver your data when you request it. If you have highly reliable storage, but can’t get to it (for example, tape in an offline vault, or a regional/system-wide network or power outage), your stored data effectively doesn’t exist.

Both reliability and availability require robust architectural designs to meet the expectations we all have for storage – that when we want to get to our data we can, and it is exactly the data we expect.

Do You Trust Your Storage Vendor?

Customers entrust their data to their storage vendors with the understanding that it will be there when they want it. No excuses.

My previous company was Carbonite, a well-known backup company where I was the CEO and Co-founder. Carbonite backs up about a half a billion computer files every day. So our product, engineering and ops teams at Wasabi, many of them ex-Carbonite team members, know a few things about data loss in the cloud and what it can mean to customers.

To borrow from Shakespeare, there is “Much Ado About Nothing” when it comes to the supposed benefits of on-premises, physical storage compared to cloud storage.

Here’s what I can tell you…

Cloud Storage is More Reliable than Physical Storage

If your eyes rolled back for a moment, I understand.

But the headline above is absolutely true.

You may not believe it, but that’s because of some fundamental truths about people.

  1. Simply put, most of us are terrible at quantifying risks. (I highly recommend reading “Against the Gods: The Remarkable Story of Risk” by Peter L. Bernstein).
  2. We are physical beings, and as a result, we trust physical things we can touch with our own hands more readily than “storing data in the cloud.” Completely understandable. Losing something like a physical folder of papers is tangible and easy to grasp, or, I suppose NOT grasp, if you’ve lost it.

That's why some IT professionals (incorrectly) think that having data stored in their own data centers is somehow inherently safer than storing it in a public cloud.

Risk and Reliability are Tightly Coupled

When dealing with something as vaporous as millions of computer files, most people don’t have a good gut feel for how reliable data storage needs to be in order to avoid costly and embarrassing losses.

And as much as you may feel comforted by having physical copies of documents or tapes, they are certainly not foolproof!

From the Boston Globe…

“McLean Hospital said Tuesday that information from about 12,600 people who donated their brains to research has gone missing. […] The hospital said four backup data tapes at its Harvard Brain Tissue Resource Center went missing on May 29. These tapes contained private information, including names, dates of birth, diagnoses, and some Social Security numbers. The tapes, which are unencrypted, were never found.”

From The New York Times…

“Time Warner said the data, on 40 tapes in a container the size of a cooler, disappeared more than a month ago while being shipped to an offsite storage center.”

From Portland Press Weekly…

“As many as 267,000 TD Bank customers from Maine to California were affected by the loss of two data backup tapes that contained personal information such as Social Security numbers and driver’s license numbers. […]

The unencrypted tapes were lost more than six months ago, but TD Bank did not alert attorney generals in affected states until this week. The loss of data affects bank customers in at least six states, and may include names, addresses, dates of birth and account numbers.”

Or tell the residents of King's Landing from Game of Thrones that… OK, if you haven’t seen the final episode, I’m not going to ruin that for you. Needless to say, life is risky business.

Fun with Statistics

What does 5 nines of reliability mean? It’s a very standard “reliability” statistic that has been thrown around for years. But what does it mean?

Let’s do the math. If 99% reliability means that you will lose one object out of 100 every year, then 99.999% (5 nines) reliability means that you will lose one object out of 100,000 objects every year.

In the physical world, Iron Mountain makes 5 million pickups and deliveries a year, so by their numbers you can expect them to lose 50 objects per year. That’s probably consistent with the losses we read about in the newspapers.
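
The arithmetic above is easy to check for yourself. Here is a minimal sketch, assuming (as this post does) that N nines means losing one object out of every 10^N per year; the function names are made up for illustration and are not part of any vendor tool.

```python
from fractions import Fraction


def annual_loss_rate(nines: int) -> Fraction:
    """Fraction of objects expected to be lost per year at `nines` nines.

    Assumption from the article: 2 nines (99%) means 1 in 100 lost per year,
    5 nines (99.999%) means 1 in 100,000, and so on.
    """
    return Fraction(1, 10 ** nines)


def expected_annual_losses(objects: int, nines: int) -> Fraction:
    """Expected number of objects lost per year out of `objects`."""
    return objects * annual_loss_rate(nines)


# Iron Mountain figure from the text: 5 million pickups and deliveries a year
# at 5 nines of reliability.
print(expected_annual_losses(5_000_000, 5))  # 50
```

Exact fractions are used instead of floats so that very small loss rates, such as 11 nines, do not pick up rounding noise.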

By contrast, top-tier cloud storage vendors, including AWS S3, Microsoft Azure, and my company, Wasabi, offer 11 nines of reliability (or durability as we say in the industry) – 11 nines = 99.999999999% reliability.

Doesn’t seem like adding more 9s past the decimal would really matter, would it?

What Does 11 9s of Durability Mean, Really?

99.999999999% durability of objects over any given year makes cloud storage 1 million times more reliable than Iron Mountain’s physical storage, and without the hazards of getting caught in Boston’s rush hour traffic.

Breaking it Down the Wasabi Way

At Wasabi, we store billions of “objects,” or files that customers have sent us. On average, files are about 800 KB in size. So if your organization is storing 1 PB of data, it’s likely that you have something like 1.2 billion objects.

If your storage were 99% reliable, that would mean that you would lose one out of every 100 objects every year. The least durable commercial cloud storage is Amazon S3 Reduced Redundancy Storage (RRS) which is spec’d at 99.99%. Using RRS, you could expect to lose .01% of your files every year, or .0001 x 1.2B = 120,000 lost files per year.

BTW – Did I mention that S3 RRS is 4 times more expensive than Wasabi? Granted that RRS is less expensive than S3, but even S3 is 5 times more expensive than Wasabi. (Read why more tiers aren’t better, and one hot tier is all you need)

Here’s a table with some representative products and the expected data loss per year:

Cloud Storage Vendor | Durability | Files lost per year per PB
Amazon S3 RRS | 99.99% (4 nines) | 120,000
Amazon S3 Standard | 99.999999999% (11 nines) | 0.012 (i.e., about one every 83 years)
Wasabi | 99.999999999% (11 nines) | 0.012 (i.e., about one every 83 years)
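
The per-petabyte column is just the object count multiplied by the annual loss rate. Here is a rough sketch of that calculation, assuming the 1.2 billion objects per PB estimate from this post; the constant and function names are illustrative only.

```python
from fractions import Fraction

# Assumption taken from the article: roughly 1.2 billion objects in 1 PB.
OBJECTS_PER_PB = 1_200_000_000


def files_lost_per_year(nines: int, objects: int = OBJECTS_PER_PB) -> Fraction:
    """Expected files lost per year at `nines` nines of durability."""
    return Fraction(objects, 10 ** nines)


def years_per_lost_file(nines: int, objects: int = OBJECTS_PER_PB) -> Fraction:
    """Average number of years between single-file losses."""
    return 1 / files_lost_per_year(nines, objects)


for nines in (4, 11):
    lost = float(files_lost_per_year(nines))
    gap = float(years_per_lost_file(nines))
    print(f"{nines} nines: {lost:,.3f} files lost per year per PB, "
          f"one loss every {gap:,.5f} years on average")
```

Swap in your own object count if your average file size is very different from the one assumed here.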

Active Integrity Checking Means Extra Protection

With either S3 RRS or similar lower reliability services, the problem is that you won’t know you’ve lost files until you try to use them. It’s not like when you lose all your files and can restore them from a backup.

For example, say you store 1 PB of data for five years in S3 RRS. After five years you would expect to accumulate 600,000 lost files (5 x 120,000). Backups from five years ago are probably gone, leaving you with permanent data loss. That’s why many IT managers resort to annual (or more frequent) testing of all their data to create and test checksums on what they actually have in storage. If a mismatch is found, hopefully, there is another copy somewhere that they can access to restore a corrupted or missing file.
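
For readers who have never run this kind of check, here is a minimal sketch of the do-it-yourself approach: record a checksum for every file once, then periodically recompute and compare. It assumes the files are reachable on a local or mounted file system, and the function names are made up for illustration. It is not a description of how Wasabi or any other vendor implements integrity checking.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under `root` (relative path -> hash)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose contents no longer match."""
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged
```

You would call build_manifest once when the data is written, keep the manifest somewhere safe, and run find_damaged on a schedule. Anything it returns needs to be restored from another copy.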

Wasabi does a checksum comparison every 90 days – we call it Active Integrity Checking.

Since there are effectively five copies of every piece of data to achieve 11 nines, any one copy that becomes corrupted or lost can be quickly and reliably restored. With 11 nines of durability, the likelihood is that you will never experience data loss in your lifetime.

That leads us to a related topic – availability. If Wasabi has 99.999999999% object durability, should you replicate data to a second data center?

It’s All About Availability

Replicating your data in a second data center at a different location gets you two things:

  1. Insurance against a local disaster (flood, fire, earthquake) that could physically destroy one of the data centers or take it offline, and
  2. Increased availability.

Data centers can and do go offline from time to time due to power or local Internet failures. If a data center guarantees 99.9% uptime, that means that it can be offline .1% of the time, or about 9 hours per year. Geographic replication across two such data centers would give you 99.9999% uptime (assuming their outages are independent of each other), or 1/1000th the amount of downtime. This level of availability may or may not be worth the extra money; it really depends on your application and what any amount of downtime means to your business.
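
Here is a small sketch of the availability math, under the simplifying assumption that outages at different sites are independent and that your data is reachable as long as any one site is up. The function names are illustrative only.

```python
from fractions import Fraction

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability: Fraction) -> Fraction:
    """Expected hours offline per year at the given availability."""
    return (1 - availability) * HOURS_PER_YEAR


def combined_availability(availability: Fraction, sites: int) -> Fraction:
    """Availability when data is reachable if ANY of `sites` sites is up.

    Assumes every site has the same availability and that outages are
    independent, which real-world regional events can violate.
    """
    return 1 - (1 - availability) ** sites


single = Fraction("0.999")               # one data center at 99.9% uptime
pair = combined_availability(single, 2)  # two replicated data centers

print(f"one site:  {float(downtime_hours_per_year(single)):.2f} hours down per year")
print(f"two sites: {float(pair):.4%} uptime")
```

If the two sites share a network provider or power grid, their outages are not really independent, and the real-world number will be lower than this sketch suggests.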

The Inconvenient Truth of Reliability

There is one very important and inconvenient truth about reliability:

Two-thirds of all data loss has nothing to do with hardware failure, whether it’s YOUR hardware, or ours.

The real culprits of data loss are a combination of:

  • Human error,
  • Viruses,
  • Bugs in application software, and
  • Malicious employees or intruders.

Almost everyone has accidentally erased or overwritten a file. Even if your cloud storage had one million nines of durability, it can’t protect you from human error.

Enter the Immutable

For this reason, Wasabi was the first public cloud storage provider to offer the “immutable bucket” — storage that cannot be erased or modified by anyone – not even the administrator for your account or anyone at Wasabi. Once you write it, it’s there until the hold time that you designate expires. If someone tries to erase or modify an immutable file, you just get an error message.
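
To make the behavior concrete, here is a toy, in-memory model of the retention idea described above. It is purely illustrative: the class, its methods and the error type are hypothetical names invented for this sketch, and it is not Wasabi's API or implementation.

```python
import time


class ImmutableObjectError(Exception):
    """Raised when someone tries to change or erase a protected object."""


class ToyImmutableBucket:
    """Hypothetical in-memory bucket whose objects are locked for a hold time."""

    def __init__(self, retention_seconds: float, clock=time.time):
        self._retention = retention_seconds
        self._clock = clock  # injectable so the hold time can be simulated
        self._objects = {}   # key -> (data, time at which the hold expires)

    def _locked(self, key: str) -> bool:
        return key in self._objects and self._clock() < self._objects[key][1]

    def put(self, key: str, data: bytes) -> None:
        if self._locked(key):
            raise ImmutableObjectError(f"{key} cannot be modified yet")
        self._objects[key] = (data, self._clock() + self._retention)

    def get(self, key: str) -> bytes:
        return self._objects[key][0]

    def delete(self, key: str) -> None:
        if self._locked(key):
            raise ImmutableObjectError(f"{key} cannot be erased yet")
        del self._objects[key]
```

In this toy version, a second put or a delete on the same key raises ImmutableObjectError until the hold time has passed, which mirrors the "you just get an error message" behavior.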

I’ve written a whole blog post on immutability if you’d like to learn more.

Summary

  • Reliability/Durability = stored data does not suffer from bit rot, degradation or other corruption.
  • Availability/Uptime = the storage system is operational and can deliver your data when you request it.

Wasabi takes your data storage very seriously. We are designed, built and operated with one goal in mind: to be your trusted cloud storage utility, both highly durable and available, as well as inexpensive and high performance. To us, this is the foundation that all next-generation cloud providers, like Wasabi, should be built on.

PS – Not Technical Enough?

If you are interested in even more technical details and mathematical models on how Wasabi is built for high levels of object durability, read our Tech Brief ebook “Wasabi Extremely High Durability Protects Mission-Critical Data.”
