What Do Data Storage Experts Think About the Myspace Fiasco?
Recently, Myspace made headlines with news of a massive loss of its core consumer-contributed assets: music and photos. After years of near irrelevance following the rise of Facebook, Spotify, SoundCloud and YouTube, this may truly be Myspace’s swan song.
Most people don’t think about or know the details of what it takes to store and manage high performance data on a large scale, and simply take storage for granted as consumers, employees, even as IT professionals.
But there’s a reason why storage is our sole focus.
Doing highly scalable, cost-effective data storage right is seriously hard work. There are many things a company can do to maintain its competitive advantage and add value to the market, but unless you have to be a storage expert, why would you spend your time attempting it when the risk of failure is so high?
The official story is that “due to a server migration files were corrupted and unable to be transferred over to our updated site.” And what was lost? 12 years of music and photos uploaded prior to 2015.
To add injury to insult, there was no official acknowledgement of this failure for more than a year. Complaints about songs being listed but not playing started surfacing on Reddit in February 2018.
Although Myspace itself faded away, the platform launched a number of independent bands who are now established artists, such as Panic! At the Disco, Ghost, The Devil Wears Prada, Bullet for My Valentine and Black Veil Brides.
For artists who had their start on Myspace, and kept copies of their songs only on Myspace, the historical record of their early days may be long gone.
Myspace broke new ground in data storage
When Myspace was at its peak, the size of the data that they had to manage seemed impossibly large. According to the SearchStorage article “MySpace tackles extraordinary data storage requirements,” published November 29, 2006:
“The company uses a homegrown distributed file system that runs across 1,000 Hewlett-Packard Co. (HP) servers to store the majority of its small files, including MP3s and video clips — 3 billion images in total. Eight dedicated senior developers and engineers keep this monster up and running night and day.”
And the size of the site at the time was roughly 130 million subscribers, with 250,000 new users per day and more than 127 million profile pages.
To deal with that growth, they faced the same challenges any modern data center faces, including running out of both space and power:
Vice president of technology Jim Benedetto pointed to a stack of flattened cardboard boxes as large as a truck. “That’s just the systems we unpacked this week … and we can’t even power them all up, there’s no more power in Los Angeles.”
The state-of-the-art disk sizes they were using at the time? 150 GB and 73 GB hard drives, of which they could effectively use only 5-10 GB due to “disk thrashing.” Honestly, at the time, their needs were way ahead of what the available storage solutions could handle. You have more storage in your hand right now than the drives they were using in 2006.
Consumer data: then vs. now
Purely from the music file perspective, reports estimate that the total number of music files stored on Myspace was roughly 53 million songs from 14.2 million artists.
Statistics from our partner Filecatalyst show that an average MP3 file is 3.5 MB.
53 million times 3.5 MB is 185.5 million MB, or about 177 TB when converted using binary (1,024-based) units. That’s just one portion of the data lost by Myspace.
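For anyone who wants to check that math, here is a minimal sketch of the estimate. The song count and average file size are the figures quoted above; the two unit conventions are an assumption on our part, since the result depends on whether a terabyte is counted as 1,000,000 MB (decimal) or 1,024 × 1,024 MB (binary).

```python
# Back-of-the-envelope estimate of the Myspace music archive size.
# The inputs are the figures quoted in the article; the choice of
# unit convention (decimal vs. binary) is an assumption.
SONGS = 53_000_000   # estimated number of songs
AVG_MP3_MB = 3.5     # average MP3 size in MB

total_mb = SONGS * AVG_MP3_MB

# Decimal convention: 1 TB = 1,000,000 MB
total_tb_decimal = total_mb / 1_000_000

# Binary convention: 1 TB = 1,024 * 1,024 MB
total_tb_binary = total_mb / (1024 * 1024)

print(f"{total_tb_decimal:.1f} TB (decimal), {total_tb_binary:.1f} TB (binary)")
```

Either way you count it, the music alone lands somewhere between roughly 177 and 186 TB, and that is before adding a single photo.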
While that amount of data may have seemed unbelievably large in 2003 when Myspace first started and large consumer datasets of unstructured data like this were rare, today, an estimated 300 hours of video are uploaded to YouTube every minute, and that’s just one consumer service.
After more than 15 years in the data storage business, I can safely tell you that while most people take it for granted that storage is storage, the truth is miles away from that. Storage is harder than you think, and with the exponentially increasing trend of data creation and storage, the bigger the data gets, the more you need to know what you’re doing.
Myspace blazed such an early path in data storage needs that at the time, nobody knew more about this scale of storage than they did, and yet apparently, that wasn’t enough to make their latest data migration project successful. 12 years of lost data is a heck of a hole to simply gloss over.
Should you run your own storage? Probably not.
Yes, it’s self-serving for me to say you shouldn’t own and operate your own storage if storage isn’t really the nature of your business.
But after years of being surrounded by a team of world-class storage experts with decades of large-scale, enterprise-class storage experience, I can safely tell you from experience…
You may think being able to tailor your storage architecture entirely to your own organization is the safest and most cost-effective way to handle your storage needs, but that can put you on a path that can’t scale affordably, quickly, or reliably enough to power your business, as Myspace found.
Treat storage like you would treat electricity or bandwidth – a commodity that’s best left to companies that focus exclusively on highly reliable, cost-effective cloud storage.
Let the storage vendor worry about migrations to new hardware, software upgrades, and all the engineering that has to go into making a scalable and reliable service.
Meanwhile, you can focus your budget, staff, and time on areas where you can make the biggest impact for your business, and ask questions like “What can we do with this data to have the greatest impact?” rather than having customers ask “What happened to all of our data?”