Earlier this week, I spent some time at Internet2’s 2019 Global Summit in Washington, DC. For those of you who are not familiar with Internet2, it is essentially an alternate Internet that was founded by a consortium of higher education institutions in 1996. Today it connects 317 US universities, 60 government agencies, and more than 100,000 institutions in 100 countries. The idea is to have an alternative to the public Internet that offers very high bandwidth and connected services relevant to the member organizations.

We were honored to join the Internet2 Cloud Exchange in 2018 as the only pure-play cloud storage vendor. Since then, we have been helping Internet2 members take advantage of a radically reduced price point (80% less expensive than AWS S3) and the combination of high-performance connectivity and storage.

This year’s conference was mostly about security, identity, and networking issues. At its core, Internet2 is a network, so it’s not surprising that the folks at this conference are focused on the connectivity backbone, particularly how to enhance it and how to protect it.

To me, though, it felt like the elephant in the room was actually storage, rather than connectivity and security. Everybody was talking about the volume of data their researchers were generating. This was especially true of the medical research community, where imaging and genomics are just going berserk.

Let’s look at genetic data, for example. One speaker said that it won’t be long before every baby has their genome sequenced before they leave the hospital. Why? Your genes can guide doctors on what kinds of medications (and dosages) you’ll use throughout your life, in a predictive rather than reactive way.
This makes it possible to drive down future costs and improve wellness, both of which are major initiatives of healthcare organizations and government agencies alike.

The machines that sequence a human genome are coming down in price to the point where this is looking entirely practical. The problem is the data, and although we’re used to seeing massive amounts of data at Wasabi, frankly I was surprised at the magnitude of the challenge.

With genomic sequencing, you have to do multiple scans to get high-quality, reliable data. The “1000 Genomes Project” data set (as of 2012) consists of more than 200 TB for 1,700 participants, or about 118 GB per individual.

A sequencing machine, such as those made by Illumina, will do 30 scans, producing 90 billion base pairs and roughly 200 GB of data. Next-gen scanners do 100 scans per genome. As the resolution and completeness of these scans continue to grow, file sizes will increase correspondingly.

What’s the scale of storing genomic data at a country-wide level?

Let’s use the more recent data size of 200 GB per person for our calculations. It’s a reasonable middle ground between the data produced by genome scanners circa 2012 and what’s possible with the next-generation genome scanners becoming available now.

There are about 4 million babies born in the United States every year. If every baby received a genomic scan before leaving the hospital, that would result in 800 petabytes (PB) of new data each year (200 GB x 4,000,000 babies).

Storing 800 PB with S3 would cost over $200 million per year for each year’s worth of new scans.

By year five, with five years of scans accumulated, that would grow to over $1 billion per year in ongoing storage costs.

Clearly Wasabi could drive those costs down tremendously at 80% less than S3: savings of $750 million or more PER YEAR by year five! (See our pricing calculator to compare Wasabi vs.
other cloud storage providers.)

I’m used to thinking big, but that is a daunting number.

All of this potential data, in just a single portion of one slice of one industry...

And yet while almost everybody at the Internet2 Global Summit was talking about moving data, almost nobody was talking about storing it. But as you can see, storage is a big concern for the researchers themselves, who somehow have to find a way to keep the data collected for their experiments.

In my opinion, Internet2 members are going to have to address the issue of data storage at some point, and soon. The revolution in the ability to create data needs to be met with a revolution in storing and using it to advance scientific research.

The movement of data across their network may be a fast, secure commodity that any member can use, but if there’s no place to store that data at the other end, then what good is it?
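
For anyone who wants to check the back-of-the-envelope math in this post, here is a minimal Python sketch. It uses only the figures quoted above (200 GB per genome, 4 million births per year, roughly $200 million per year for each 800 PB cohort on S3, and an 80% discount). The decimal unit conversion, the flat pricing that ignores volume tiers and price changes, and the function names are my own assumptions for illustration, not actual price quotes.

```python
"""Back-of-the-envelope genomic storage math, using figures from this post."""

GB_PER_PB = 1_000_000  # assumption: decimal units (1 PB = 1,000,000 GB)

# Figures quoted in the post.
GB_PER_GENOME = 200
BIRTHS_PER_YEAR = 4_000_000
S3_DOLLARS_PER_COHORT_YEAR = 200_000_000  # the "over $200 million" figure, used as a floor
DISCOUNT_PERCENT = 80  # "80% less than S3"


def cohort_size_pb(gb_per_genome=GB_PER_GENOME, births=BIRTHS_PER_YEAR):
    """Petabytes of new data produced by one year of births."""
    return gb_per_genome * births // GB_PER_PB


def yearly_costs(years,
                 s3_per_cohort=S3_DOLLARS_PER_COHORT_YEAR,
                 discount_percent=DISCOUNT_PERCENT):
    """Return one (year, stored_pb, s3_cost, discounted_cost, savings) row per year.

    Assumes every cohort is kept forever, so the annual bill grows
    linearly with the number of cohorts in storage.
    """
    rows = []
    for year in range(1, years + 1):
        stored_pb = cohort_size_pb() * year
        s3_cost = s3_per_cohort * year
        discounted_cost = s3_cost * (100 - discount_percent) // 100
        rows.append((year, stored_pb, s3_cost, discounted_cost,
                     s3_cost - discounted_cost))
    return rows


if __name__ == "__main__":
    for year, pb, s3, discounted, savings in yearly_costs(5):
        print(f"year {year}: {pb:>5,} PB  S3 ${s3:,}  "
              f"discounted ${discounted:,}  savings ${savings:,}")
```

With these inputs, year five works out to 4,000 PB in storage, $1 billion per year at the S3 floor figure, and $800 million per year in savings, which is consistent with the "$750 million or more" estimate.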