– Abhinav Sharma, bioinformatics lead
A collaborative group of researchers led by Dr. Emilyn Costa Conceição of the Instituto Nacional de Infectologia Evandro Chagas at Fundação Oswaldo Cruz, Rio de Janeiro, with team members working from India, Pakistan, Mozambique, Switzerland, Germany, France, Malaysia and Portugal, is conducting genomic research on tuberculosis (TB), focusing on molecular epidemiology, phylogeny, and genetic diversity and variation.
Using a combination of Wasabi (scalable storage), Equinix Metal (on-demand compute), rclone (an open source tool for transferring data between cloud services, including Wasabi, Azure, OneDrive and Dropbox) and the Bioconda collection of open source bioinformatics tools, they have greatly reduced the cost of performing academic genomic research while increasing the speed at which they can collaborate. Both matter to this team, which is self-funding its research while simultaneously pursuing grant-based funding.
Challenge: Genomics Research on a Budget
The rise of Cloud 1.0 providers like AWS has not been as useful as academic researchers had hoped, because of the total cost of first-generation cloud storage and compute, including egress and API call charges. These overall costs make dataset-intensive research like genomics too expensive for most academic budgets. While the on-demand nature of Cloud 1.0 was attractive, it wasn’t enough.
Local computing, in this case the equipment of individual researchers, was also not enough, even though the open source tools themselves are freely available. The CPU and connectivity needed to share files across the globe, combined with the required local storage (up to 9 TB at a time), simply made it unviable. Forcing individual researchers to run their analyses independently, or to share large datasets across slow and often unstable internet connections, introduces extensive delays that compound the problems of multi-country collaboration and slow research considerably.
Solution: Wasabi + Equinix Metal + High-Speed Cloud Connectivity
“We found Wasabi through HackerNews and Reddit – and reading about the partnership with Equinix Metal was key,” explains bioinformatics lead Abhinav Sharma, who is currently pursuing a master’s degree in Data Science through IIIT-Bangalore and Liverpool John Moores University. “We had evaluated S3, and the cost of storage plus hard to predict fees was prohibitively expensive for us. We also evaluated EC2, but found the pricing and organizational structure was both too expensive and complicated to explain to scientific researchers. Using Equinix Metal’s bare metal servers produced a much simpler, affordable solution, with better performance.”
Wasabi’s on-demand, scalable storage means that it is now affordable to maintain both the original source files containing the raw genomes and the intermediate files produced during the course of analysis. There is no longer any need to manually move data between researchers thousands of miles apart, or even to and from local storage. Having enough storage capacity for any given analysis is no longer a concern, and cloud storage is significantly less expensive and less risky than storing the data locally, where drive failures and power spikes are a danger.
Results: Affordable, Powerful, Next-gen Cloud for Genomics
On top of the overall price and performance benefits, the team found that these next-generation cloud providers delivered benefits they hadn’t anticipated. For example, the high-speed connections between Wasabi, Equinix Metal and the other cloud tools used by the team make it possible for researchers to work in parallel, rather than serially, and at a significantly faster pace than previous solutions allowed.
Because the researchers are distributed across a variety of countries and use their own internet connections, both upload and download bandwidth are restricted and often unstable, making it difficult to move multi-terabyte files between individual researchers’ machines around the world.
For example, moving selective copies between Wasabi and Dropbox using rclone from an Equinix Metal-hosted server allows the researchers to take advantage of the much higher-speed connectivity and peering of professional cloud providers, compared with the connections an individual researcher would have.
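The selective copy described above might look roughly like the following Python sketch, which only assembles an `rclone copy` command with `--include` filters. The remote names `wasabi` and `dropbox`, the bucket and folder paths, and the file patterns are all hypothetical placeholders, not the team's actual configuration; the remotes are assumed to have been set up beforehand with `rclone config` on the server.

```python
import shlex
import subprocess


def build_rclone_copy(src, dst, include_patterns, dry_run=True):
    """Build an `rclone copy` command that transfers only matching files.

    src and dst are rclone "remote:path" strings. The remote names used
    below are assumptions; they must match remotes defined via
    `rclone config` on the machine running the command.
    """
    cmd = ["rclone", "copy", src, dst]
    for pattern in include_patterns:
        # Each --include adds a filter; files matching any pattern are copied.
        cmd += ["--include", pattern]
    if dry_run:
        # --dry-run makes rclone report what it would copy without copying.
        cmd.append("--dry-run")
    return cmd


def run_copy(cmd):
    """Execute the command on the server; raises if rclone exits non-zero."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths: share only result files, not the raw genome data.
    cmd = build_rclone_copy(
        "wasabi:tb-genomes/run-01",
        "dropbox:shared/run-01",
        ["*.vcf.gz", "*.html"],
    )
    print(shlex.join(cmd))  # inspect the command before running it
    # run_copy(cmd)  # uncomment on a server where rclone is installed
```

Because the command runs on the hosted server, the data moves directly between the two cloud services over the server's connection rather than through a researcher's home link. Starting with the dry run is a cautious default for checking the filters before any multi-terabyte transfer begins.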
“The high-speed networking infrastructure between Wasabi and Equinix Metal dramatically changes the pace with which we can get the right data, visualizations and analysis in the hands of the researchers,” says Abhinav. “And the best thing about using bare metal – everything is much faster than a standard virtual compute instance in our comparisons.”