the bucket

Wasabi for Snowflake Data Storage

By Luke Boland
Associate Competitive Analyst

July 26, 2023

The world is moving to multi-cloud. Organizations are increasingly interested in distributing workloads, applications, and data across multiple platforms to avoid vendor lock-in, lower costs, and mitigate risk. Multi-cloud architectures offer the path to these benefits. 

Recently, Snowflake, a leading cloud data platform, added support for Wasabi Hot Cloud Storage, giving users an alternative to hyperscaler cloud providers for storing their unstructured data, such as business documents, videos, emails, and audio files, as well as semi-structured data, such as JSON, Avro, ORC, Parquet, and XML files. Wasabi’s support for a wide variety of data types makes it an important ingredient in a Snowflake data lake solution.

Snowflake specializes in providing cloud-based data analytics services that enable users to uncover insights and gain a comprehensive understanding of their business in real time.

Now, Snowflake users building data lakes can create External Stages on Wasabi. Snowflake refers to the location of data files in storage as a “stage.” There are two types of stages: internal and external. Internal stages are part of a customer’s Snowflake account. They are used as an intermediate storage location for data files before they are loaded into a table or after they are unloaded from a table. Think of internal stages as folders that users PUT files into during the ingestion phase of the data workflow. External stages are located outside of the Snowflake service and are owned, managed, and paid for by the customer. The bill for this storage consumption will come from AWS, Google, Microsoft, or Wasabi, not Snowflake.
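
To make the idea of an external stage concrete, here is a minimal Python sketch that assembles a `CREATE STAGE` statement for an S3-compatible location. It assumes Snowflake's `s3compat://` URL form with an `ENDPOINT` parameter; the stage name, bucket, path, and the `s3.wasabisys.com` endpoint are illustrative placeholders rather than values from this article, so check the Wasabi and Snowflake documentation for the endpoint and options that apply to your account.

```python
def build_wasabi_stage_sql(stage_name, bucket, path="", endpoint="s3.wasabisys.com"):
    """Return a CREATE STAGE statement for an S3-compatible external stage.

    All arguments are hypothetical examples; the credentials are left as
    placeholders and should never be hard-coded in real scripts.
    """
    # Normalize the optional path so the URL always ends with one slash.
    cleaned = path.strip("/")
    prefix = f"{cleaned}/" if cleaned else ""
    return (
        f"CREATE STAGE {stage_name}\n"
        f"  URL = 's3compat://{bucket}/{prefix}'\n"
        f"  ENDPOINT = '{endpoint}'\n"
        "  CREDENTIALS = (AWS_KEY_ID = '<access-key>' AWS_SECRET_KEY = '<secret-key>');"
    )


if __name__ == "__main__":
    # Example: a stage pointing at a hypothetical bucket of raw documents.
    print(build_wasabi_stage_sql("wasabi_stage", "my-bucket", "raw/notes"))
```

The generated statement would be run in a Snowflake worksheet or client by a role that is allowed to create stages.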

With the addition of Wasabi for External Stages, Snowflake users can use Wasabi for two important use cases: 

  1. Store unstructured data that can be used to populate Snowflake database tables. 
  2. Store Snowflake External Tables that users can share with others in collaborative use cases.

With External Tables and Snowflake’s cross-cloud capabilities, customers’ data doesn’t have to be locked in to a particular cloud or region. This multi-cloud capability is very desirable in use cases and industries where collaboration is critical, such as financial services, media and entertainment, and healthcare. For example, a customer running Snowflake on AWS with External Tables stored in Wasabi can share access to those tables with someone else running Snowflake on a different cloud in a different region. This is a true multi-cloud design.

Using Wasabi as an External Stage as part of a Snowflake data lake gives customers the ability to upload data without incurring AWS S3 PUT charges. These charges can be significant. Consider the use case where a healthcare provider is storing PDFs of doctors’ handwritten notes, screenshots of insurance cards and prescriptions, and call center recordings. There could be millions of these data objects, and AWS S3 PUT charges could cost thousands of dollars. Beyond that, LIST requests for objects in AWS S3 buckets drive up costs further. Wasabi doesn’t charge users for API requests. This means that there are no surprises when the cloud bill arrives, making Wasabi cloud storage costs predictable and easy to budget.
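
To see how per-request charges scale with object count, the short Python sketch below estimates request fees for a given number of uploads. The rate of $0.005 per 1,000 PUT requests is an assumption used for illustration, not a figure from this article; actual rates vary by provider, storage class, and region, and the object counts are hypothetical.

```python
# Assumed illustrative rate in USD per 1,000 PUT requests (not from the article).
ASSUMED_PUT_PRICE_PER_1000 = 0.005


def request_cost(num_requests, price_per_1000):
    """Return the total charge in USD for a number of billable API requests."""
    return num_requests / 1000 * price_per_1000


if __name__ == "__main__":
    # Hypothetical upload volumes, from a small archive to a very large one.
    for count in (10_000_000, 100_000_000, 1_000_000_000):
        cost = request_cost(count, ASSUMED_PUT_PRICE_PER_1000)
        print(f"{count:,} PUT requests: ${cost:,.2f}")
```

The same function can be reused for LIST or other request types by passing a different rate. With a provider that does not charge for API requests, the rate is simply zero.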

When data files are uploaded into a customer’s Wasabi External Stage, Simple Notification Service (SNS) can be used to notify Snowpipe, Snowflake’s ingestion service, to automatically copy the new files to the Snowflake database tables in their AWS account.

Organizations should store the raw, unstructured data in an inexpensive form and then add structure to it later as the need arises. This ensures that the organization remains responsive and agile to its business intelligence needs without compromising on data fidelity because this data may be required in its original format further down the road. 

At $6.99 per TB per month, Wasabi is typically 80% less expensive than AWS S3, Azure Blob, and Google object storage. AWS S3 Standard pricing, on the other hand, can vary from $20 to $25 per TB per month, depending upon the region used.
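
As a rough worked example, the Python sketch below applies the per-TB prices quoted above to a data lake of a given size. The 100 TB volume is a hypothetical figure chosen for illustration, and the calculation covers storage capacity only; API request and egress charges are not included.

```python
# Per-TB monthly prices quoted in the article (USD).
WASABI_PER_TB = 6.99
S3_STANDARD_LOW_PER_TB = 20.00
S3_STANDARD_HIGH_PER_TB = 25.00


def monthly_storage_cost(terabytes, price_per_tb):
    """Return the monthly capacity charge in USD for the given volume."""
    return terabytes * price_per_tb


if __name__ == "__main__":
    volume_tb = 100  # hypothetical data lake size
    wasabi = monthly_storage_cost(volume_tb, WASABI_PER_TB)
    s3_low = monthly_storage_cost(volume_tb, S3_STANDARD_LOW_PER_TB)
    s3_high = monthly_storage_cost(volume_tb, S3_STANDARD_HIGH_PER_TB)
    print(f"Wasabi:         ${wasabi:,.2f} per month")
    print(f"AWS S3 Standard: ${s3_low:,.2f} to ${s3_high:,.2f} per month")
```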

As mentioned above, when using External Stages, typical API and operations charges from AWS, Azure, Google, and Cloudflare R2 will apply. Wasabi does not charge for API requests.  

When External Tables and the unstructured data used to populate them are shared with partners, customers, or remote workers, egress charges will be incurred if AWS, Azure, or Google is used. Egress charges are a natural barrier to multi-cloud architectures. Avoiding egress charges by limiting outside access or collaboration reduces the value of the data and of the business intelligence insights generated from it. Wasabi’s multi-cloud-friendly policy of not charging for egress allows Snowflake users to leverage different cloud providers based on specific requirements, such as cost-effectiveness, geographic reach, compliance needs, or specialized services.

Additional information about using Snowflake with Wasabi can be found in the Wasabi Knowledge Base.
