Guide to Load Data from Azure Cloud to Snowflake

Celestinfo Software Solutions Pvt. Ltd. Sep 12, 2024

Last updated: October 2024

Quick answer: To load data from Azure Blob Storage into Snowflake, create a storage integration using CREATE STORAGE INTEGRATION with your Azure tenant ID and container URL, define an external stage pointing to the blob path, create a file format (CSV/Parquet), then run COPY INTO to load the data into your target Snowflake table.

Introduction

Loading data from Azure Blob Storage to Snowflake is a core ETL task for data engineers working with Azure and Snowflake cloud integration. For a secure private connection between Azure and Snowflake, see our Snowflake Azure PrivateLink implementation guide. This step-by-step guide walks you through the complete process of configuring Azure Blob Storage integration with Snowflake, creating external stages, and using COPY INTO commands to transfer CSV data from Azure to Snowflake tables.

Pre-requisites

- A Snowflake account with a role that can create storage integrations (for example, ACCOUNTADMIN)
- An Azure subscription with permission to create resource groups and storage accounts
- A sample CSV file to load

Procedure

Step 1: Create a Database

First, create a database in Snowflake to store your data. The following command creates a database named raw_azure_db if it doesn’t already exist.


Use the default schema or create a separate one. Here, the default schema, PUBLIC, is used.
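A minimal sketch of these commands (the database name raw_azure_db comes from the text; the optional schema name is an illustrative placeholder):

```sql
-- Create the database if it does not already exist
CREATE DATABASE IF NOT EXISTS raw_azure_db;

-- Work in the default PUBLIC schema
USE DATABASE raw_azure_db;
USE SCHEMA PUBLIC;

-- Or create a dedicated schema instead (optional)
-- CREATE SCHEMA IF NOT EXISTS my_schema;
```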


Procedure to set up the Azure storage account:

Step 1.1: Sign In to the Azure Portal

Step 1.2: Create a Resource Group

A resource group is a logical container for Azure resources. Let’s create one for our storage account.



Step 1.3: Create a Storage Account

The storage account provides a unique namespace for your Blob Storage data.


Name: choose a globally unique, descriptive name, such as mystorageaccount2159, to pair with the blobstorage2159 container used in this guide.

Step 1.4: Upload the Data to Azure

Go to Storage accounts, refresh the list, and open the storage account you just created. Click Upload at the top, then browse for your files and upload them.


Step 2: Create a Storage Integration

To connect Snowflake with Azure Blob Storage, create a storage integration. This defines the connection parameters, including the Azure tenant ID and allowed storage locations.
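The integration DDL might look like the following. The integration name azure_int is an illustrative choice, and the tenant ID and storage URL are placeholders you must replace with your own Azure values:

```sql
CREATE OR REPLACE STORAGE INTEGRATION azure_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  -- Replace with your Azure Active Directory tenant ID
  AZURE_TENANT_ID = '<your-azure-tenant-id>'
  -- Restrict the integration to your container
  STORAGE_ALLOWED_LOCATIONS = ('azure://mystorageaccount2159.blob.core.windows.net/blobstorage2159/');
```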


After creating the integration, describe it to verify the setup:
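Assuming the integration is named azure_int, the verification command would be:

```sql
-- Inspect the integration; note the AZURE_CONSENT_URL and
-- AZURE_MULTI_TENANT_APP_NAME values in the output
DESC STORAGE INTEGRATION azure_int;
```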


Authorize Snowflake Using AZURE_CONSENT_URL

After creating the storage integration, you need to authorize Snowflake to access your Azure Blob Storage:

Copy the AZURE_CONSENT_URL from the DESC STORAGE INTEGRATION output, open it in a browser, and accept the requested permissions.

Note: Ensure the Azure tenant ID and storage URL match your Azure account details. You may need to grant Snowflake access to the Azure storage account via role-based access control (RBAC).

Step 3: Create a File Format

Define a file format to specify how Snowflake should parse the CSV file. Here, we create a CSV file format that skips the header row and uses a comma as the delimiter.
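A sketch of this file format, under the assumptions stated in the text (header row skipped, comma delimiter); the format name my_csv_format is illustrative:

```sql
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1
  -- Optional: handle quoted fields containing commas
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';
```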


Adjust the SKIP_HEADER or FIELD_DELIMITER based on your CSV file’s structure.


Step 4: Create a Stage

A stage in Snowflake is a reference to the external storage location (Azure Blob Storage). The stage links to the storage integration and specifies the file format.
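The stage definition might look like this, assuming the integration, container URL, and file format names used earlier in this guide:

```sql
CREATE OR REPLACE STAGE my_azure_stage
  STORAGE_INTEGRATION = azure_int
  URL = 'azure://mystorageaccount2159.blob.core.windows.net/blobstorage2159/'
  FILE_FORMAT = my_csv_format;
```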


Step 5: Verify Stage Contents

Preview the data in the stage to confirm the file is accessible and formatted correctly.
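Assuming the stage is named my_azure_stage, a sketch of the verification queries:

```sql
-- List the files visible in the stage
LIST @my_azure_stage;

-- Preview the staged file; $1 is the first parsed column
-- (add $2, $3, ... to inspect further columns)
SELECT $1 FROM @my_azure_stage LIMIT 10;
```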


This query displays the raw contents of the CSV file as a single column. Check for any parsing issues here.

Step 6: Create a Target Table

Create a table in Snowflake to store the data from the CSV file. The table’s schema should match the CSV file’s structure.
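The original CSV's columns are not shown, so the columns below are illustrative placeholders; replace them with your file's actual header and data types. The table name STUDENTS2 comes from the loading step:

```sql
CREATE OR REPLACE TABLE STUDENTS2 (
  ID    NUMBER,         -- illustrative columns: match your
  NAME  VARCHAR(100),   -- CSV's actual structure
  GRADE VARCHAR(10)
);
```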


Ensure the column names and data types align with the CSV file’s columns.

Step 7: Load Data into the Table

Use the COPY INTO command to load the data from the Azure stage into the STUDENTS2 table. The ON_ERROR = CONTINUE option ensures the process continues even if some rows fail to load.
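Assuming the stage and file format names used earlier, the load command would look like:

```sql
COPY INTO STUDENTS2
  FROM @my_azure_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  -- Skip rows that fail to parse rather than aborting the load
  ON_ERROR = CONTINUE;
```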


Step 8: Verify the Loaded Data

Query the table to confirm the data was loaded successfully.
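For example:

```sql
-- Spot-check the loaded rows
SELECT * FROM STUDENTS2 LIMIT 10;

-- Confirm the row count matches the source file
SELECT COUNT(*) FROM STUDENTS2;
```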


Conclusion

You have now loaded a CSV file from Azure Blob Storage into a Snowflake table. The process involves setting up a database, connecting Snowflake to Azure via a storage integration, defining a file format, staging the data, and loading it into a table. If you need to load data from your local machine instead, check out our guide on connecting Snowflake with SnowSQL. For loading JSON or semi-structured data from AWS S3, see our article on loading semi-structured data into Snowflake. If you encounter issues, verify the Azure URL, file format, and column mappings. For large datasets, consider Snowflake’s parallel loading capabilities to optimize performance.

Frequently Asked Questions

Q: How do I load data from Azure to Snowflake?

To load data from Azure to Snowflake, create an Azure storage integration in Snowflake, set up an external stage pointing to your Azure Blob Storage, define a file format, and use the COPY INTO command to load the data into your Snowflake tables.

Q: What Azure storage types does Snowflake support?

Snowflake supports loading data from Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2. You can configure storage integrations to securely connect Snowflake to these Azure storage services.

Q: Do I need a storage integration for Azure to Snowflake data loading?

While you can use SAS tokens directly, creating a storage integration is the recommended approach. It provides a more secure and manageable way to connect Snowflake to Azure storage, avoiding the need to embed credentials in SQL statements.

Chakri, Intern

Chakri is an Intern at CelestInfo with hands-on experience across AWS, Azure, GCP, and Snowflake cloud infrastructure.
