Azure Data Factory: Copying Data from ADLS to MSSQL
Key Vault & Initial Setup
To copy data from Azure Data Lake Storage (ADLS) to Microsoft SQL Server (MSSQL) using Azure Data Factory (ADF), follow these structured steps.
Step 1: Creating Azure Data Factory (ADF) & Linking Services
Creating an Azure Data Factory
- Search for "Data Factory" in the Azure portal.
- ➕ Click "Create Data Factory".
- Select the Free Tier.
- Attach the resource group: rock2025.
- Provide a name, e.g., learnazure (lowercase, no special characters or numbers).
- ✅ Click "Review + Create" and proceed with deployment.
Linking Azure Data Lake Storage to Data Factory
- Navigate to Azure Data Factory → Launch ADF Studio.
- In the Manage section:
- Click Linked Services → + New Linked Service.
- Select Azure Data Lake Storage Gen2.
- Provide a name, e.g., LS_FEB13.
- Choose the subscription (Free Tier).
- Select the Storage Account (e.g., divyabucket02).
- ✅ Click Create.
Step 2: Creating an SQL Database
Creating a Database in Azure
- Search for SQL Database in Azure.
- ➕ Click + Create Database.
- Configure the following settings:
- Subscription: Free Tier
- Resource Group: divyalearn_az
- Database Name: mysql13febdb (created on the server divyamysqldb)
- Region: Asia Pacific (Central India)
- Authentication Method: SQL Authentication
- Server Admin Login: divya
- Password: (Set your own password)
- ✅ Click Review + Create.
Here’s a quick breakdown of the options to help you choose the right Azure Backup Storage Redundancy type:
1. Locally-Redundant Storage (LRS)
- What it is: Data is replicated 3 times within a single datacenter in the same region.
- Use case:
- Low-cost option.
- Suitable for non-critical data where data loss in case of a datacenter failure is acceptable.
- Limitation: Not protected against datacenter-level failures.
2. Zone-Redundant Storage (ZRS)
- What it is: Data is replicated across multiple availability zones (data centers) in the same region.
- Use case:
- Higher availability within the same region.
- Suitable for applications needing high availability but not requiring region-level protection.
- Limitation: Doesn’t protect against entire region outages.
3. Geo-Redundant Storage (GRS)
- What it is: Data is replicated across regions (primary + secondary region, hundreds of miles apart).
- Use case:
- Suitable for critical applications where data recovery after regional disasters is essential.
- Provides disaster recovery capabilities.
- Limitation: More expensive than LRS and ZRS.
4. Geo-Zone-Redundant Storage (GZRS) [Preview]
- What it is: Combines ZRS and GRS by replicating data across availability zones and geographically to a secondary region.
- Use case:
- Highest durability and availability, covering datacenter, zone, and regional failures.
- Ideal for mission-critical workloads.
- Limitation: Likely the most expensive option.
✅ Which should you choose as a student?
If you’re learning or working on non-production projects:
- Choose: Locally-Redundant Storage (LRS), which is cost-effective and sufficient for learning purposes.
- If you want to explore high-availability concepts, Zone-Redundant Storage (ZRS) is a natural next step.
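The decision logic above can be sketched as a small helper. This is just an illustration of the trade-offs described, not an Azure API: `choose_redundancy` and its parameter names are hypothetical.

```python
def choose_redundancy(needs_zone_ha: bool, needs_region_dr: bool) -> str:
    """Map availability requirements to an Azure storage redundancy type.

    Encodes the trade-offs described above:
      - LRS:  cheapest; replicated within a single datacenter only
      - ZRS:  survives a zone (datacenter) failure in the same region
      - GRS:  survives a regional outage via a secondary region
      - GZRS: survives both zone-level and regional failures
    """
    if needs_zone_ha and needs_region_dr:
        return "GZRS"
    if needs_region_dr:
        return "GRS"
    if needs_zone_ha:
        return "ZRS"
    return "LRS"  # cost-effective default for learning / non-production


# A student project with no availability requirements lands on LRS.
print(choose_redundancy(needs_zone_ha=False, needs_region_dr=False))  # LRS
```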
Connecting SQL Workbench to the Database (locally, on your own system)
- Copy Connection String from Azure SQL Database.
- Open SQL Workbench → File → New Connection.
- Paste the JDBC connection string:
```
jdbc:sqlserver://divyamysqldb.database.windows.net:1433;database=mysql13febdb;user=divya@divyamysqldb;password={your_password_here};encrypt=true;trustServerCertificate=false;
```
- Click Connect.
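If you find yourself reassembling this URL often (or swapping credentials), it can help to build it from its parts. A minimal sketch; the helper name `build_jdbc_url` is made up for illustration:

```python
def build_jdbc_url(server: str, port: int, database: str,
                   user: str, password: str) -> str:
    """Assemble a SQL Server JDBC URL like the one pasted into SQL Workbench."""
    props = {
        "database": database,
        "user": user,
        "password": password,
        "encrypt": "true",                  # require TLS to Azure SQL
        "trustServerCertificate": "false",  # validate the server certificate
    }
    prop_str = ";".join(f"{k}={v}" for k, v in props.items())
    return f"jdbc:sqlserver://{server}:{port};{prop_str};"


url = build_jdbc_url("divyamysqldb.database.windows.net", 1433,
                     "mysql13febdb", "divya@divyamysqldb", "{your_password_here}")
print(url)
```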
Troubleshooting SQL Connection Issues
- If unable to connect:
- Navigate to SQL Database → Networking → Firewall Rules.
- Add a new rule:
- Rule Name: AllAccess
- Start IP: 0.0.0.0
- End IP: 255.255.255.255 (allows access from any address; fine for a quick test, but not recommended for production. In production, restrict the range to your own public IP.)
- ✅ Save and retry connection.
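A firewall rule is just an inclusive IP range, and the matching logic is easy to reason about locally. A small sketch using Python's standard `ipaddress` module (the function name is made up for illustration):

```python
import ipaddress


def rule_matches(client_ip: str, start_ip: str, end_ip: str) -> bool:
    """Return True if client_ip falls within the rule's inclusive start-end range."""
    client = ipaddress.IPv4Address(client_ip)
    return ipaddress.IPv4Address(start_ip) <= client <= ipaddress.IPv4Address(end_ip)


# A narrow rule covering only your own network is the production-safe option.
print(rule_matches("203.0.113.7", "203.0.113.0", "203.0.113.255"))  # True
# A rule whose start and end are both 0.0.0.0 matches only that literal address.
print(rule_matches("203.0.113.7", "0.0.0.0", "0.0.0.0"))            # False
```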
Step 3: Copying Data from ADLS to SQL Database
Preparing the Table in SQL Workbench
- Run the following SQL command:
```sql
CREATE TABLE asltab (
    name VARCHAR(32),
    age  INT,
    city VARCHAR(32)
);
```
- Verify table creation:
```sql
SELECT * FROM asltab;
```
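If you want to sanity-check the table shape before wiring up ADF, you can exercise the same DDL locally. This sketch uses Python's built-in sqlite3 as a stand-in for SQL Server (the sample row is invented for illustration):

```python
import sqlite3

# In-memory SQLite stands in for the Azure SQL database here; the DDL has
# the same shape as the asltab table created in SQL Workbench.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asltab ( name VARCHAR(32), age INT, city VARCHAR(32) );")
conn.execute("INSERT INTO asltab VALUES (?, ?, ?)", ("asha", 30, "Pune"))

rows = conn.execute("SELECT * FROM asltab;").fetchall()
print(rows)  # [('asha', 30, 'Pune')]
conn.close()
```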
Creating & Configuring Linked Services in ADF
- Navigate to Azure Data Factory (ADF) → Manage.
- Click Linked Services → + New.
- Select SQL Server and configure:
- Name: LS_SqlServer1
- Server Name: divyamysqldb.database.windows.net
- Database: mysql13febdb
- Authentication Type: SQL Authentication
- Username & Password: (Same as SQL Workbench connection)
- ✅ Click Create.
Creating a Dataset in ADF
- Navigate to ADF → Author → Datasets → + New.
- Select Azure Data Lake Storage Gen2.
- Set properties:
- Format: Delimited Text
- Name: ds_asldata
- Linked Service: LS_FEB13
- File Path:
input/asldata
- ✅ Click OK.
Creating a Pipeline in ADF
- Navigate to ADF → Author → Pipelines → + New.
- Drag & drop Copy Activity.
- Configure Source:
- Dataset: ds_asldata
- Configure Sink:
- Dataset: (Create New) → SQL Server
- Linked Service: LS_SqlServer1
- Table Name: asltab
- ✅ Save and Publish the pipeline.
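Behind the Studio UI, ADF stores each published pipeline as JSON. The sketch below builds a rough approximation of what the Copy activity above might look like; the pipeline name `pl_copy_adls_to_sql` and the sink dataset name `ds_asltab` are assumptions, and the exact property schema should be checked against what ADF Studio actually generates.

```python
import json

# Hypothetical sketch of a Copy activity definition; the source dataset name
# matches this walkthrough, the rest is illustrative and unverified.
pipeline = {
    "name": "pl_copy_adls_to_sql",
    "properties": {
        "activities": [
            {
                "name": "CopyAslData",
                "type": "Copy",
                "inputs": [{"referenceName": "ds_asldata", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "ds_asltab", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "AzureSqlSink"},
                },
            }
        ]
    },
}
print(json.dumps(pipeline, indent=2))
```

Seeing the JSON makes it clear why the dataset and linked-service names chosen earlier matter: the activity references them rather than embedding connection details.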
⏩ Executing the Pipeline
- Click Debug to run the pipeline.
- In SQL Workbench on your local system, run SELECT * FROM asltab; again. Before the pipeline ran, the table had only its column definitions and no rows; now it returns all the copied data, confirming that the pipeline works.
Conclusion
Following these steps, you've successfully copied data from Azure Data Lake Storage (ADLS) to SQL Server using Azure Data Factory (ADF).
Stay connected!