🌐 Azure Data Factory: Copying Data from ADLS to MSSQL

🔑 Initial Setup

To copy data from Azure Data Lake Storage (ADLS) to Microsoft SQL Server (MSSQL) using Azure Data Factory (ADF), follow these structured steps.


🌐 Step 1: Creating Azure Data Factory (ADF) & Linking Services

🏗️ Creating an Azure Data Factory

  1. 🔍 Search for "Data Factory" in the Azure portal.
  2. ➕ Click "Create Data Factory".
  3. 📦 Select your subscription (Free Tier).
  4. 📂 Attach the resource group: rock2025.
  5. 🏷️ Provide a globally unique name, e.g., learnazure (only letters, numbers, and hyphens are allowed).
  6. ✅ Click "Review + Create" and proceed with deployment.
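
If you prefer to script this step instead of clicking through the portal, below is a minimal sketch using the azure-mgmt-datafactory Python SDK. It assumes you have installed azure-identity and azure-mgmt-datafactory, are signed in to Azure (e.g., via az login), and it uses a placeholder subscription ID and the Central India region.

    # Hedged sketch: create the Data Factory programmatically instead of via the portal.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import Factory

    subscription_id = "<your-subscription-id>"   # placeholder
    resource_group = "rock2025"
    factory_name = "learnazure"

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

    # Region is assumed to be Central India to match the rest of this walkthrough.
    factory = adf_client.factories.create_or_update(
        resource_group, factory_name, Factory(location="centralindia")
    )
    print(factory.provisioning_state)   # "Succeeded" once the deployment completes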

🔗 Linking Azure Data Lake Storage to Data Factory

  1. Navigate to Azure Data Factory → Launch ADF Studio.
  2. In the Manage section:
    • Click Linked Services → + New.
    • Select Azure Data Lake Storage Gen2.
    • Provide a name, e.g., LS_FEB13.
    • Choose the subscription (Free Tier).
    • Select the Storage Account (e.g., divyabucket02).
  3. ✅ Click Create.
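
The same linked service can be created from Python. A hedged sketch, assuming AzureBlobFSLinkedService (the ADLS Gen2 type in the SDK), with the storage account URL for divyabucket02 and a placeholder account key:

    # Hedged sketch: register the ADLS Gen2 linked service via the Python SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import LinkedServiceResource, AzureBlobFSLinkedService

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    ls_adls = LinkedServiceResource(
        properties=AzureBlobFSLinkedService(
            url="https://divyabucket02.dfs.core.windows.net",   # ADLS Gen2 endpoint of the storage account
            account_key="<storage-account-key>",                # placeholder; key auth keeps the example simple
        )
    )
    adf_client.linked_services.create_or_update("rock2025", "learnazure", "LS_FEB13", ls_adls)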

🏛️ Step 2: Creating an SQL Database

🗄️ Creating a Database in Azure

  1. 🔍 Search for SQL Database in Azure.
  2. ➕ Click + Create Database.
  3. Configure the following settings:
    • Subscription: Free Tier
    • Resource Group: divyalearn_az
    • Server Name: divyamysqldb
    • Database Name: mysql13febdb
    • Region: Asia Pacific (Central India)
    • Authentication Method: SQL Authentication
    • Server Admin Login: divya
    • Password: (Set your own password)
  4. ✅ Click Review + Create.
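
If you'd rather create the server and database from code, here is a hedged sketch with the azure-mgmt-sql Python SDK (pip install azure-identity azure-mgmt-sql). The subscription ID and password are placeholders, and the server name divyamysqldb matches the hostname used later in the connection string:

    # Hedged sketch: create the logical SQL server and the database programmatically.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.sql import SqlManagementClient
    from azure.mgmt.sql.models import Server, Database

    sql_client = SqlManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    # Logical server: becomes divyamysqldb.database.windows.net
    sql_client.servers.begin_create_or_update(
        "divyalearn_az", "divyamysqldb",
        Server(location="centralindia",
               administrator_login="divya",
               administrator_login_password="<your-password>"),
    ).result()

    # Database hosted on that server
    sql_client.databases.begin_create_or_update(
        "divyalearn_az", "divyamysqldb", "mysql13febdb",
        Database(location="centralindia"),
    ).result()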

While creating the database, you’ll also be asked to pick a Backup Storage Redundancy type. Here’s a quick breakdown of the options to help you choose:

1. Locally-Redundant Storage (LRS)

  • What it is: Data is replicated 3 times within a single datacenter in the same region.
  • Use case:
    • Low-cost option.
    • Suitable for non-critical data where data loss in case of a datacenter failure is acceptable.
  • Limitation: Not protected against datacenter-level failures.

2. Zone-Redundant Storage (ZRS)

  • What it is: Data is replicated across multiple availability zones (data centers) in the same region.
  • Use case:
    • Higher availability within the same region.
    • Suitable for applications needing high availability but not requiring region-level protection.
  • Limitation: Doesn’t protect against entire region outages.

3. Geo-Redundant Storage (GRS)

  • What it is: Data is replicated across regions (primary + secondary region, hundreds of miles apart).
  • Use case:
    • Suitable for critical applications where data recovery after regional disasters is essential.
    • Provides disaster recovery capabilities.
  • Limitation: More expensive than LRS and ZRS.

4. Geo-Zone-Redundant Storage (GZRS) [Preview]

  • What it is: Combines ZRS and GRS by replicating data across availability zones and geographically to a secondary region.
  • Use case:
    • Highest durability and availability, covering datacenter, zone, and regional failures.
    • Ideal for mission-critical workloads.
  • Limitation: Likely the most expensive option.

Which should you choose as a student?

If you’re learning or working on non-production projects:

  • Choose: 🔹 Locally-Redundant Storage (LRS) — It’s cost-effective and sufficient for learning purposes.
  • If you want to explore high availability concepts:
    • Zone-Redundant Storage (ZRS) could be a next step.

🔗 Connecting SQL Workbench to the Database (from Your Local System)

  1. Copy Connection String from Azure SQL Database.
  2. Open SQL Workbench → File → New Connection.
  3. Paste the JDBC connection string:
    jdbc:sqlserver://divyamysqldb.database.windows.net:1433;
    database=mysql13febdb;
    user=divya@divyamysqldb;
    password={your_password_here};
    encrypt=true;
    trustServerCertificate=false;
    
  4. Click Connect.
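
If you want a quick sanity check from code instead of SQL Workbench, a small pyodbc sketch works too (pip install pyodbc; it assumes the Microsoft ODBC Driver 18 for SQL Server is installed locally):

    # Hedged sketch: verify connectivity to the Azure SQL database with pyodbc.
    import pyodbc

    conn_str = (
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=divyamysqldb.database.windows.net,1433;"
        "DATABASE=mysql13febdb;"
        "UID=divya;"
        "PWD=<your_password_here>;"          # placeholder
        "Encrypt=yes;TrustServerCertificate=no;"
    )

    with pyodbc.connect(conn_str, timeout=30) as conn:
        row = conn.cursor().execute("SELECT @@VERSION;").fetchone()
        print(row[0])   # prints the SQL Server version string if the connection works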

🔧 Troubleshooting SQL Connection Issues

  • If unable to connect:
    1. Navigate to SQL Database → Networking → Firewall Rules.
    2. Add a new rule:
      • Rule Name: AllAccess
      • Start IP: 0.0.0.0
      • End IP: 255.255.255.255 (allows access from any IP address; not recommended for production).
    3. ✅ Save and retry connection.
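
The firewall rule can also be added with the azure-mgmt-sql SDK; a hedged sketch (opening 0.0.0.0 to 255.255.255.255 admits every IP, so treat it as a learning-only shortcut and prefer adding just your own client IP):

    # Hedged sketch: add a server-level firewall rule programmatically.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.sql import SqlManagementClient
    from azure.mgmt.sql.models import FirewallRule

    sql_client = SqlManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    # WARNING: this range admits every IP address -- acceptable for a lab, never for production.
    sql_client.firewall_rules.create_or_update(
        "divyalearn_az", "divyamysqldb", "AllAccess",
        FirewallRule(start_ip_address="0.0.0.0", end_ip_address="255.255.255.255"),
    )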

📤 Step 3: Copying Data from ADLS to SQL Database

📑 Preparing the Table in SQL Workbench

  1. Run the following SQL command:
    CREATE TABLE asltab (
        name VARCHAR(32),
        age INT,
        city VARCHAR(32)
    );
    
  2. Verify table creation:
    SELECT * FROM asltab;
    

🔗 Creating & Configuring Linked Services in ADF

  1. Navigate to Azure Data Factory (ADF) → Manage.
  2. Click Linked Services → + New.
  3. Select SQL Server and configure:
    • Name: LS_SqlServer1
    • Server Name: divyamysqldb.database.windows.net
    • Database: mysql13febdb
    • Authentication Type: SQL Authentication
    • Username & Password: (Same as SQL Workbench connection)
  4. ✅ Click Create.
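
The equivalent linked service in Python might look like the sketch below. It uses AzureSqlDatabaseLinkedService (the portal's "SQL Server" connector also works against Azure SQL, so treat this as one possible mapping) with a placeholder password:

    # Hedged sketch: register the SQL linked service via the Python SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        LinkedServiceResource, AzureSqlDatabaseLinkedService, SecureString,
    )

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    sql_conn = (
        "Server=tcp:divyamysqldb.database.windows.net,1433;"
        "Database=mysql13febdb;User ID=divya;Password=<your-password>;"
        "Encrypt=True;Connection Timeout=30;"
    )
    ls_sql = LinkedServiceResource(
        properties=AzureSqlDatabaseLinkedService(connection_string=SecureString(value=sql_conn))
    )
    adf_client.linked_services.create_or_update("rock2025", "learnazure", "LS_SqlServer1", ls_sql)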

📂 Creating a Dataset in ADF

  1. Navigate to ADF → Author → Datasets → + New.
  2. Select Azure Data Lake Storage Gen2.
  3. Set the properties:
    • Format: Delimited Text
    • Name: ds_asldata
    • Linked Service: LS_FEB13
    • File Path: input/asldata
  4. ✅ Click OK.
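
Here is a hedged sketch of the same dataset defined in Python. It assumes "input" is the ADLS Gen2 file system (container) and "asldata" the folder, mirroring the File Path above, and that the file is a comma-delimited CSV with a header row:

    # Hedged sketch: define the delimited-text dataset via the Python SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        DatasetResource, DelimitedTextDataset, AzureBlobFSLocation, LinkedServiceReference,
    )

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    ds_asldata = DatasetResource(
        properties=DelimitedTextDataset(
            linked_service_name=LinkedServiceReference(
                type="LinkedServiceReference", reference_name="LS_FEB13"
            ),
            location=AzureBlobFSLocation(file_system="input", folder_path="asldata"),
            column_delimiter=",",
            first_row_as_header=True,   # assumption: the source CSV carries a header row
        )
    )
    adf_client.datasets.create_or_update("rock2025", "learnazure", "ds_asldata", ds_asldata)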

🏗️ Creating a Pipeline in ADF

  1. Navigate to ADF → Author → Pipelines → + New.
  2. Drag & drop Copy Activity.
  3. Configure Source:
    • Dataset: ds_asldata
  4. Configure Sink:
    • Dataset: (Create New) → SQL Server
    • Linked Service: LS_SqlServer1
    • Table Name: asltab
  5. ✅ Save and Publish the pipeline.
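
For reference, a hedged sketch of the same pipeline in Python. The pipeline name pl_copy_adls_to_sql and the sink dataset name ds_sqlsink are hypothetical (the sink dataset would point at the asltab table through LS_SqlServer1):

    # Hedged sketch: a pipeline with a single Copy activity from ADLS to Azure SQL.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineResource, CopyActivity, DatasetReference,
        DelimitedTextSource, AzureSqlSink,
    )

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    copy_activity = CopyActivity(
        name="CopyAdlsToSql",
        inputs=[DatasetReference(type="DatasetReference", reference_name="ds_asldata")],
        outputs=[DatasetReference(type="DatasetReference", reference_name="ds_sqlsink")],  # hypothetical sink dataset
        source=DelimitedTextSource(),
        sink=AzureSqlSink(),
    )
    adf_client.pipelines.create_or_update(
        "rock2025", "learnazure", "pl_copy_adls_to_sql",   # hypothetical pipeline name
        PipelineResource(activities=[copy_activity]),
    )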

⏩ Executing the Pipeline

  1. Click Debug to run the pipeline.
  2. In SQL Workbench on your local system, run SELECT * FROM asltab; again. Before the pipeline run, the table returned only column headers with no rows; now it returns all the data from the source file, which confirms the pipeline works. (You can also trigger and monitor the run from code; see the sketch below.)
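
Outside of Debug mode, the pipeline can also be triggered and monitored from Python; a hedged sketch reusing the hypothetical pipeline name from the earlier sketch:

    # Hedged sketch: trigger a published pipeline run and poll its status.
    import time

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<your-subscription-id>")

    run = adf_client.pipelines.create_run("rock2025", "learnazure", "pl_copy_adls_to_sql")

    status = "InProgress"
    while status in ("Queued", "InProgress"):
        time.sleep(15)   # poll every 15 seconds
        status = adf_client.pipeline_runs.get("rock2025", "learnazure", run.run_id).status

    print("Pipeline run finished with status:", status)   # expect "Succeeded"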

🎯 Conclusion

Following these steps, you've successfully copied data from Azure Data Lake Storage (ADLS) to SQL Server using Azure Data Factory (ADF). 🚀

Stay connected! 😊
