🌐 Azure Data Factory Pipeline: Copy SQL Server Tables to Azure Data Lake Gen2
This blog provides a step-by-step guide to creating an Azure Data Factory (ADF) pipeline that lists all tables in a SQL Server database and copies them one by one into Azure Data Lake Storage Gen2.
Step 1: Create Pipeline with Lookup Activity
- Add Lookup Activity
  - In the pipeline canvas, drag and drop the Lookup activity.
- Source Dataset
  - Create a new dataset:
    - Name: sqlserval
    - Linked service: lssql (SQL Server linked service)
- Query Settings
  - Use the following query:
    SELECT TABLE_NAME FROM information_schema.tables WHERE table_type = 'BASE TABLE';
  - Uncheck the "First row only" option.
- Debug
  - Click Debug to ensure all SQL Server tables are listed in the output.
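For reference, here is a sketch of how this Lookup activity might appear in the pipeline's JSON definition; the activity name lookup1 matches the expression used in later steps, and "firstRowOnly": false corresponds to unchecking "First row only":

{
  "name": "lookup1",
  "type": "Lookup",
  "typeProperties": {
    "source": {
      "type": "SqlServerSource",
      "sqlReaderQuery": "SELECT TABLE_NAME FROM information_schema.tables WHERE table_type = 'BASE TABLE'"
    },
    "dataset": {
      "referenceName": "sqlserval",
      "type": "DatasetReference"
    },
    "firstRowOnly": false
  }
}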
Step 2: ForEach Activity for Iteration
- Add ForEach Activity
  - Drag the ForEach activity to the canvas.
  - Connect it to the Lookup activity.
- Settings Tab
  - Sequential: enable this option.
  - Items: add the dynamic expression:
    @activity('lookup1').output.value
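In JSON terms, the ForEach configuration looks roughly like the sketch below (the activity name ForEach1 is illustrative); isSequential maps to the Sequential checkbox, and the Copy activity from Step 3 goes inside the activities array:

{
  "name": "ForEach1",
  "type": "ForEach",
  "dependsOn": [
    {
      "activity": "lookup1",
      "dependencyConditions": [ "Succeeded" ]
    }
  ],
  "typeProperties": {
    "isSequential": true,
    "items": {
      "value": "@activity('lookup1').output.value",
      "type": "Expression"
    },
    "activities": []
  }
}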
Step 3: Inside ForEach - Copy Activity
- Add Copy Activity
  - Inside the ForEach activity, drag and drop the Copy activity.
- Source Settings
  - Dataset: create a new SQL Server dataset.
    - Name: anyname_ds
    - Linked service: ls_ddd
    - Parameters: create a parameter p.
    - Table value: pass @item().TABLE_NAME to the parameter p using dynamic content, so each iteration reads the current table (see the dataset sketch below).
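The parameterized source dataset might be defined along these lines; the table name consumes the parameter p (newer dataset versions may split this into separate schema and table properties):

{
  "name": "anyname_ds",
  "properties": {
    "linkedServiceName": {
      "referenceName": "ls_ddd",
      "type": "LinkedServiceReference"
    },
    "parameters": {
      "p": { "type": "string" }
    },
    "type": "SqlServerTable",
    "typeProperties": {
      "tableName": {
        "value": "@dataset().p",
        "type": "Expression"
      }
    }
  }
}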
- Sink Settings
  - Dataset: create a new Azure Data Lake Gen2 Delimited Text dataset.
    - Name: sink
    - Linked service: link to the Gen2 storage account.
    - Parameters: create a parameter p, as on the source dataset.
    - File path: out/folder/@dataset().p, calling the dataset parameter @dataset().p in the file name.
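A sketch of the sink dataset JSON, assuming the Gen2 linked service is named ls_gen2 and a .csv extension is appended (both are assumptions, not from the steps above); fileSystem and folderPath follow the out/folder path:

{
  "name": "sink",
  "properties": {
    "linkedServiceName": {
      "referenceName": "ls_gen2",
      "type": "LinkedServiceReference"
    },
    "parameters": {
      "p": { "type": "string" }
    },
    "type": "DelimitedText",
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "out",
        "folderPath": "folder",
        "fileName": {
          "value": "@concat(dataset().p, '.csv')",
          "type": "Expression"
        }
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }
}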
Step 4: Create and Assign Variables
- Create Variable
  - In the Variables tab, create a variable to store the table name.
- Set Variable
  - Use a Set Variable activity if needed to capture @item().TABLE_NAME during each iteration.
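If you add the optional Set Variable activity inside the ForEach, its JSON would look roughly like this (the activity name SetTableName and variable name tableName are illustrative):

{
  "name": "SetTableName",
  "type": "SetVariable",
  "typeProperties": {
    "variableName": "tableName",
    "value": {
      "value": "@item().TABLE_NAME",
      "type": "Expression"
    }
  }
}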
Step 5: Execute the Pipeline
- Trigger the pipeline.
- The pipeline will iterate through each table in SQL Server and copy the data into the specified Azure Data Lake Gen2 folder.
Key Dynamic Content Used
- To get table names from Lookup:
@activity('lookup1').output.value
- Passing table names dynamically to the dataset:
@item().TABLE_NAME
- Dynamic file path in Sink:
@dataset().p
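Putting these together, the Copy activity inside the ForEach wires the expressions up roughly as follows: @item().TABLE_NAME is passed into the dataset parameter p on both the source and the sink, so @dataset().p resolves to the current table name in each iteration (the activity name CopyTable is illustrative):

{
  "name": "CopyTable",
  "type": "Copy",
  "inputs": [
    {
      "referenceName": "anyname_ds",
      "type": "DatasetReference",
      "parameters": { "p": "@item().TABLE_NAME" }
    }
  ],
  "outputs": [
    {
      "referenceName": "sink",
      "type": "DatasetReference",
      "parameters": { "p": "@item().TABLE_NAME" }
    }
  ],
  "typeProperties": {
    "source": { "type": "SqlServerSource" },
    "sink": {
      "type": "DelimitedTextSink",
      "storeSettings": { "type": "AzureBlobFSWriteSettings" },
      "formatSettings": { "type": "DelimitedTextWriteSettings" }
    }
  }
}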
Conclusion
Following these steps, you will have a fully functional ADF pipeline that dynamically reads all tables from a SQL Server database and copies them to Azure Data Lake Gen2 one by one. Adjust file paths and dataset parameters based on your project requirements. Happy data engineering!