Getting Started with Azure: Creating an Account, Subscription, Resource Group, Storage Account, and Linked Services
Azure is a powerful cloud platform that provides a range of services for managing, storing, and processing data. Before utilizing Azure services, you need to follow four essential steps:
- Create an Azure Account
- Set Up a Subscription
- Create a Resource Group
- Use Resources (e.g., a Storage Account)
Step 1: Creating a Resource Group in Azure
A Resource Group in Azure is a logical container that holds related Azure resources, allowing for easy management, organization, and access control. Any resource in Azure must belong to a resource group.
How to Create a Resource Group
- Search for "Resource Group" in the Azure search bar.
- Click "Create Resource Group" and provide a name.
- Click "Create" to finalize the resource group.
Alternative Method: While creating a Storage Account, you can create a new Resource Group:
- Click "Create New" during storage account setup.
- Enter a name, e.g., `divyalearn_az`.
- Click "Create".
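The same resource group can also be created from the command line. A minimal sketch, assuming the Azure CLI (`az`) is installed and you are signed in; the name and region simply mirror the examples above:

```shell
# Sign in to Azure (opens a browser prompt)
az login

# Create the resource group used throughout this guide
az group create \
  --name divyalearn_az \
  --location eastus
```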
Step 2: Creating a Storage Account in Azure
A Storage Account in Azure allows you to store data in various formats, such as Blob Storage, Table Storage, File Shares, and Queues.
How to Create a Storage Account
- Search for "Storage Account" in the Azure search bar.
- Click "Create Storage Account".
- Attach the existing resource group (`divyalearn_az`).
- Fill in the Instance Details:
  - Storage Account Name: `divyabucket02` (must be globally unique; lowercase letters and numbers only).
  - Region: Choose any location, e.g., East US.
  - Primary Services:
    - Azure Blob Storage (for unstructured data backup, such as WhatsApp media).
    - Azure Data Lake Storage (for big data and data science analysis; supports structured and semi-structured data).
- Performance Options:
  - Standard (default for general use).
  - Premium (recommended for production workloads).
- Redundancy:
  - Locally Redundant Storage (LRS): data is replicated within a single datacenter (low cost, good for practice).
  - Geo-Redundant Storage (GRS): data is also replicated to a secondary region (recommended for production; reduces data-loss risk but costs more).
- Enable Hierarchical Namespace:
  - If enabled, the account becomes Data Lake Storage Gen2 (the common choice for enterprise analytics workloads).
  - If disabled, it remains Blob Storage (suitable for general-purpose file storage).
- Click "Review + Create" and finalize the setup.
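The portal steps above map to a single CLI command. A sketch, assuming the resource group from Step 1 already exists:

```shell
# Create the storage account with hierarchical namespace enabled,
# which makes it Data Lake Storage Gen2 rather than plain Blob Storage.
# --sku Standard_LRS = locally redundant; use Standard_GRS for geo-redundancy.
az storage account create \
  --name divyabucket02 \
  --resource-group divyalearn_az \
  --location eastus \
  --kind StorageV2 \
  --sku Standard_LRS \
  --hns true
```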
Working with Containers
Once the Storage Account is created:
- Navigate to Storage Account → Data Storage → Containers → Click + Container.
- Provide a container name, e.g., `input` or `output` (similar to folders in AWS S3 buckets).
- Click "Create".
- Upload data by dragging and dropping files into the container (e.g., `1000record.csv` and `asl.csv`).
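Containers and uploads can likewise be scripted. A sketch using Azure AD login rather than account keys; the file path is the example CSV mentioned above:

```shell
# Create the "input" container in the storage account
az storage container create \
  --account-name divyabucket02 \
  --name input \
  --auth-mode login

# Upload a local CSV into the container
az storage blob upload \
  --account-name divyabucket02 \
  --container-name input \
  --name 1000record.csv \
  --file ./1000record.csv \
  --auth-mode login
```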
Step 3: Processing Data Using Azure Data Factory
Creating an Azure Data Factory
- Search for "Data Factory" in Azure.
- Click "Create Data Factory".
- Select your subscription (e.g., Free Trial).
- Attach the resource group (`divyalearn_az`).
- Provide a name, e.g., `learnazure` (must be globally unique; lowercase, no special characters).
- Click "Review + Create".
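For a scripted setup, Data Factory commands live in a CLI extension. A sketch reusing the names above:

```shell
# Data Factory commands require the "datafactory" CLI extension
az extension add --name datafactory

# Create the factory in the existing resource group
az datafactory create \
  --resource-group divyalearn_az \
  --name learnazure \
  --location eastus
```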
Linking Azure Data Lake Storage to Data Factory
- Navigate to Azure Data Factory → Launch Studio.
- In the Manage section (fourth option on the left panel):
  - Click Linked Services → + New Linked Service.
  - Select Azure Data Lake Storage Gen2.
  - Provide a name, e.g., `myfirstlinkservice` (a company-standard name might look like `Ls_connect_adl`).
  - Choose the subscription (e.g., Free Trial).
  - Select the Storage Account (`divyabucket02`).
  - Click "Create".
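The same linked service can be defined via the CLI with a JSON properties payload. A sketch; `AzureBlobFS` is the linked-service type for ADLS Gen2, and the account key is a placeholder (in practice, prefer a managed identity or a Key Vault reference):

```shell
# Link the ADLS Gen2 account to the factory
az datafactory linked-service create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --linked-service-name Ls_connect_adl \
  --properties '{
    "type": "AzureBlobFS",
    "typeProperties": {
      "url": "https://divyabucket02.dfs.core.windows.net",
      "accountKey": "<storage-account-key>"
    }
  }'
```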
Creating a Dataset in Data Factory
- Navigate to Author (pencil icon on the left panel).
- Click "Dataset" → + New Dataset.
- Set the properties:
  - Name: `Ds_adharcardata`.
  - Linked Service: `Ls_connect_adl`.
  - File Path: select `input/adharcard.csv`.
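The equivalent dataset definition via the CLI is a JSON payload as well. A sketch for a delimited-text (CSV) dataset; the delimiter and header options are assumptions about the file:

```shell
# Define a CSV dataset pointing at input/adharcard.csv
# through the linked service created earlier
az datafactory dataset create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --dataset-name Ds_adharcardata \
  --properties '{
    "type": "DelimitedText",
    "linkedServiceName": {
      "referenceName": "Ls_connect_adl",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "input",
        "fileName": "adharcard.csv"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }'
```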
Creating a Data Flow to Process Data
- In Author, click "Data Flow" → + New Data Flow.
- Set properties:
  - Name: `ds_processadhar`.
- Click Add Source → Arrow → Add Source.
- Configure Source Settings:
  - Output Stream Name: `input_source1`.
  - Source Type: Dataset.
  - Dataset: `Ds_adharcardata`.
- Enable Dataflow Debug (to preview data; the debug session stays active for one hour).
- Click + Add Filter Condition to process specific data:
- Select Filter On.
- Click Expression Builder.
- Add the condition: `age > 35`.
- Click + Add Sink to store processed data.
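Under the hood, the visual data flow compiles to ADF's Data Flow Script. A minimal sketch of the source → filter → sink chain built above (the `FilterAdults` and `output_sink1` stream names are illustrative, not from the portal steps):

```
source(output(age as integer), allowSchemaDrift: true) ~> input_source1
input_source1 filter(age > 35) ~> FilterAdults
FilterAdults sink(allowSchemaDrift: true) ~> output_sink1
```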
Conclusion
By following these steps, you have successfully:
- Created an Azure Resource Group.
- Set up an Azure Storage Account with containers.
- Created an Azure Data Factory and linked it to Data Lake Storage.
- Processed a dataset (`adharcard.csv`) using a Data Flow with a filter (`age > 35`).
- Stored the processed data using a Sink in Data Factory.
These fundamental steps help you manage and process data efficiently using Azure cloud services.






