🚀 Getting Started with Azure: Creating an Account, Subscription, Resource Group, Storage Account, and Linked Services

Azure is a powerful cloud platform that provides a range of services for managing, storing, and processing data. Before utilizing Azure services, you need to follow four essential steps:

  1. 🆕 Create an Azure Account
  2. 📝 Set Up a Subscription
  3. 📦 Create a Resource Group
  4. 🗂️ Use Resources (e.g., Storage Account)

🛠️ Step 1: Creating a Resource Group in Azure

A Resource Group in Azure is a logical container that holds related Azure resources, allowing for easy management, organization, and access control. Any resource in Azure must belong to a resource group.

📌 How to Create a Resource Group?

  • 🔍 Search for "Resource Group" in the Azure search bar.
  • 🏢 Click "Create", then provide a name and select a region.
  • ✅ Click "Review + Create" to finalize the resource group.

Alternative Method: While creating a Storage Account, you can create a new Resource Group:

  • Click on "Create New" during storage account setup.
  • Enter a name, e.g., divyalearn_az.
  • Click "Create".



🛠️ Step 2: Creating a Storage Account in Azure

A Storage Account in Azure allows you to store data in various formats, such as Blob Storage, Table Storage, File Shares, and Queues.

📊 How to Create a Storage Account?

  1. 🔍 Search for "Storage Account" in the Azure search bar.
  2. 👉 Click "Create Storage Account".
  3. 📦 Select the existing resource group (divyalearn_az).
  4. 📝 Fill in the Instance Details:
    • Storage Account Name: divyabucket02 (must be globally unique, 3–24 characters, lowercase letters and numbers only).
    • 🌍 Region: Choose a location close to you, e.g., East US.
    • 🎨 Primary Services:
      • Azure Blob Storage (for unstructured data such as media files and backups)
      • Azure Data Lake Storage Gen2 (for big data and analytics work, as it supports structured and semi-structured data)
  5. 🌐 Performance Options:
    • Standard (default; fine for general use)
    • Premium (lower latency; for I/O-intensive production workloads)
  6. 🏙️ Redundancy:
    • Locally Redundant Storage (LRS): keeps three copies of your data within a single data center (lowest cost, good for practice).
    • Geo-Redundant Storage (GRS): also replicates your data to a secondary region (recommended for production; reduces data-loss risk but costs more).
  7. 🏰 Enable Hierarchical Namespace:
    • ✅ If enabled, the account becomes Azure Data Lake Storage Gen2 (the common choice for analytics workloads).
    • ❌ If disabled, it remains Blob Storage (suitable for general-purpose object storage).
  8. ✅ Click "Review + Create" and finalize the setup.
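
The same setup can be scripted with the azure-mgmt-storage package. A minimal sketch mirroring the portal choices above (Standard performance, LRS redundancy, hierarchical namespace enabled); the subscription ID and region are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

credential = DefaultAzureCredential()
client = StorageManagementClient(credential, SUBSCRIPTION_ID)

# Standard performance + LRS redundancy; is_hns_enabled=True turns the
# account into Data Lake Storage Gen2 (hierarchical namespace).
poller = client.storage_accounts.begin_create(
    "divyalearn_az",
    "divyabucket02",
    StorageAccountCreateParameters(
        location="eastus",
        kind="StorageV2",
        sku=Sku(name="Standard_LRS"),
        is_hns_enabled=True,
    ),
)
account = poller.result()  # block until provisioning finishes
print(account.name, account.provisioning_state)
```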

📂 Working with Containers

Once the Storage Account is created:

  • Navigate to Storage Account → Data Storage → Containers → click + Container.
  • Provide a container name, e.g., input or output (a container is roughly analogous to a bucket in AWS S3).
  • ✅ Click "Create".
  • 💾 Upload data by dragging and dropping files into the container (e.g., 1000record.csv and asl.csv).
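
Creating containers and uploading files can also be done from code with the azure-storage-blob package. A minimal sketch, assuming your signed-in identity has the Storage Blob Data Contributor role on the account and that 1000record.csv exists locally:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
service = BlobServiceClient(
    account_url="https://divyabucket02.blob.core.windows.net",
    credential=credential,
)

# Create the "input" container, then upload a CSV into it.
container = service.create_container("input")
with open("1000record.csv", "rb") as data:
    container.upload_blob(name="1000record.csv", data=data)
```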



🌌 Step 3: Processing Data Using Azure Data Factory

🛠️ Creating an Azure Data Factory

  1. 🔍 Search for "Data Factory" in Azure.
  2. 👉 Click "Create Data Factory".
  3. 📝 Select your subscription (e.g., the free-tier subscription).
  4. 📦 Attach the resource group (divyalearn_az).
  5. 🏷️ Provide a name, e.g., learnazure (must be globally unique; only letters, numbers, and hyphens are allowed).
  6. ✅ Click "Review + Create".
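
For completeness, the factory itself can also be created with the azure-mgmt-datafactory package; a minimal sketch with a placeholder subscription ID:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

credential = DefaultAzureCredential()
adf = DataFactoryManagementClient(credential, SUBSCRIPTION_ID)

factory = adf.factories.create_or_update(
    "divyalearn_az",              # resource group
    "learnazure",                 # factory name (globally unique)
    Factory(location="eastus"),
)
print(factory.name, factory.provisioning_state)
```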

👤 Linking Azure Data Lake Storage to Data Factory

  1. Navigate to Azure Data Factory → Launch Studio.
  2. In the Manage section (4th option on the left panel):
    • Click Linked Services → + New Linked Service.
    • Select Azure Data Lake Storage Gen2.
    • Provide a name, e.g., myfirstlinkservice (companies typically follow a naming convention such as Ls_connect_adl).
    • Choose the subscription (Free Tier).
    • Select the Storage Account (divyabucket02).
    • ✅ Click "Create".
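
The equivalent linked service can be created through the SDK as well, continuing with the adf client from the previous sketch. The account key below is a placeholder; in practice prefer a managed identity or Key Vault over a raw key.

```python
from azure.mgmt.datafactory.models import (
    AzureBlobFSLinkedService,
    LinkedServiceResource,
)

# ADLS Gen2 linked service, authenticated here with the storage account
# key (placeholder value -- secure and rotate real keys appropriately).
ls = LinkedServiceResource(
    properties=AzureBlobFSLinkedService(
        url="https://divyabucket02.dfs.core.windows.net",
        account_key="<storage-account-key>",
    )
)
adf.linked_services.create_or_update(
    "divyalearn_az", "learnazure", "Ls_connect_adl", ls
)
```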

🛠️ Creating a Dataset in Data Factory

  1. Navigate to Author (pencil icon on the left panel).
  2. Click "Dataset" → + New Dataset.
  3. Set the properties:
    • Name: Ds_adharcardata.
    • Linked Service: Ls_connect_adl.
    • File Path: Browse to input/adharcard.csv.
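
Scripted, this is a delimited-text dataset pointing at the file through the linked service created above (again reusing the adf client from the earlier sketch):

```python
from azure.mgmt.datafactory.models import (
    AzureBlobFSLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
)

# CSV dataset: container "input", file "adharcard.csv", first row = header.
ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference",
            reference_name="Ls_connect_adl",
        ),
        location=AzureBlobFSLocation(
            file_system="input",
            file_name="adharcard.csv",
        ),
        first_row_as_header=True,
    )
)
adf.datasets.create_or_update(
    "divyalearn_az", "learnazure", "Ds_adharcardata", ds
)
```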



🎯 Creating a Data Flow to Process Data

  1. In Author, click "Data Flow" → "+ New Data Flow".
  2. Set properties:
    • Name: ds_processadhar.
    • Click Add Source on the design canvas.
  3. Configure Source Settings:
    • Output Stream Name: input_source1.
    • Source Type: Dataset.
    • Dataset: Ds_adharcardata.
    • ✅ Enable Data Flow Debug (spins up a debug session so you can preview data; it stays alive for one hour by default).
  4. Click the + next to the source and add a Filter transformation to keep only the rows you need:
    • Select the Filter on field.
    • Open the Expression Builder.
    • Add the condition: age > 35.
  5. Click the + again and add a Sink to store the processed data (a scripted version of this flow follows below).
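
Data flows are normally built visually, but the graph is stored as a short script, and the same flow can be deployed via the SDK. A rough sketch (reusing the adf client from earlier): the script below is an abridged form of the source → filter(age > 35) → sink graph, and Ds_outputdata is a hypothetical output dataset you would create the same way as the input one.

```python
from azure.mgmt.datafactory.models import (
    DataFlowResource,
    DataFlowSink,
    DataFlowSource,
    DatasetReference,
    MappingDataFlow,
    Transformation,
)

# Textual form of the visual graph: source -> filter -> sink.
script = (
    "source(allowSchemaDrift: true, validateSchema: false) ~> input_source1\n"
    "input_source1 filter(age > 35) ~> filter_over_35\n"
    "filter_over_35 sink(allowSchemaDrift: true) ~> output_sink"
)

flow = DataFlowResource(
    properties=MappingDataFlow(
        sources=[DataFlowSource(
            name="input_source1",
            dataset=DatasetReference(type="DatasetReference",
                                     reference_name="Ds_adharcardata"),
        )],
        sinks=[DataFlowSink(
            name="output_sink",
            dataset=DatasetReference(type="DatasetReference",
                                     reference_name="Ds_outputdata"),  # hypothetical
        )],
        transformations=[Transformation(name="filter_over_35")],
        script=script,
    )
)
adf.data_flows.create_or_update(
    "divyalearn_az", "learnazure", "ds_processadhar", flow
)
```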

📄 Conclusion

By following these steps, you have successfully:

  1. Created an Azure Resource Group.
  2. Set up an Azure Storage Account with containers.
  3. Created an Azure Data Factory and linked it to Data Lake Storage.
  4. Processed a dataset (adharcard.csv) using Data Flow with a filter (age > 35).
  5. Stored the processed data using Sink in Data Factory.

These fundamental steps help in managing and processing data efficiently using Azure cloud services. 💪🚀
