๐ŸŒGetting Started with Azure: Creating an Account, Subscription, Resource Group, and Storage Account, Link services

 

๐Ÿš€ Getting Started with Azure: Creating an Account, Subscription, Resource Group, and Storage Account

Azure is a powerful cloud platform that provides a range of services for managing, storing, and processing data. Before utilizing Azure services, you need to follow four essential steps:

  1. 🆕 Create an Azure Account
  2. 📝 Set Up a Subscription
  3. 📦 Create a Resource Group
  4. 🗂️ Use Resources (e.g., Storage Account)

🛠️ Step 1: Creating a Resource Group in Azure

A Resource Group in Azure is a logical container that holds related Azure resources, allowing for easy management, organization, and access control. Any resource in Azure must belong to a resource group.

📌 How to Create a Resource Group?

  • 🔍 Search for "Resource Group" in the Azure search bar.
  • 🏢 Click on "Create Resource Group" and provide a name.
  • ✅ Click "Create" to finalize the resource group.

Alternative Method: While creating a Storage Account, you can create a new Resource Group:

  • Click on "Create New" during storage account setup.
  • Enter a name, e.g., divyalearn_az.
  • Click "Create".
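The same resource group can be created from the command line. A minimal sketch with the Azure CLI, assuming `az` is installed and you are already logged in via `az login` (the region is an example):

```shell
# Sketch: create the resource group from this walkthrough with the Azure CLI.
# Assumes an active "az login" session; "eastus" is an example region.
az group create \
  --name divyalearn_az \
  --location eastus

# Confirm it was provisioned
az group show --name divyalearn_az --query properties.provisioningState
```

Everything created later in this post (the storage account and the data factory) will be attached to this group, so deleting the group tears down the whole exercise in one step.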



🛠️ Step 2: Creating a Storage Account in Azure

A Storage Account in Azure allows you to store data in various formats, such as Blob Storage, Table Storage, File Shares, and Queues.

📊 How to Create a Storage Account?

  1. ๐Ÿ” Search for "Storage Account" in the Azure search bar.
  2. ๐Ÿ‘‰ Click "Create Storage Account".
  3. ๐Ÿ“ฆ Attach the existing resource group (divyalearn_az).
  4. ๐Ÿ“ Fill in the Instance Details:
    • Storage Account Name: divyabucket02 (must be unique, lowercase, and can include numbers).
    • ๐ŸŒ Region: Choose any location, e.g., USA.
    • ๐ŸŽจ Primary Services:
      • Azure Blob Storage (for unstructured data backup like WhatsApp media, etc.)
      • Azure Delta Lake Storage (for big data and data science analysis, as it supports structured and semi-structured data.)
  5. ๐ŸŒ Performance Options:
    • Standard (default for general use)
    • Premium (recommended for production workloads)
  6. ๐Ÿ™️ Redundancy:
    • Locally Redundant Storage (LRS): Data is stored in a single location (low-cost, good for practice).
    • Geo-Redundant Storage (GRS): Data is stored across multiple locations (recommended for real-time scenarios, reduces data loss risk but is costly).
  7. ๐Ÿฐ Enable Hierarchical Namespace:
    • ✅ If enabled, it is considered Data Lake Storage (used in 80% of office use cases).
    • ❌ If disabled, it remains Blob Storage (suitable for general-purpose file storage).
  8. ✅ Click "Review + Create" and finalize the setup.
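The naming rule in step 4 is easy to trip over, so it is worth checking a candidate name before the portal rejects it. A quick local sketch (the `valid_storage_name` helper is hypothetical, written for illustration; it is not an Azure CLI command):

```shell
# Sketch: validate a candidate storage account name against Azure's rules
# (3-24 characters, lowercase letters and digits only).
# valid_storage_name is a hypothetical helper, not an Azure CLI command.
valid_storage_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

if valid_storage_name "divyabucket02"; then
  echo "divyabucket02: ok"
fi
if ! valid_storage_name "Divya_Bucket"; then
  echo "Divya_Bucket: rejected (uppercase and underscores not allowed)"
fi
```

Note that passing this check only proves the name is well-formed; global uniqueness is still verified by Azure when you click "Review + Create".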

📂 Working with Containers

Once the Storage Account is created:

  • Navigate to Storage Account → Data Storage → Containers → Click + Container.
  • Provide a container name, e.g., input or output (similar to folders in AWS S3 buckets).
  • ✅ Click "Create".
  • 💾 Upload data by dragging and dropping files into the container (e.g., 1000record.csv and asl.csv).
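The container and upload steps have CLI equivalents too. A sketch, assuming the file exists locally and your signed-in account has permission to write blobs (e.g., the Storage Blob Data Contributor role, which `--auth-mode login` relies on):

```shell
# Sketch: create the "input" container and upload a file with the Azure CLI.
# Assumes "az login" has been run and 1000record.csv exists locally.
az storage container create \
  --account-name divyabucket02 \
  --name input \
  --auth-mode login

az storage blob upload \
  --account-name divyabucket02 \
  --container-name input \
  --file 1000record.csv \
  --name 1000record.csv \
  --auth-mode login
```

Scripting uploads this way becomes useful once you move past drag-and-drop and need to load files repeatedly.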



🌌 Step 3: Processing Data Using Azure Data Factory

🛠️ Creating an Azure Data Factory

  1. ๐Ÿ” Search for "Data Factory" in Azure.
  2. ๐Ÿ‘‰ Click "Create Data Factory".
  3. ๐Ÿ’ Select the Free Tier.
  4. ๐Ÿ“ฆ Attach the resource group (divyalearn_az).
  5. ๐Ÿท️ Provide a name, e.g., learnazure (lowercase, no special characters or numbers).
  6. ✅ Click "Review + Create".
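Data Factory can also be provisioned from the CLI through the `datafactory` extension. A sketch, with the caveat that extension parameter names can vary between versions:

```shell
# Sketch: create the Data Factory instance with the Azure CLI.
# Requires the "datafactory" extension; the region is an example.
az extension add --name datafactory

az datafactory create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --location eastus
```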

👤 Linking Azure Data Lake Storage to Data Factory

  1. Navigate to Azure Data Factory → Launch Studio.
  2. In the Manage section (4th option on the left panel):
    • Click Linked Services → + New Linked Service.
    • Select Azure Data Lake Storage Gen2.
    • Provide a name, e.g., myfirstlinkservice (or Ls_connect_adl, following a company naming convention).
    • Choose the subscription (e.g., the Free Trial).
    • Select the Storage Account (divyabucket02).
    • ✅ Click "Create".
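Under the hood, the Studio wizard produces a linked-service JSON definition. A sketch of the equivalent definition and CLI call (`AzureBlobFS` is ADF's type name for ADLS Gen2; authentication settings are deliberately omitted here and would normally use a managed identity or account key):

```shell
# Sketch: define the ADLS Gen2 linked service as JSON and register it via
# the datafactory CLI extension. A real definition adds credentials
# (e.g., a managed identity); this minimal form is for illustration.
cat > ls_connect_adl.json <<'EOF'
{
  "type": "AzureBlobFS",
  "typeProperties": {
    "url": "https://divyabucket02.dfs.core.windows.net"
  }
}
EOF

az datafactory linked-service create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --linked-service-name Ls_connect_adl \
  --properties @ls_connect_adl.json
```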

🛠️ Creating a Dataset in Data Factory

  1. Navigate to Author (pencil icon on the left panel).
  2. Click "Dataset" → + New Dataset.
  3. Set the properties:
    • Name: Ds_adharcardata.
    • Linked Service: Ls_connect_adl.
    • File Path: Browse to input/adharcard.csv.
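As with the linked service, the dataset is stored as JSON behind the Studio UI. A sketch of what a delimited-text dataset pointing at input/adharcard.csv roughly looks like, registered via the `datafactory` CLI extension (field values reuse the names from this walkthrough):

```shell
# Sketch: the DelimitedText dataset behind Ds_adharcardata, reading
# input/adharcard.csv through the Ls_connect_adl linked service.
cat > ds_adharcardata.json <<'EOF'
{
  "linkedServiceName": {
    "referenceName": "Ls_connect_adl",
    "type": "LinkedServiceReference"
  },
  "type": "DelimitedText",
  "typeProperties": {
    "location": {
      "type": "AzureBlobFSLocation",
      "fileSystem": "input",
      "fileName": "adharcard.csv"
    },
    "columnDelimiter": ",",
    "firstRowAsHeader": true
  }
}
EOF

az datafactory dataset create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --dataset-name Ds_adharcardata \
  --properties @ds_adharcardata.json
```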



🎯 Creating a Data Flow to Process Data

  1. In Author, click "Data Flow" → "+ New Data Flow".
  2. Set properties:
    • Name: ds_processadhar.
    • Click Add Source → Arrow → Add Source.
  3. Configure Source Settings:
    • Output Stream Name: input_source1.
    • Source Type: Dataset.
    • Dataset: Ds_adharcardata.
    • ✅ Enable Dataflow Debug (spins up a debug cluster so you can preview data; the session stays alive for about an hour).
  4. Click + Add Filter Condition to process specific data:
    • Select Filter On.
    • Click Expression Builder.
    • Add condition: age > 35.
  5. Click + Add Sink to store processed data.
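To build intuition for what the filter transformation does, the same `age > 35` condition can be reproduced locally on a small CSV. A sketch with awk; the sample rows are invented for illustration:

```shell
# Sketch: replicate the data flow's "age > 35" filter locally with awk.
# The sample rows below are made up for illustration only.
cat > adharcard_sample.csv <<'EOF'
id,name,age
1,Asha,29
2,Ravi,41
3,Meena,36
4,Kiran,25
EOF

# Keep the header row, then every row whose age (3rd field) exceeds 35.
awk -F',' 'NR == 1 || $3 > 35' adharcard_sample.csv
# Prints:
# id,name,age
# 2,Ravi,41
# 3,Meena,36
```

The sink in step 5 then plays the role of the `>` redirect you would use here, writing the filtered rows back into a container such as output.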

📄 Conclusion

By following these steps, you have successfully:

  1. Created an Azure Resource Group.
  2. Set up an Azure Storage Account with containers.
  3. Created an Azure Data Factory and linked it to Data Lake Storage.
  4. Processed a dataset (adharcard.csv) using Data Flow with a filter (age > 35).
  5. Stored the processed data using Sink in Data Factory.

These fundamental steps help you manage and process data efficiently with Azure cloud services. 💪🚀
