Getting Started with Azure: Creating an Account, Subscription, Resource Group, Storage Account, and Linked Services
Azure is a powerful cloud platform that provides a range of services for managing, storing, and processing data. Before utilizing Azure services, you need to follow four essential steps:
- Create an Azure Account
- Set Up a Subscription
- Create a Resource Group
- Use Resources (e.g., a Storage Account)
Step 1: Creating a Resource Group in Azure
A Resource Group in Azure is a logical container that holds related Azure resources, allowing for easy management, organization, and access control. Any resource in Azure must belong to a resource group.
How to Create a Resource Group
- Search for "Resource Group" in the Azure search bar.
- Click "Create Resource Group" and provide a name.
- Click "Create" to finalize the resource group.
Alternative Method: While creating a Storage Account, you can create a new Resource Group:
- Click "Create New" during storage account setup.
- Enter a name, e.g., `divyalearn_az`.
- Click "Create".
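The same resource group can also be created from the command line. A minimal sketch, assuming the Azure CLI (`az`) is installed and you are signed in; the name and region simply mirror the examples above:

```shell
# Sign in to Azure (opens a browser prompt)
az login

# Create the resource group used throughout this guide
az group create \
  --name divyalearn_az \
  --location eastus
```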
Step 2: Creating a Storage Account in Azure
A Storage Account in Azure allows you to store data in various formats, such as Blob Storage, Table Storage, File Shares, and Queues.
How to Create a Storage Account
- Search for "Storage Account" in the Azure search bar.
- Click "Create Storage Account".
- Attach the existing resource group (`divyalearn_az`).
- Fill in the Instance Details:
  - Storage Account Name: `divyabucket02` (must be globally unique; lowercase letters and numbers only).
  - Region: Choose any location, e.g., East US.
  - Primary Services:
    - Azure Blob Storage (for unstructured data backup, such as WhatsApp media).
    - Azure Data Lake Storage (for big data and data science analysis; supports structured and semi-structured data).
- Performance Options:
  - Standard (default for general use).
  - Premium (recommended for production workloads).
- Redundancy:
  - Locally Redundant Storage (LRS): data is replicated within a single datacenter (low cost, good for practice).
  - Geo-Redundant Storage (GRS): data is also replicated to a secondary region (recommended for production; reduces data-loss risk but costs more).
- Enable Hierarchical Namespace:
  - If enabled, the account becomes Data Lake Storage Gen2 (the common choice for enterprise analytics workloads).
  - If disabled, it remains Blob Storage (suitable for general-purpose file storage).
- Click "Review + Create" and finalize the setup.
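The portal steps above map to a single CLI command. A sketch, assuming the resource group from Step 1 already exists:

```shell
# Create the storage account with hierarchical namespace enabled,
# which makes it Data Lake Storage Gen2 rather than plain Blob Storage.
# --sku Standard_LRS = locally redundant; use Standard_GRS for geo-redundancy.
az storage account create \
  --name divyabucket02 \
  --resource-group divyalearn_az \
  --location eastus \
  --kind StorageV2 \
  --sku Standard_LRS \
  --hns true
```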
Working with Containers
Once the Storage Account is created:
- Navigate to Storage Account → Data Storage → Containers → Click + Container.
- Provide a container name, e.g., `input` or `output` (similar to folders in AWS S3 buckets).
- Click "Create".
- Upload data by dragging and dropping files into the container (e.g., `1000record.csv` and `asl.csv`).
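Containers and uploads can likewise be scripted. A sketch using Azure AD login rather than account keys; the file path is the example CSV mentioned above:

```shell
# Create the "input" container in the storage account
az storage container create \
  --account-name divyabucket02 \
  --name input \
  --auth-mode login

# Upload a local CSV into the container
az storage blob upload \
  --account-name divyabucket02 \
  --container-name input \
  --name 1000record.csv \
  --file ./1000record.csv \
  --auth-mode login
```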
Step 3: Processing Data Using Azure Data Factory
Creating an Azure Data Factory
- Search for "Data Factory" in Azure.
- Click "Create Data Factory".
- Select your subscription (e.g., Free Trial).
- Attach the resource group (`divyalearn_az`).
- Provide a name, e.g., `learnazure` (must be globally unique; lowercase, no special characters).
- Click "Review + Create".
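For a scripted setup, Data Factory commands live in a CLI extension. A sketch reusing the names above:

```shell
# Data Factory commands require the "datafactory" CLI extension
az extension add --name datafactory

# Create the factory in the existing resource group
az datafactory create \
  --resource-group divyalearn_az \
  --name learnazure \
  --location eastus
```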
Linking Azure Data Lake Storage to Data Factory
- Navigate to Azure Data Factory → Launch Studio.
- In the Manage section (fourth option on the left panel):
  - Click Linked Services → + New Linked Service.
  - Select Azure Data Lake Storage Gen2.
  - Provide a name, e.g., `myfirstlinkservice` (a company-standard name might look like `Ls_connect_adl`).
  - Choose the subscription (e.g., Free Trial).
  - Select the Storage Account (`divyabucket02`).
  - Click "Create".
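The same linked service can be defined via the CLI with a JSON properties payload. A sketch; `AzureBlobFS` is the linked-service type for ADLS Gen2, and the account key is a placeholder (in practice, prefer a managed identity or a Key Vault reference):

```shell
# Link the ADLS Gen2 account to the factory
az datafactory linked-service create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --linked-service-name Ls_connect_adl \
  --properties '{
    "type": "AzureBlobFS",
    "typeProperties": {
      "url": "https://divyabucket02.dfs.core.windows.net",
      "accountKey": "<storage-account-key>"
    }
  }'
```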
Creating a Dataset in Data Factory
- Navigate to Author (pencil icon on the left panel).
- Click "Dataset" → + New Dataset.
- Set the properties:
  - Name: `Ds_adharcardata`.
  - Linked Service: `Ls_connect_adl`.
  - File Path: select `input/adharcard.csv`.
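The equivalent dataset definition via the CLI is a JSON payload as well. A sketch for a delimited-text (CSV) dataset; the delimiter and header options are assumptions about the file:

```shell
# Define a CSV dataset pointing at input/adharcard.csv
# through the linked service created earlier
az datafactory dataset create \
  --resource-group divyalearn_az \
  --factory-name learnazure \
  --dataset-name Ds_adharcardata \
  --properties '{
    "type": "DelimitedText",
    "linkedServiceName": {
      "referenceName": "Ls_connect_adl",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "input",
        "fileName": "adharcard.csv"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }'
```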
Creating a Data Flow to Process Data
- In Author, click "Data Flow" → + New Data Flow.
- Set properties:
  - Name: `ds_processadhar`.
- Click Add Source → Arrow → Add Source.
- Configure Source Settings:
  - Output Stream Name: `input_source1`.
  - Source Type: Dataset.
  - Dataset: `Ds_adharcardata`.
- Enable Dataflow Debug (to preview data; the debug session stays active for one hour).
- Click + Add Filter Condition to process specific data:
- Select Filter On.
- Click Expression Builder.
- Add the condition: `age > 35`.
- Click + Add Sink to store processed data.
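Under the hood, the visual data flow compiles to ADF's Data Flow Script. A minimal sketch of the source → filter → sink chain built above (the `FilterAdults` and `output_sink1` stream names are illustrative, not from the portal steps):

```
source(output(age as integer), allowSchemaDrift: true) ~> input_source1
input_source1 filter(age > 35) ~> FilterAdults
FilterAdults sink(allowSchemaDrift: true) ~> output_sink1
```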
Conclusion
By following these steps, you have successfully:
- Created an Azure Resource Group.
- Set up an Azure Storage Account with containers.
- Created an Azure Data Factory and linked it to Data Lake Storage.
- Processed a dataset (`adharcard.csv`) using a Data Flow with a filter (`age > 35`).
- Stored the processed data using a Sink in Data Factory.
These fundamental steps help you manage and process data efficiently using Azure cloud services.






