🚀 Getting Started with Azure: Creating an Account, Subscription, Resource Group, and Storage Account
Azure is a powerful cloud platform that provides a range of services for managing, storing, and processing data. Before utilizing Azure services, you need to follow four essential steps:
- 🆕 Create an Azure Account
- 📝 Set Up a Subscription
- 📦 Create a Resource Group
- 🗂️ Use Resources (e.g., Storage Account)
🛠️ Step 1: Creating a Resource Group in Azure
A Resource Group in Azure is a logical container that holds related Azure resources, allowing for easy management, organization, and access control. Any resource in Azure must belong to a resource group.
📌 How to Create a Resource Group?
- 🔍 Search for "Resource Group" in the Azure search bar.
- 🏢 Click on "Create Resource Group" and provide a name.
- ✅ Click "Create" to finalize the resource group.
Alternative Method: While creating a Storage Account, you can create a new Resource Group:
- Click on "Create New" during storage account setup.
- Enter a name, e.g., divyalearn_az.
- Click "Create".
🛠️ Step 2: Creating a Storage Account in Azure
A Storage Account in Azure allows you to store data in various formats, such as Blob Storage, Table Storage, File Shares, and Queues.
📊 How to Create a Storage Account?
- 🔍 Search for "Storage Account" in the Azure search bar.
- 👉 Click "Create Storage Account".
- 📦 Attach the existing resource group (divyalearn_az).
- 📝 Fill in the Instance Details:
- Storage Account Name: divyabucket02 (must be globally unique, using only lowercase letters and numbers).
- 🌍 Region: Choose any location, e.g., East US.
- 🎨 Primary Services:
- Azure Blob Storage (for unstructured data backups, such as WhatsApp media).
- Azure Data Lake Storage (for big data and data science workloads, as it supports structured and semi-structured data).
- 🌐 Performance Options:
- Standard (default for general use)
- Premium (for workloads that need consistently low latency)
- 🏙️ Redundancy:
- Locally Redundant Storage (LRS): Data is stored in a single location (low-cost, good for practice).
- Geo-Redundant Storage (GRS): Data is replicated to a secondary region (recommended for production scenarios; it reduces data-loss risk but costs more).
- 🏰 Enable Hierarchical Namespace:
- ✅ If enabled, the account becomes Data Lake Storage Gen2 (the common choice for enterprise data engineering workloads).
- ❌ If disabled, it remains Blob Storage (suitable for general-purpose file storage).
- ✅ Click "Review + Create" and finalize the setup.
📂 Working with Containers
Once the Storage Account is created:
- Navigate to Storage Account → Data Storage → Containers → Click + Container.
- Provide a container name, e.g., input or output (similar to folders within AWS S3 buckets).
- ✅ Click "Create".
- 💾 Upload data by dragging and dropping files into the container (e.g., 1000record.csv and asl.csv).
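Containers and uploads can also be handled in code with azure-storage-blob. A minimal sketch, assuming your signed-in identity has a data-plane role on the account (such as Storage Blob Data Contributor) and that 1000record.csv sits in the working directory:

```python
# Minimal sketch: create a container and upload a CSV using azure-storage-blob.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://divyabucket02.blob.core.windows.net",
    credential=DefaultAzureCredential(),  # requires a data-plane RBAC role
)

container = service.create_container("input")  # acts like a top-level folder

# Upload a local file; overwrite=True makes the script safely re-runnable.
with open("1000record.csv", "rb") as data:
    container.upload_blob(name="1000record.csv", data=data, overwrite=True)
```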
🌌 Step 3: Processing Data Using Azure Data Factory
🛠️ Creating an Azure Data Factory
- 🔍 Search for "Data Factory" in Azure.
- 👉 Click "Create Data Factory".
- 📝 Select your subscription (e.g., the free-tier subscription).
- 📦 Attach the resource group (divyalearn_az).
- 🏷️ Provide a name, e.g., learnazure (only letters, numbers, and hyphens are allowed, and the name must be globally unique).
- ✅ Click "Review + Create".
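For a scripted equivalent, azure-mgmt-datafactory can provision the factory. A sketch under the same assumptions (authenticated session, placeholder subscription ID, assumed East US region):

```python
# Minimal sketch: create a Data Factory with azure-mgmt-datafactory
# (`pip install azure-mgmt-datafactory`).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

subscription_id = "<your-subscription-id>"  # placeholder
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

factory = adf_client.factories.create_or_update(
    "divyalearn_az",            # resource group
    "learnazure",               # factory name (must be globally unique)
    Factory(location="eastus"),  # assumed region
)
print(f"Data factory provisioned: {factory.name}")
```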
👤 Linking Azure Data Lake Storage to Data Factory
- Navigate to Azure Data Factory → Launch Studio.
- In the Manage section (4th option on the left panel):
- Click Linked Services → + New Linked Service.
- Select Azure Data Lake Storage Gen2.
- Provide a name, e.g., myfirstlinkservice (or Ls_connect_adl, following a typical company naming standard).
- Choose the subscription (Free Tier).
- Select the Storage Account (divyabucket02).
- ✅ Click "Create".
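The same linked service can be defined through the SDK. In the sketch below, AzureBlobFSLinkedService is the SDK model behind the "Azure Data Lake Storage Gen2" option; authentication details (account key, managed identity, etc.) are omitted because they depend on your setup, and the subscription ID is again a placeholder.

```python
# Minimal sketch: create the ADLS Gen2 linked service in code.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobFSLinkedService,
    LinkedServiceResource,
)

subscription_id = "<your-subscription-id>"  # placeholder
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

linked_service = adf_client.linked_services.create_or_update(
    "divyalearn_az",   # resource group
    "learnazure",      # data factory
    "Ls_connect_adl",  # linked service name
    LinkedServiceResource(
        properties=AzureBlobFSLinkedService(
            # ADLS Gen2 uses the dfs endpoint; auth config omitted here.
            url="https://divyabucket02.dfs.core.windows.net",
        )
    ),
)
print(f"Linked service created: {linked_service.name}")
```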
🛠️ Creating a Dataset in Data Factory
- Navigate to Author (pencil icon on the left panel).
- Click "Dataset" → + New Dataset.
- Set the properties:
- Name: Ds_adharcardata.
- Linked Service: Ls_connect_adl.
- File Path: Browse to input/adharcard.csv.
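The dataset, too, can be declared in code. This sketch defines the same delimited-text dataset pointing at input/adharcard.csv through the linked service above; first_row_as_header is an assumption (most CSVs carry a header row).

```python
# Minimal sketch: define the CSV dataset over the ADLS Gen2 linked service.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobFSLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
)

subscription_id = "<your-subscription-id>"  # placeholder
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

dataset = adf_client.datasets.create_or_update(
    "divyalearn_az",
    "learnazure",
    "Ds_adharcardata",
    DatasetResource(
        properties=DelimitedTextDataset(
            linked_service_name=LinkedServiceReference(
                type="LinkedServiceReference",
                reference_name="Ls_connect_adl",
            ),
            location=AzureBlobFSLocation(
                file_system="input",       # container name
                file_name="adharcard.csv",
            ),
            first_row_as_header=True,      # assumes row 1 holds column names
        )
    ),
)
print(f"Dataset created: {dataset.name}")
```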
🎯 Creating a Data Flow to Process Data
- In Author, click "Data Flow" → "+ New Data Flow".
- Set the properties:
- Name: ds_processadhar.
- Click Add Source → the arrow → Add Source.
- Configure the Source Settings:
- Output Stream Name: input_source1.
- Source Type: Dataset.
- Dataset: Ds_adharcardata.
- ✅ Enable Data Flow Debug (spins up a cluster so you can preview data; the debug session stays alive for about an hour).
- Click + next to the source and add a Filter transformation to process specific data:
- Select Filter On.
- Click Expression Builder.
- Add the condition: age > 35.
- Click + and add a Sink to store the processed data (the sketch below shows the full source → filter → sink flow).
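For completeness, here is a rough sketch of the same source → filter → sink flow defined through the SDK. The script string is a simplified, hand-written version of the data flow script that ADF Studio generates behind the Expression Builder; the real generated script carries extra options, and the sink dataset name (Ds_output) is hypothetical, since this walkthrough does not create one.

```python
# Rough sketch: a mapping data flow that filters adharcard rows to age > 35.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    DataFlowResource,
    DataFlowSink,
    DataFlowSource,
    DatasetReference,
    MappingDataFlow,
)

subscription_id = "<your-subscription-id>"  # placeholder
adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Simplified data flow script: source -> filter(age > 35) -> sink.
script = (
    "source(allowSchemaDrift: true, validateSchema: false) ~> input_source1\n"
    "input_source1 filter(age > 35) ~> filter_over_35\n"
    "filter_over_35 sink(allowSchemaDrift: true, validateSchema: false) ~> output_sink1"
)

data_flow = adf_client.data_flows.create_or_update(
    "divyalearn_az",
    "learnazure",
    "ds_processadhar",
    DataFlowResource(
        properties=MappingDataFlow(
            sources=[DataFlowSource(
                name="input_source1",
                dataset=DatasetReference(type="DatasetReference",
                                         reference_name="Ds_adharcardata"),
            )],
            sinks=[DataFlowSink(
                name="output_sink1",
                dataset=DatasetReference(type="DatasetReference",
                                         reference_name="Ds_output"),  # hypothetical
            )],
            script=script,
        )
    ),
)
print(f"Data flow created: {data_flow.name}")
```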
📄 Conclusion
By following these steps, you have successfully:
- Created an Azure Resource Group.
- Set up an Azure Storage Account with containers.
- Created an Azure Data Factory and linked it to Data Lake Storage.
- Processed a dataset (adharcard.csv) using a Data Flow with a filter (age > 35).
- Stored the processed data using a Sink in Data Factory.
These fundamental steps help in managing and processing data efficiently using Azure cloud services. 💪🚀