Azure Data Factory Microsoft Fabric
August 2024
Overview
Key Features of Azure Data Factory
1. Pipelines
Logical grouping of activities that perform a unit of work.
Allows management of activities as a set.
Can operate sequentially or in parallel.
2. Activities
Represent processing steps in a pipeline (e.g., copying or transforming data).
Types include data movement, data transformation, and control activities.
3. Datasets
Represent data structures within data stores.
Used as inputs or outputs in activities.
4. Linked Services
Define the connection information to external resources (like connection strings).
Used for data stores and compute resources.
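How these four concepts reference one another can be sketched as JSON-shaped Python dictionaries. This is a minimal illustration rather than a deployable definition: the names BlobStorageLS, InputCsv, OutputCsv, CopyCsv, and CopyPipeline, the container and folder names, and the placeholder connection string are all hypothetical.

```python
import json

# Linked service: connection information for an external resource.
# The name and the placeholder connection string are hypothetical.
linked_service = {
    "name": "BlobStorageLS",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {"connectionString": "<connection-string>"},
    },
}


def make_dataset(name, container, folder_path):
    """Dataset: a named data structure in a store, reached via a linked service."""
    return {
        "name": name,
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": linked_service["name"],
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": container,
                    "folderPath": folder_path,
                }
            },
        },
    }


input_dataset = make_dataset("InputCsv", "landing", "incoming")
output_dataset = make_dataset("OutputCsv", "curated", "processed")

# Activity: one processing step; datasets are its inputs and outputs.
copy_activity = {
    "name": "CopyCsv",
    "type": "Copy",
    "inputs": [{"referenceName": input_dataset["name"], "type": "DatasetReference"}],
    "outputs": [{"referenceName": output_dataset["name"], "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},
        "sink": {"type": "DelimitedTextSink"},
    },
}

# Pipeline: a logical grouping of activities managed as a set.
pipeline = {"name": "CopyPipeline", "properties": {"activities": [copy_activity]}}

print(json.dumps(pipeline, indent=2))
```

The chain of references runs pipeline, activity, dataset, linked service, so connection details live in one place and are reused by every dataset that points at the same store.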
5. Data Flows
Manage graphs of data transformation logic.
Execute processes on a Spark cluster, automatically scaled as needed.
6. Integration Runtime
Provides the bridge between activities and linked services.
Executes activities in the region closest to the target data store or compute service, for performance and compliance.
7. Triggers
Determine when a pipeline execution is initiated.
Types include schedule, tumbling window, and event-based triggers.
8. Parameters: Read-only configurations defined in the pipeline, passed during execution.
9. Variables: Temporary storage within pipelines, used for value passing between activities.
10. Control Flow
Orchestration of activities, including sequencing, branching, and looping.
Allows complex workflows with custom state passing.
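Parameters, variables, and activity dependencies can be sketched together in one JSON-shaped pipeline definition. The pipeline and activity names below are hypothetical, and the execution_waves helper is only an illustrative model of sequential versus parallel ordering, not how the service schedules work internally.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: two copies that can run in parallel, then a join.
pipeline = {
    "name": "NightlyLoad",
    "properties": {
        # Parameters are read-only and supplied when a run starts.
        "parameters": {"sourceFolder": {"type": "String", "defaultValue": "incoming"}},
        # Variables hold temporary values that activities can set during a run.
        "variables": {"rowCount": {"type": "String"}},
        "activities": [
            {"name": "CopyOrders", "type": "Copy", "dependsOn": []},
            {"name": "CopyCustomers", "type": "Copy", "dependsOn": []},
            {
                "name": "JoinData",
                "type": "ExecuteDataFlow",
                "dependsOn": [
                    {"activity": "CopyOrders", "dependencyConditions": ["Succeeded"]},
                    {"activity": "CopyCustomers", "dependencyConditions": ["Succeeded"]},
                ],
            },
        ],
    },
}


def execution_waves(pipeline):
    """Group activities into waves; a wave only starts once the previous one is done."""
    graph = {
        activity["name"]: {dep["activity"] for dep in activity["dependsOn"]}
        for activity in pipeline["properties"]["activities"]
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # activities with all dependencies met
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(pipeline))
# [['CopyCustomers', 'CopyOrders'], ['JoinData']]
```

Activities with an empty dependsOn list land in the first wave and may run in parallel; JoinData waits until both copies have succeeded.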
Azure Blob Storage
Scalable object storage for any type of data (e.g., photos, videos, backups).
Ideal for storing and retrieving large amounts of unstructured data.
Automatically scales to handle growing data needs.
Access data from anywhere via the internet with secure access controls.
Pay only for what you use, with tiered pricing based on storage class and access frequency.
Store backups with high durability.
Store data for analysis with tools like Azure Synapse Analytics.
Store and distribute media files globally.
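Access over the internet works because every blob has an HTTPS endpoint derived from its storage account, container, and blob name. A minimal sketch of that addressing scheme follows; the account, container, and blob names are hypothetical, and a real request would still need to pass the account's access controls.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name):
    """Build the HTTPS endpoint of a blob from account, container, and blob name."""
    # quote() percent-encodes unsafe characters but keeps "/" so virtual folders survive.
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


# Hypothetical names, for illustration only.
print(blob_url("examplestore", "backups", "2024/08/db backup.bak"))
# https://examplestore.blob.core.windows.net/backups/2024/08/db%20backup.bak
```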
Azure Data Lake Storage Gen2
Storage optimized for big data analytics, built on Azure Blob Storage with a hierarchical namespace added.
Seamlessly integrates with Azure analytics services like Azure Databricks, Synapse Analytics, and HDInsight.
Supports directories and subdirectories for efficient data organization.
Designed for high-throughput and low-latency big data workloads.
Offers tiered storage options and optimized transactions for cost-effective big data processing.
Central repository for structured and unstructured data.
Store and process large datasets for insights.
Large-scale data storage for model training and analysis.
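The directory support can be illustrated by building the abfss:// URIs that analytics engines use to address files in the lake. This is a sketch under assumptions: the account, file system, and dataset names are hypothetical, and the year=/month=/day= layout is one common partitioning convention, not something the service requires.

```python
from datetime import date


def daily_partition(dataset, day):
    """Return the directory chain for one day of a dataset (assumed layout)."""
    return [dataset, f"year={day.year}", f"month={day.month:02d}", f"day={day.day:02d}"]


def adls_uri(account, filesystem, *directories, filename=""):
    """Build an abfss:// URI for a directory or file in a storage account."""
    path = "/".join([*directories, filename]).strip("/")
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path}"


# Hypothetical names, for illustration only.
parts = daily_partition("sales", date(2024, 8, 15))
print(adls_uri("examplelake", "raw", *parts, filename="part-0000.parquet"))
# abfss://raw@examplelake.dfs.core.windows.net/sales/year=2024/month=08/day=15/part-0000.parquet
```

Because directories are real objects in a hierarchical namespace, an engine can read or rename a whole day's directory without listing every file under a flat prefix.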
Data Factory
Introduction to Microsoft Fabric
Unified Data Platform: Combines data lakes and data warehouses for seamless collaboration.
Scalable Analytics: Empowers organizations to analyze data at scale.
Integrated AI: Accelerates insights and decision-making with built-in AI capabilities.
Versatile Solution: Supports various workloads, adaptable to any organization's needs.
Data-Driven Culture: Bridges the gap between data teams to foster data-driven decision-making.
Thank You