ADE - 7 - 30AM - Frame 4
[Diagram: ADF extracts data from on-prem DBs and file systems, transforms it, and loads it into the cloud (DBs, file systems).]
Azure Data Factory: A cloud-based data integration service that orchestrates and transforms the movement of data from data sources to a destination (cloud).
Integration Runtimes: An integration runtime is the compute infrastructure that Azure Data Factory uses to provide data integration
capabilities across different network environments.
Azure IR: Used to bring data from Azure stores (Blob Storage, ADLS Gen2, Azure SQL Database, Azure DWH) and to load data
into Azure stores. It is the default integration runtime in ADF and comes with ADF.
[Diagram: ADF uses the Azure IR on both sides, reading from DB, FS, and ADWH sources and loading into DB, FS, and ADWH targets.]
Self-Hosted IR: Used to bring data from on-prem systems and non-Azure systems, typically those inside a private network.
[Diagram: ADF reads from DB, DWH, and FS sources through the Self-Hosted IR and loads into DB, FS, and ADWH targets through the Azure IR.]
SSIS IR: Used to execute SSIS packages in Data Factory. We can lift and shift existing SSIS packages into Azure as they are, without
any changes, and then execute them with the help of the SSIS IR.
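The choice between the three runtimes can be sketched as a small rule of thumb. This is only an illustration of the notes above, not an ADF API: the `IntegrationRuntime` enum, the `pick_integration_runtime` helper, and the location labels ("azure", "on-prem", "non-azure") are all hypothetical names.

```python
from enum import Enum


class IntegrationRuntime(Enum):
    """The three IR types described in the notes."""
    AZURE = "Azure IR"
    SELF_HOSTED = "Self-Hosted IR"
    SSIS = "SSIS IR"


def pick_integration_runtime(source_location: str,
                             runs_ssis_package: bool = False) -> IntegrationRuntime:
    """Hypothetical helper: map a workload to an IR type (rule of thumb only)."""
    # Existing SSIS packages are executed by the SSIS IR.
    if runs_ssis_package:
        return IntegrationRuntime.SSIS
    # Azure stores are served by the default Azure IR.
    if source_location == "azure":
        return IntegrationRuntime.AZURE
    # On-prem and non-Azure systems go through a Self-Hosted IR in these notes.
    if source_location in ("on-prem", "non-azure"):
        return IntegrationRuntime.SELF_HOSTED
    raise ValueError(f"unknown source location: {source_location!r}")


print(pick_integration_runtime("on-prem").value)
```

In a real factory the decision also depends on network reachability, so treat this as a memory aid rather than a rule.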
Linked Services: A linked service holds the connection details of a data source or target (sink).
There are various connectors for reading and loading data.
Ex: Azure SQL Database
Azure Data Lake Storage Gen2
SQL Server Database
...
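A linked service is stored as a JSON document, and its rough shape can be sketched with Python dictionaries. This is a minimal sketch: the `make_linked_service` helper, the names `AzureSqlLs`, `OnPremSqlLs`, and `MySelfHostedIR`, and the placeholder connection strings are assumptions for illustration. The `connectVia` block is how a linked service points at a non-default integration runtime.

```python
import json


def make_linked_service(name, service_type, type_properties, integration_runtime=None):
    """Hypothetical helper: build a linked service definition as a dict."""
    properties = {"type": service_type, "typeProperties": type_properties}
    # Without connectVia, the default Azure IR is used.
    if integration_runtime is not None:
        properties["connectVia"] = {
            "referenceName": integration_runtime,
            "type": "IntegrationRuntimeReference",
        }
    return {"name": name, "properties": properties}


# Azure SQL Database, reached through the default Azure IR.
azure_sql = make_linked_service(
    "AzureSqlLs", "AzureSqlDatabase", {"connectionString": "<connection string>"}
)

# On-prem SQL Server, reached through a (hypothetical) Self-Hosted IR.
onprem_sql = make_linked_service(
    "OnPremSqlLs", "SqlServer", {"connectionString": "<connection string>"},
    integration_runtime="MySelfHostedIR",
)

print(json.dumps(onprem_sql, indent=2))
```

In practice, secrets would normally not be written inline like this; the placeholders only mark where the connection details live.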
Activity: Represents a single processing step in a pipeline. The output of one activity may be the input of another activity.
Ex: Getting the size of a file
Copying data from one place to another
Checking whether a folder exists
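The idea that one activity's output feeds the next can be sketched with a local analogy. The code below is plain Python on the local file system, not ADF: the function names (`check_folder_exists`, `get_file_size`, `copy_file`, `run_steps`) are hypothetical stand-ins for the three example activities above.

```python
import os
import shutil
import tempfile


def check_folder_exists(folder: str) -> bool:
    """Stand-in for a 'check folder existence' step."""
    return os.path.isdir(folder)


def get_file_size(path: str) -> int:
    """Stand-in for a 'get file size' step; returns the size in bytes."""
    return os.path.getsize(path)


def copy_file(source_file: str, target_folder: str) -> str:
    """Stand-in for a 'copy' step; returns the path of the new copy."""
    return shutil.copy(source_file, target_folder)


def run_steps(source_file: str, target_folder: str):
    """Chain the steps: each output decides whether the next step runs."""
    if not check_folder_exists(target_folder):
        return None
    if get_file_size(source_file) == 0:
        return None  # nothing worth copying
    return copy_file(source_file, target_folder)


with tempfile.TemporaryDirectory() as workdir:
    source_file = os.path.join(workdir, "input.csv")
    target_folder = os.path.join(workdir, "target")
    os.mkdir(target_folder)
    with open(source_file, "w") as handle:
        handle.write("id,name\n1,test\n")
    copied_path = run_steps(source_file, target_folder)
    copied_ok = copied_path is not None and os.path.exists(copied_path)
    # A missing target folder stops the chain before the copy.
    skipped = run_steps(source_file, os.path.join(workdir, "missing"))
```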
Dataflow Activity: Used to shape/transform data between the data source and target based on business requirements.
Pipeline: A logical group of activities that together perform one unit of work. A pipeline can hold a single activity or a collection of
activities. A data factory can contain multiple pipelines.
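A pipeline definition is, roughly, a named list of activities. The sketch below models one as Python dictionaries shaped like pipeline JSON, with a Get Metadata activity followed by a Copy activity. The pipeline, activity, and dataset names are hypothetical, and the source/sink types assume delimited text files.

```python
import json

# Step 1: read metadata (existence and size) of the source file.
get_file_info = {
    "name": "GetFileInfo",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {"referenceName": "SourceFileDataset", "type": "DatasetReference"},
        "fieldList": ["exists", "size"],
    },
}

# Step 2: copy the file, but only after step 1 has succeeded.
copy_file = {
    "name": "CopyFile",
    "type": "Copy",
    "dependsOn": [
        {"activity": "GetFileInfo", "dependencyConditions": ["Succeeded"]}
    ],
    "inputs": [{"referenceName": "SourceFileDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "TargetFileDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},
        "sink": {"type": "DelimitedTextSink"},
    },
}

# The pipeline groups both activities into one unit of work.
pipeline = {
    "name": "CopyFilePipeline",
    "properties": {"activities": [get_file_info, copy_file]},
}

print(json.dumps(pipeline, indent=2))
```

The `dependsOn` entry is what orders the two activities; without it they would have no declared dependency on each other.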
Control flow: Used to control the order in which the activities of a pipeline execute.
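As one example of control flow, an If Condition activity runs different activities depending on an expression. The sketch below models it as a Python dictionary shaped like activity JSON. It assumes an earlier Get Metadata activity named `GetFileInfo` exists in the same pipeline; the other names (`CheckFileNotEmpty`, `RunCopy`, `CopyFilePipeline`, `FailOnEmptyFile`) and the error code are hypothetical.

```python
import json

if_file_not_empty = {
    "name": "CheckFileNotEmpty",
    "type": "IfCondition",
    "dependsOn": [
        {"activity": "GetFileInfo", "dependencyConditions": ["Succeeded"]}
    ],
    "typeProperties": {
        # True when the size reported by the earlier activity is above zero.
        "expression": {
            "value": "@greater(activity('GetFileInfo').output.size, 0)",
            "type": "Expression",
        },
        # True branch: hand off to another (hypothetical) pipeline.
        "ifTrueActivities": [
            {
                "name": "RunCopy",
                "type": "ExecutePipeline",
                "typeProperties": {
                    "pipeline": {
                        "referenceName": "CopyFilePipeline",
                        "type": "PipelineReference",
                    }
                },
            }
        ],
        # False branch: fail the run with a message.
        "ifFalseActivities": [
            {
                "name": "FailOnEmptyFile",
                "type": "Fail",
                "typeProperties": {
                    "message": "Source file is empty",
                    "errorCode": "EMPTY_FILE",
                },
            }
        ],
    },
}

print(json.dumps(if_file_not_empty, indent=2))
```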