ADE - 7 - 30AM - Frame 4
Uploaded by Burugolla Ravi

[Diagram: Data Ingestion (ADF extracts data from on-prem and cloud data sources such as DBs and file systems and loads it into ADLS Gen2) and Data Transformation (ADF transforms data from ADLS Gen2 into ADLS Gen2)]

After creating a Data Factory, we can open it in two different ways:


1. By using the Azure Portal, we can launch ADF (www.portal.azure.com)
2. By using the ADF portal, we can launch ADF (www.adf.azure.com)

Azure Data Factory: It is a cloud-based data integration service that orchestrates and transforms the movement of data from sources to a destination (cloud)

Building Blocks / Components Of ADF


1. Integration Runtimes
2. Linked Services
3. Datasets
4. Activities
5. Dataflows
6. Pipeline

Integration Runtimes: An integration runtime is the compute infrastructure used by Azure Data Factory to provide data integration
capabilities across different network environments

There are three types of Integration Runtimes


1. Azure IR
2. Self-Hosted IR
3. SSIS - IR

Azure IR It is used to bring data from Azure Storages Blob Storage, ADLS Gen2, Azure SQL Server, Azure DWH and used to load data
into Azure Storages. It's default Integration Runtime in ADF. It comes with ADF

[Diagram: Azure IR moves data through ADF from a source (Azure storages: DB, FS, ADWH) to a target (Azure storages: DB, FS, ADWH)]

Self-Hosted IR It is used to bring data from Non-Azure systems and On-prem Systems

[Diagram: A Self-Hosted IR reads from on-prem storages (DB, DWH, FS) as the source, and the Azure IR loads the target Azure storages (DB, ADWH, FS) through ADF]

SSIS IR It is used to to execute SSIS packages in the Data Factory We can lift and shift SSIS packages as it is without doing any changes into
Azure and then we can execute them with the help of SSIS IR
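Integration runtimes are themselves authored as JSON in ADF. As an illustrative sketch (the name OnPremIR is hypothetical), a self-hosted IR definition looks like this:

```json
{
  "name": "OnPremIR",
  "properties": {
    "type": "SelfHosted",
    "description": "Runtime used to reach on-prem and non-Azure systems"
  }
}
```

After creating this IR in ADF, the self-hosted runtime software must still be installed on an on-prem machine and registered using the authentication key ADF generates.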

Linked Services: A linked service is used to hold the connection details of the data source and target (sink).
There are various connectors to read data and to load data.
Ex: Azure SQL Database
Azure Data Lake Storage Gen2
SQL Server Database
etc.
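As a sketch of how a linked service is authored as JSON (the name and the connection-string placeholders are hypothetical; real values come from your environment), an Azure SQL Database linked service holds the connection details:

```json
{
  "name": "AzureSqlDbLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": "Server=tcp:<server>.database.windows.net,1433;Database=<db>;User ID=<user>;Password=<password>"
    }
  }
}
```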

Datasets: It is used to represent the data that is required for a pipeline.


Whatever connector is used to create a linked service, the same connector can be used to create a dataset.

Ex: Azure SQL Database


Azure Data Lake Storage Gen2
SQL Server Database
etc.
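A dataset points at a linked service and names the data it represents. As a hedged sketch (the dataset, table, and linked-service names are hypothetical), an Azure SQL table dataset could look like this:

```json
{
  "name": "SalesTableDataset",
  "properties": {
    "type": "AzureSqlTable",
    "linkedServiceName": {
      "referenceName": "AzureSqlDbLinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "tableName": "dbo.Sales"
    }
  }
}
```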

Activity: It is used to represent a single processing step in the pipeline. The output of one activity may be the input of another activity.
Ex: Getting the size of a file
Copying data from one place to another place
Checking folder existence
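The first and third examples above map to ADF's Get Metadata activity. As an illustrative sketch (the activity and dataset names are hypothetical), an activity that fetches a file's size and checks its existence could be authored like this:

```json
{
  "name": "GetFileInfo",
  "type": "GetMetadata",
  "typeProperties": {
    "dataset": {
      "referenceName": "InputFileDataset",
      "type": "DatasetReference"
    },
    "fieldList": [ "size", "exists" ]
  }
}
```

The activity's output (here, size and exists) can then be consumed by a later activity in the pipeline.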

Dataflow Activity: It is used to shape/transform data between the data source and target based on business requirements.

Pipeline: It is used to represent a logical group of activities that performs one unit of work. A single activity or a collection of activities is known as a
pipeline. An Azure Data Factory can contain multiple pipelines

[Diagram: A pipeline is one unit of work composed of Activity-1, Activity-2, and Activity-3, each a single processing step]
Control flow: It’s used to control the execution flow of the pipeline activities.
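Putting these pieces together, a pipeline groups activities, and the dependsOn section is one form of control flow: the copy runs only after the metadata check succeeds. All names below are hypothetical; this is a sketch of the JSON shape, not a complete definition:

```json
{
  "name": "DailyLoadPipeline",
  "properties": {
    "activities": [
      {
        "name": "CheckFolderExists",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": { "referenceName": "LandingFolderDataset", "type": "DatasetReference" },
          "fieldList": [ "exists" ]
        }
      },
      {
        "name": "CopyToLake",
        "type": "Copy",
        "dependsOn": [
          { "activity": "CheckFolderExists", "dependencyConditions": [ "Succeeded" ] }
        ],
        "inputs":  [ { "referenceName": "SourceDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "SinkDataset", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "ParquetSink" }
        }
      }
    ]
  }
}
```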

Trigger: It specifies the time when the pipeline will be executed.
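Triggers are also authored as JSON. As a hedged sketch (the trigger and pipeline names, start time, and time zone are hypothetical), a schedule trigger that runs a pipeline once a day could look like this:

```json
{
  "name": "DailyTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2024-01-01T07:30:00Z",
        "timeZone": "UTC"
      }
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "DailyLoadPipeline",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```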
