
AZURE DATA FACTORY

1. What is Azure Data Factory used for? Azure Data Factory (ADF) is the data orchestration service provided by the Microsoft Azure cloud. ADF is mainly used for the
following use cases:
1. Data migration from one data source to another
2. On-premises to cloud data migration (data migration is the process of moving data from one location to another, one format to another,
or one application to another)
3. ETL (extract, transform, load)
4. Automating data flows
**3. What are the top-level concepts of Azure Data Factory? What components does Data Factory consist of? What kinds of activities from Data Factory
did you use in your project? What are the building blocks of ADF?
 Pipeline – A pipeline is a collection of data movement and transformation activities, grouped together to achieve a higher-level data
integration task.
 Activities – Activities represent the processing steps in a pipeline. A pipeline can have one or multiple activities; an activity can be any processing step, such as
querying a dataset or moving a dataset from one source to another.
 Dataset – A dataset connects to the data source via a linked service. It is created based on the type of data and the data source you want to connect to, and
represents the structure of the data held by that data source.
 Linked Service – A linked service in Azure Data Factory is basically the connection mechanism used to connect to an external source. It works as the
connection string and holds the authentication information (a JSON sketch of a linked service and dataset follows this list).
 Integration Runtime – The integration runtime is the compute infrastructure used by Azure Data Factory. It provides integration capabilities across various
network environments.
 Trigger – A trigger is a unit of processing that determines when a pipeline needs to be run. Triggers can be scheduled or set off (triggered) by a
different event.
 Control Flow – The control flow in a data factory is what orchestrates how the pipeline is going to be sequenced. This includes activities
you'll be performing within those pipelines, such as sequencing, branching and looping.
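To make these building blocks concrete, here is a rough, illustrative sketch of a linked service and a dataset in ADF's JSON representation (the names AzureBlobLinkedService and SalesCsvDataset, and the container and file names, are placeholders, not from any particular project):

    {
      "name": "AzureBlobLinkedService",
      "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
          "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
        }
      }
    }

    {
      "name": "SalesCsvDataset",
      "properties": {
        "type": "DelimitedText",
        "linkedServiceName": { "referenceName": "AzureBlobLinkedService", "type": "LinkedServiceReference" },
        "typeProperties": {
          "location": { "type": "AzureBlobStorageLocation", "container": "sales", "fileName": "sales.csv" },
          "columnDelimiter": ",",
          "firstRowAsHeader": true
        }
      }
    }

The linked service holds the connection details; the dataset points at the linked service and describes the location and shape of the data; pipeline activities then reference the dataset, and triggers decide when the pipeline runs.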
**5. What is the integration runtime and what are its types?
The integration runtime is the compute infrastructure used by Azure Data Factory. It provides integration capabilities across various network
environments.
A quick look at the types of integration runtimes:
1. Azure Integration Runtime – Can copy data between cloud data stores and dispatch activities to various compute services such as Azure SQL
Database, Azure HDInsight, etc.
2. Self-Hosted Integration Runtime – It is basically software with the same code as the Azure Integration Runtime, but it is installed on your
local system or on a virtual machine inside a virtual network.
3. Azure-SSIS Integration Runtime – The Azure-SSIS integration runtime is a fully managed cluster of virtual machines hosted in Azure and
dedicated to running SSIS packages in the data factory. We can easily scale up the SSIS nodes by configuring the node size, or scale out by
configuring the number of nodes in the virtual machine cluster.
6. What is required to execute an SSIS package in Data Factory? We need to create an Azure-SSIS integration runtime and an SSISDB catalog
hosted in Azure SQL Database or Azure SQL Managed Instance.

7b. What are the storage types in Azure?


 Azure Blob: A scalable object store for text and binary data.
 Azure Files: Managed file shares for cloud or on-premises deployments.
 Azure Queue: A messaging store for reliable messaging between application components.
 Azure Table: A NoSQL store for schemaless storage of structured data.

8. What is the use of the Lookup activity in Azure Data Factory?


The Lookup activity in an ADF pipeline is generally used for configuration lookups. It has a source dataset: the Lookup activity pulls data
from the source dataset and exposes it as the output of the activity. The output of the Lookup activity is typically used further along the pipeline to drive
decisions or configuration.
The Lookup activity can retrieve a dataset from any of the data sources supported by Data Factory and Synapse pipelines. It can
read data stored in a database or file system and pass it to subsequent copy or transformation activities.
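As a rough sketch (the activity, dataset, and table names here are assumptions for illustration), a Lookup activity that reads configuration rows might be defined like this:

    {
      "name": "LookupConfig",
      "type": "Lookup",
      "typeProperties": {
        "source": { "type": "AzureSqlSource", "sqlReaderQuery": "SELECT * FROM dbo.PipelineConfig" },
        "dataset": { "referenceName": "ConfigDataset", "type": "DatasetReference" },
        "firstRowOnly": false
      }
    }

Downstream activities can then reference the result with expressions such as @activity('LookupConfig').output.value (all rows) or, when firstRowOnly is true, @activity('LookupConfig').output.firstRow.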

9. What is the Copy activity in Azure Data Factory? In Azure Data Factory and Synapse pipelines, you can use the Copy activity to copy data among data
stores located on-premises and in the cloud. After you copy the data, you can use other activities to further transform and analyze it. To create
a Copy activity you need to have your source and destination ready; here the destination is called the sink. The Copy activity requires
a linked service and a dataset for both the source and the sink.
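A minimal sketch of a Copy activity, assuming a delimited-text source and an Azure SQL sink (the dataset names are illustrative):

    {
      "name": "CopyCsvToSql",
      "type": "Copy",
      "inputs":  [ { "referenceName": "SourceCsvDataset", "type": "DatasetReference" } ],
      "outputs": [ { "referenceName": "SinkSqlDataset", "type": "DatasetReference" } ],
      "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink":   { "type": "AzureSqlSink" }
      }
    }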

9a. What is the Get Metadata activity in Azure Data Factory?


The Get Metadata activity reads metadata information about its source, such as the name of a file or folder.
Folder-level field list: Child Items, Exists, Item Name, Item Type, Last Modified
File-level field list: Column Count, Content MD5 (MD5 of the file; applicable only to files), Exists, Item Name, Item Type, Last Modified,
Size, Structure
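A hedged sketch of a Get Metadata activity requesting folder-level fields (the dataset name is an assumption):

    {
      "name": "Get Metadata1",
      "type": "GetMetadata",
      "typeProperties": {
        "dataset": { "referenceName": "SourceFolderDataset", "type": "DatasetReference" },
        "fieldList": [ "childItems", "exists", "lastModified" ]
      }
    }

The output can then be consumed as @activity('Get Metadata1').output.childItems, for example by a Filter or ForEach activity as shown in the following questions.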

9b. What are Data Flows? A data flow is an activity in a pipeline used to perform transformations over data. To perform transformations, "Mapping
Data Flows" are used; these run under the Data Flow activity. Transformations include Source, Union, Join, Filter, Select, Derived Column, Exists, Sink,
etc.
9c. What is the Filter activity? The Filter activity is used to filter an input array based on a condition. It has two properties:
Items: @activity('Get Metadata1').output.childItems
Condition: @startswith(item().name, 'emp') or @not(startswith(item().name, 'emp'))
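Put together, a Filter activity using the expressions above could be sketched as follows (the Get Metadata activity name carries over from the earlier sketch and is illustrative):

    {
      "name": "FilterEmpFiles",
      "type": "Filter",
      "typeProperties": {
        "items":     { "value": "@activity('Get Metadata1').output.childItems", "type": "Expression" },
        "condition": { "value": "@startswith(item().name, 'emp')", "type": "Expression" }
      }
    }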

9d. What is the ForEach activity?


The ForEach activity defines a repeating control flow. It is used to iterate over a collection and execute the specified activities in a loop.
There are two options:
Sequential – whether the loop should be executed sequentially or in parallel.
Batch count – used to control the number of parallel executions (maximum 50).
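A rough ForEach sketch that iterates over the filtered file list in parallel (names are carried over from the earlier sketches; the inner Wait activity is only a placeholder for real per-file work, where @item() would reference the current element):

    {
      "name": "ForEachEmpFile",
      "type": "ForEach",
      "typeProperties": {
        "isSequential": false,
        "batchCount": 10,
        "items": { "value": "@activity('FilterEmpFiles').output.value", "type": "Expression" },
        "activities": [
          { "name": "ProcessOneFile", "type": "Wait", "typeProperties": { "waitTimeInSeconds": 1 } }
        ]
      }
    }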

9e. What is Conditional Split?

Conditional Split allows you to split the input stream into any number of output streams based on expression conditions. Rows not matching any
condition are routed to the default output.

*9g. In how many ways can a Data Factory pipeline be executed?

You can execute a pipeline either manually (on demand) or by using a trigger.
*9h. What is Key Vault?

Azure Key Vault is a cloud service for securely storing and accessing secrets. A secret is anything that you want to tightly control access to,
such as API keys, passwords, certificates, or cryptographic keys. The Key Vault service supports two types of containers: vaults and managed
hardware security module (HSM) pools.
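In ADF, a common pattern is to let a linked service pull its credentials from Key Vault rather than store them inline; a hedged sketch (the vault linked service and secret names are assumptions):

    {
      "name": "AzureSqlViaKeyVault",
      "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
          "connectionString": {
            "type": "AzureKeyVaultSecret",
            "store": { "referenceName": "AzureKeyVaultLinkedService", "type": "LinkedServiceReference" },
            "secretName": "SqlConnectionString"
          }
        }
      }
    }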
*9i. Why do you use Logic Apps?

Azure Logic Apps is a cloud-based platform for creating and running automated workflows that integrate your apps, data, services, and
systems.

10. What do you mean by variables in Azure Data Factory?

Variables are available inside the pipeline and are set inside the pipeline. Set Variable and Append Variable are the two activities used for
setting or manipulating variable values. There are two types of variables:
System variables: These are fixed variables provided by the pipeline itself, for example the pipeline name, pipeline ID, trigger name, etc.
You mostly need these to get system information that might be required in your use case.
User variables: A user variable is something you declare manually based on the logic of your pipeline.
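For illustration (the variable name runLabel is hypothetical and would need to be declared in the pipeline's variables section), a Set Variable activity can combine system variables into a user variable:

    {
      "name": "SetRunLabel",
      "type": "SetVariable",
      "typeProperties": {
        "variableName": "runLabel",
        "value": "@concat(pipeline().Pipeline, '_', pipeline().RunId)"
      }
    }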

11. What is a breakpoint in an ADF pipeline? The service allows you to debug a pipeline only up to a particular activity on the pipeline
canvas.

12. What is the difference between SSIS and Azure Data Factory?
 ADF is an Extract-Load (EL) tool; SSIS is an Extract-Transform-Load (ETL) tool.
 ADF is a cloud-based service (PaaS); SSIS is a desktop tool (developed with SSDT).
 ADF is pay-as-you-go under an Azure subscription; SSIS is a licensed tool included with SQL Server.
 ADF has limited built-in error handling; SSIS has rich error handling capabilities.
 ADF uses JSON definitions for its orchestration (coding); SSIS uses drag-and-drop design (no coding).

23. What is Azure Databricks?


Azure Databricks is an easy, fast, and collaborative Apache Spark-based analytics platform that is optimized for Azure. It was designed in
partnership with the founders of Apache Spark. Azure Databricks blends the best of Databricks and Azure to let customers accelerate
innovation through a quick setup. Smooth workflows and an interactive workspace facilitate collaboration between data engineers, data scientists,
and business analysts.

24. What is Azure SQL Data Warehouse?


It is a large store of data collected from a broad range of sources within a company and used to guide management decisions. Such
warehouses enable you to accumulate data from diverse databases existing as either remote or distributed systems.
An Azure SQL Data Warehouse can be created by integrating data from multiple sources and can be used for decision making,
analytical reporting, etc. In other words, it is a cloud-based enterprise application that uses massively parallel processing to rapidly
run complex queries over large data volumes. It also works as a solution for big data scenarios.

**26. What is a Slowly Changing Dimension?


A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is
considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records.
There are three main types of SCD:
SCD Type 1 – The new record replaces the original record.
SCD Type 2 – A new record is added to the existing dimension table, preserving history.
SCD Type 3 – The original record is modified to include the new data (for example, a column holds the previous value).

**28. What are the types of triggers? What is the difference between a schedule trigger and a tumbling window trigger? (A trigger JSON sketch follows this list.)
 The Schedule trigger executes the ADF pipeline on a wall-clock schedule.
 The Tumbling window trigger executes the ADF pipeline on a periodic interval and retains the pipeline state.
 The Event-based trigger responds to a blob-related event, such as adding or deleting a blob in an Azure storage account.
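A hedged sketch of a schedule trigger attached to a pipeline (the trigger name, pipeline name, and recurrence are placeholders); a tumbling window trigger looks similar but uses type TumblingWindowTrigger and exposes window start/end times to the pipeline:

    {
      "name": "DailyTrigger",
      "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
          "recurrence": { "frequency": "Day", "interval": 1, "startTime": "2024-01-01T00:00:00Z", "timeZone": "UTC" }
        },
        "pipelines": [
          { "pipelineReference": { "referenceName": "CopySalesPipeline", "type": "PipelineReference" } }
        ]
      }
    }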

**29. Any Data Factory pipeline can be executed using three methods. Mention these methods.
 Under Debug mode
 Manual execution using Trigger Now
 Using an attached schedule, tumbling window, or event trigger
30. What is fault tolerance in Azure? Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of
(or one or more faults within) some of its components.

**31. What is incremental load? Incremental loading is the activity of loading only new or updated records from a source into the destination.
Incremental loads are useful because they run efficiently compared to full loads, particularly for large data sets. In ADF this is
commonly implemented with a watermark column or change tracking, as sketched below.
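A common ADF implementation is watermark-based: a Lookup reads the last loaded watermark, a Copy activity selects only rows changed since then, and a final step updates the watermark. A hedged sketch of just the Copy activity (table, column, dataset, and activity names are assumptions):

    {
      "name": "CopyDeltaRows",
      "type": "Copy",
      "inputs":  [ { "referenceName": "SourceSqlDataset", "type": "DatasetReference" } ],
      "outputs": [ { "referenceName": "SinkDataset", "type": "DatasetReference" } ],
      "typeProperties": {
        "source": {
          "type": "AzureSqlSource",
          "sqlReaderQuery": "SELECT * FROM dbo.Orders WHERE LastModified > '@{activity('LookupOldWatermark').output.firstRow.WatermarkValue}'"
        },
        "sink": { "type": "AzureSqlSink" }
      }
    }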

33. What is PolyBase? What are the uses of PolyBase? PolyBase is a technology that accesses external data stored in Azure Blob Storage or Azure Data Lake
Store via the T-SQL language. It is used to query relational and non-relational data (NoSQL). You can use PolyBase to query tables and files in
Hadoop or in Azure Blob Storage. PolyBase acts as an intermediary for communication between Azure data storage and SQL Server.

38. How do I gracefully handle null values in an activity output?


You can use the @coalesce construct in expressions to handle null values gracefully.
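For example (activity, column, and variable names are illustrative), a Set Variable value can fall back to a default when a looked-up column is null:

    {
      "name": "SetRegion",
      "type": "SetVariable",
      "typeProperties": {
        "variableName": "region",
        "value": "@coalesce(activity('LookupConfig').output.firstRow.Region, 'unknown')"
      }
    }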

**39. Which Data Factory version do I use to create data flows? Use the Data Factory V2 version to create data flows.

40. What is the difference between Azure Data Lake and Azure Data Warehouse? What is the purpose of Azure Data Lake?
 A data lake is a capable way of storing data of any type, size, and shape; a data warehouse acts as a repository for already filtered data from a specific source.
 A data lake is mainly used by data scientists and developers to get insight from massive and complex data sets; a data warehouse is more frequently used by business professionals.
 A data lake is highly accessible with quicker updates; making changes in a data warehouse is a pretty rigid and costly task.
 A data lake defines the schema after the data has been stored; a data warehouse defines the schema before storing the data.
 A data lake uses the ELT (Extract, Load and Transform) process; a data warehouse uses the ETL (Extract, Transform and Load) process.
 A data lake is an ideal platform for doing in-depth analysis; a data warehouse is the best platform for operational users.

41. What is Blob Storage in Azure? It helps to store a large amount of unstructured data such as text, images, or binary data. It can be used to
expose data publicly to the world. Blob storage is most commonly used for streaming audio or video, storing data for backup and disaster
recovery, storing data for analysis, etc. You can also create data lakes on top of blob storage to perform analytics.

**42. Difference between Data Lake Storage and Blob Storage.

 Data Lake Storage is an optimized storage solution for big data analytics workloads; Blob Storage is general-purpose storage for a wide variety of scenarios (it can also do big data analytics).
 Data Lake Storage follows a hierarchical file system; Blob Storage follows an object store with a flat namespace.
 In Data Lake Storage, data is stored as files inside folders; in Blob Storage, you create a storage account, and the storage account has containers that store the data.
 Data Lake Storage can be used to store batch, interactive, stream analytics, and machine learning data; Blob Storage can be used to store text files, binary data, media for streaming, and general-purpose data.

44. Explain the two levels of security in ADLS Gen2.


 Role-Based Access Control – It includes built-in Azure roles such as Reader, Contributor, Owner, or custom roles. It is specified for two
reasons: the first is to control who can manage the service itself, and the second is to permit users to use the built-in data explorer tools.
 Access Control Lists – Azure Data Lake Storage specifies precisely which data objects users may read, write, or execute. The Data Lake Storage Gen2
hierarchical namespace accelerates big data analytics workloads and enables file-level access control lists (ACLs).

46. What is the difference between ADF v1 and v2?
 ADF v1 – a service designed for batch data processing of time series data.
 ADF v2 – a very general-purpose hybrid data integration service with very flexible execution patterns.

 47. What is the difference between the mapping data flow and wrangling data flow transformation?
 Mapping Data Flow: It is a visually designed data transformation activity that lets users design a graphical data transformation logic
without needing an expert developer.
 Wrangling Data Flow: This is a code-free data preparation activity that integrates with Power Query Online.

 48. Data Factory supports two types of compute environments to execute the transform activities. Mention them briefly.
 Let’s go through the types:
 On-demand compute environment – It is a fully managed environment offered by ADF. In this compute type, a cluster is created to
execute the transform activity and removed automatically when the activity is completed.
 Bring your own environment – In this environment, you yourself manage the compute environment with the help of ADF.

51. What is a storage key? Storage keys (access keys) are tokens used to authorize access to a storage account. You can manage an account's keys in the Azure portal.

52. What are the security connection types? RBAC, connection strings, access keys, and shared access signatures (SAS).
