Start To Finish With Azure Data Factory
Key Takeaway:
Azure Data Factory (ADF) can be used quickly and easily in real-world data pipeline scenarios
Cortana Analytics
A suite of products that allows you to predict outcomes, prescribe actions, and automate decisions
[Figure: Cortana Analytics on Microsoft Azure, Operationalized Analytic Solutions]
Data sources: Business apps, Custom apps, Sensors and devices
Information Management: Azure Data Factory, Azure Data Catalog, Azure Event Hub
Big Data Stores: Azure Storage, Azure Data Lake, Azure SQL Data Warehouse
Machine Learning and Analytics: Azure Machine Learning, Azure HDInsight (Hadoop), Azure Stream Analytics
Dashboards and Visualizations: Power BI
Personal Digital Assistant: Cortana
Perceptual Intelligence: Face, vision, Speech, text
Business Scenarios: Recommendations, customer churn, forecasting, etc.
Consumers: People, Automated Systems
Cortana Analytics Process: https://tinyurl.com/caproc
Process steps shown: Discover, Ingest, Catalog, Store, Orchestrate, Act
Azure Data Factory
Connection to Compute Resource Also
Data Options
Source | Sink
Blob | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, DocumentDB, OnPrem File System, Data Lake Store
Table | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, DocumentDB, Data Lake Store
SQL Database | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, DocumentDB, Data Lake Store
SQL Data Warehouse | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, DocumentDB, Data Lake Store
DocumentDB | Blob, Table, SQL Database, SQL Data Warehouse, Data Lake Store
Data Lake Store | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, DocumentDB, OnPrem File System, Data Lake Store
SQL Server on IaaS | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, Data Lake Store
OnPrem File System | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, OnPrem File System, Data Lake Store
OnPrem SQL Server | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, Data Lake Store
OnPrem Oracle Database | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, Data Lake Store
OnPrem MySQL Database | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, Data Lake Store
OnPrem DB2 Database | Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS, Data Lake Store
Blob, Table, SQL Database, SQL Data Warehouse, OnPrem SQL Server, SQL Server on IaaS,
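The Data Options slide lists, for each source, the sinks a copy can target. As a rough illustration, a few of those rows can be transcribed into a lookup so that a proposed source/sink pair can be checked before a pipeline is authored. The names `COPY_SINKS` and `can_copy` are hypothetical helpers written for this sketch; they are not part of any Azure SDK, and only five of the listed sources are transcribed.

```python
# Sinks that appear in every source row transcribed below.
COMMON_SINKS = {
    "Blob", "Table", "SQL Database", "SQL Data Warehouse", "Data Lake Store",
}
# Both SQL Server flavours appear together wherever they appear at all.
SQL_SERVER_SINKS = {"OnPrem SQL Server", "SQL Server on IaaS"}

# Hypothetical lookup: source name -> set of sink names, taken from the slide.
COPY_SINKS = {
    "Blob": COMMON_SINKS | SQL_SERVER_SINKS | {"DocumentDB", "OnPrem File System"},
    "Table": COMMON_SINKS | SQL_SERVER_SINKS | {"DocumentDB"},
    "DocumentDB": set(COMMON_SINKS),
    "OnPrem File System": COMMON_SINKS | SQL_SERVER_SINKS | {"OnPrem File System"},
    "OnPrem Oracle Database": COMMON_SINKS | SQL_SERVER_SINKS,
}


def can_copy(source: str, sink: str) -> bool:
    """Return True if the transcribed table lists `sink` for `source`."""
    if source not in COPY_SINKS:
        raise KeyError(f"source not transcribed in this sketch: {source}")
    return sink in COPY_SINKS[source]


print(can_copy("Blob", "OnPrem File System"))
print(can_copy("DocumentDB", "OnPrem SQL Server"))
```

Storing sets rather than lists keeps the membership check cheap and makes the shared sinks explicit, which mirrors how repetitive the slide's rows are.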
Activity Options
Transformation activity | Compute environment
Hive | HDInsight [Hadoop]
Pig | HDInsight [Hadoop]
MapReduce | HDInsight [Hadoop]
Hadoop Streaming | HDInsight [Hadoop]
Machine Learning activities: Batch Execution and Update Resource | Azure VM
Stored Procedure | Azure SQL
Data Lake Analytics U-SQL | Azure Data Lake Analytics
DotNet | HDInsight [Hadoop] or Azure Batch
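When planning which compute resources to provision, it can help to read the Activity Options table in the other direction: given a compute environment, which activities can run on it? The sketch below transcribes the slide into a dictionary and inverts it. `ACTIVITY_COMPUTE` and `activities_by_compute` are hypothetical names chosen for this illustration, and the two Machine Learning activities are listed as separate entries here.

```python
from collections import defaultdict

# Transformation activity -> compute environments, as listed on the slide.
ACTIVITY_COMPUTE = {
    "Hive": ["HDInsight [Hadoop]"],
    "Pig": ["HDInsight [Hadoop]"],
    "MapReduce": ["HDInsight [Hadoop]"],
    "Hadoop Streaming": ["HDInsight [Hadoop]"],
    "Machine Learning Batch Execution": ["Azure VM"],
    "Machine Learning Update Resource": ["Azure VM"],
    "Stored Procedure": ["Azure SQL"],
    "Data Lake Analytics U-SQL": ["Azure Data Lake Analytics"],
    "DotNet": ["HDInsight [Hadoop]", "Azure Batch"],
}


def activities_by_compute(mapping):
    """Invert activity -> computes into compute -> activities."""
    grouped = defaultdict(list)
    for activity, computes in mapping.items():
        for compute in computes:
            grouped[compute].append(activity)
    return dict(grouped)


for compute, activities in activities_by_compute(ACTIVITY_COMPUTE).items():
    print(f"{compute}: {', '.join(activities)}")
```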
4. Create Datasets
Named reference or pointer to data
Dataset Concepts
{
    "name": "<name of dataset>",
    "properties":
    {
        "structure": [ ],
        "type": "<type of dataset>",
        "external": <boolean flag to indicate external data>,
        "typeProperties":
        {
        },
        "availability":
        {
        },
        "policy":
        {
        }
    }
}
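To make the dataset template more concrete, here is a minimal Python sketch that fills in the placeholders and serializes the result to JSON. Everything beyond the template's own keys is an assumption for illustration: the `AzureBlob` type value, the `linkedServiceName` property, the `folderPath` type property, the hourly `availability` values, and the names `InputBlobDataset` and `StorageLinkedService` do not come from the slide and should be checked against the Azure Data Factory documentation.

```python
import json


def make_blob_dataset(name, linked_service, folder_path):
    """Build a dataset definition that follows the template's key layout."""
    return {
        "name": name,
        "properties": {
            "structure": [],
            "type": "AzureBlob",                  # assumed dataset type
            "linkedServiceName": linked_service,  # assumed property, not in the template
            "external": True,                     # data is produced outside the factory
            "typeProperties": {"folderPath": folder_path},         # assumed
            "availability": {"frequency": "Hour", "interval": 1},  # assumed
            "policy": {},
        },
    }


# Hypothetical names used only for this example.
dataset = make_blob_dataset("InputBlobDataset", "StorageLinkedService", "adfdemo/input/")
dataset_json = json.dumps(dataset, indent=2)
print(dataset_json)
```

Building the definition as a dictionary and serializing it with `json.dumps` avoids the unbalanced braces and stray punctuation that are easy to introduce when the JSON is edited by hand.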
5. Create Pipelines
Logical Grouping of Activities
Pipeline Concepts
{
    "name": "PipelineName",
    "properties":
    {
        "description" : "pipeline description",
        "activities":
        [
        ],
        "start": "<start date-time>",
        "end": "<end date-time>"
    }
}
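The pipeline template can be filled in the same way. The sketch below builds a pipeline with a single activity and checks that the active period is valid before serializing. The activity's shape (`type`, `inputs`, `outputs`), the `Copy` type value, the timestamp format, and all names such as `DemoPipeline`, `CopyBlobToSql`, `InputBlobDataset`, and `OutputSqlDataset` are assumptions made for illustration and are not taken from the slide.

```python
import json
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed ISO 8601 UTC format


def make_pipeline(name, description, activities, start, end):
    """Build a pipeline definition that follows the template's key layout."""
    if end <= start:
        raise ValueError("pipeline end must be later than start")
    return {
        "name": name,
        "properties": {
            "description": description,
            "activities": list(activities),
            "start": start.strftime(TIMESTAMP_FORMAT),
            "end": end.strftime(TIMESTAMP_FORMAT),
        },
    }


# Hypothetical activity: the field names here are illustrative assumptions.
copy_activity = {
    "name": "CopyBlobToSql",
    "type": "Copy",
    "inputs": [{"name": "InputBlobDataset"}],
    "outputs": [{"name": "OutputSqlDataset"}],
}

pipeline = make_pipeline(
    "DemoPipeline",
    "Copies blob data into a SQL table",
    [copy_activity],
    datetime(2015, 11, 1),
    datetime(2015, 11, 2),
)
print(json.dumps(pipeline, indent=2))
```

Validating `start` and `end` up front is a design choice for this sketch: a pipeline whose end precedes its start has no active period, so failing early is friendlier than discovering the problem after deployment.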
6. Manage and Monitor
Scheduling, Monitoring, Disposition
Locating Failures within a Pipeline
2015 Microsoft Corporation. All rights reserved.