DP-203 Notes

Azure Data Lake Storage account & Azure Synapse

=============================
1. To access external blob data in Azure Synapse using the OPENROWSET function, the user
needs to have the Storage Blob Data Reader role assigned under Access Control (IAM). By
default it is not assigned, and an authorization error appears when trying to access the
data.
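
A minimal sketch of such a query in the serverless SQL pool, reusing the example file URL
from item 2 below (the file's column schema is not given in these notes):

-- Read the CSV directly from blob storage; requires the Storage Blob Data Reader
-- role (or anonymous blob access) on the underlying storage account.
SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'https://datalakegen2external.blob.core.windows.net/data/ActivityLog-01.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS activity_log;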

2. To access blob data externally via a URL, for example
https://datalakegen2external.blob.core.windows.net/data/ActivityLog-01.csv, it
will not be accessible by default due to security/authentication. The workaround is to
go to Storage Account -> Settings -> Configuration -> set "Allow blob anonymous access"
to Enabled, then select the specific container -> Change access level -> Blob
(anonymous read access for blobs only).
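
The same change can also be scripted; a sketch with the Azure CLI is below. The account
name and container are taken from the URL above, and the resource group dp-203-rg is an
assumption borrowed from the SHIR notes further down:

# Allow anonymous blob access at the storage-account level.
az storage account update \
  --name datalakegen2external \
  --resource-group dp-203-rg \
  --allow-blob-public-access true

# Set the container's public access level to "blob" (anonymous read for blobs only).
az storage container set-permission \
  --name data \
  --account-name datalakegen2external \
  --public-access blob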

3. Reading hidden files: refer to lesson 58 at 5:00.

4. For the dedicated SQL pool, refer to lessons 54 to 63 again.

5. To share the self-hosted integration runtime (SHIR) with other services, below are the
steps (some challenges are present); here it is done from a Synapse SHIR to ADF.

- I have already installed a SHIR in my Azure Synapse workspace and want to
share it with Azure Data Factory.
- Go to the Azure Synapse workspace portal and select Integration runtimes -> SHIR ->
and click on the Share option. If the Share option is not present, you can run the
command below in the Azure CLI or PowerShell.
- Log in to Azure PowerShell (az login).
- It will ask you to select a subscription and tenant; if you do not want to make any
changes, simply press Enter (the default tenant will be selected).
- Run the following command to get the resource ID of the SHIR if running in
PowerShell:
az synapse integration-runtime show `
  --workspace-name azsynapsewrkspace3376 `
  --name SHIR `
  --resource-group dp-203-rg

- Run the following command to get the resource ID of the SHIR if running in the CLI:
az synapse integration-runtime show \
  --workspace-name azsynapsewrkspace3376 \
  --name SHIR \
  --resource-group dp-203-rg
Output:
{
  "etag": "35003144-0000-2200-0000-675427970000",
  "id": "/subscriptions/9fd46475-34f3-45af-a671-ab11cb4248a5/resourceGroups/dp-203-rg/providers/Microsoft.Synapse/workspaces/azsynapsewrkspace3376/integrationruntimes/SHIR",
  "name": "SHIR",
  "properties": {
    "additionalProperties": null,
    "description": "Self Host Integration Runtime for On Prem SQL Server",
    "linkedInfo": null,
    "type": "SelfHosted"
  },
  "resourceGroup": "dp-203-rg",
  "type": "Microsoft.Synapse/workspaces/integrationruntimes"
}
- The resource ID will be the "id" value from the output:
"id": "/subscriptions/9fd46475-34f3-45af-a671-ab11cb4248a5/resourceGroups/dp-203-rg/providers/Microsoft.Synapse/workspaces/azsynapsewrkspace3376/integrationruntimes/SHIR"
- When sharing the SHIR from Azure Synapse to Azure Data Factory, a slight change needs
to be made to the id value:
replace Microsoft.Synapse/workspaces with Microsoft.DataFactory/factories.

Azure Synapse managed identity: f0d804e0-7f0d-4b0b-88e9-4073bc1b7f63

ADF object ID: 79718567-8341-494f-99a0-35e8021c40a4

Getting an error when trying to share the SHIR of Azure Synapse with ADF:
Failed to save integrationRuntime1.
Error: Failed to save integration runtime. Access denied. Unable to access shared
integration runtime 'SHIR'. Please check whether this resource has been granted
permission by the shared integration runtime.
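
A possible workaround (an assumption, not confirmed in these notes) is to grant the ADF
managed identity, whose object ID is noted above, a role such as Contributor on the
Synapse SHIR resource, then retry the share:

# Sketch only: the role and scope below are assumptions based on the error message,
# using the ADF object ID and the SHIR resource ID recorded above.
az role assignment create \
  --assignee 79718567-8341-494f-99a0-35e8021c40a4 \
  --role "Contributor" \
  --scope "/subscriptions/9fd46475-34f3-45af-a671-ab11cb4248a5/resourceGroups/dp-203-rg/providers/Microsoft.Synapse/workspaces/azsynapsewrkspace3376/integrationruntimes/SHIR"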

6. Steps on how to use a Cache Sink

When loading a dimension table in two phases, use a query like the one below as an example
to fetch the first set of records, customers whose ID is less than 20,000:
Select * from DimCustomer where CustomerId < 20000

Create another source in the data flow to cache the maximum surrogate key value from the
first phase of data entered, and use that value to continue the key sequence for the next
set of records. For example, create a Derived Column with the expression below before the
sink:
CustomerSK + MaxCustomerSKCacheSink#outputs()[1].MaxCustomerSk

Load the second set of data for the remaining records, with IDs of 20,000 and above:
Select * from DimCustomer where CustomerId >= 20000
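
The notes do not show the cache source's own query; a minimal sketch, assuming the cache
sink is named MaxCustomerSKCacheSink and exposes a single MaxCustomerSk column as
referenced in the expression above:

-- Hypothetical query for the cache source: fetch the highest surrogate key
-- already loaded in phase 1 so phase 2 can continue the sequence.
SELECT MAX(CustomerSK) AS MaxCustomerSk FROM DimCustomer;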

7. Need to practice labs 114 & 115, 122, 130, and 131 to 133.

Azure Event Hub & Azure Stream Analytics Job

=============================

1. Create an Azure Event Hub namespace, create an Event Hub with any existing model
(example: VehicleToll), and send events; the dashboard will capture the events.
2. Create a Stream Analytics job and provide the created Event Hub as an input; it will
pop up a query window where you can create and save a query accordingly (see the sketch
after this list).
3. Provide an output for the Stream Analytics job with the provided query, start the
job, and after some time you can see the Stream Analytics job start capturing the data.
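
A minimal sketch of such a query, assuming hypothetical input and output aliases
VehicleTollInput and TollOutput (the notes do not name them):

-- Pass every event from the Event Hub input straight to the configured output.
SELECT *
INTO [TollOutput]
FROM [VehicleTollInput]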
