
Running Azure Databricks notebook from Azure Synapse Analytics


In this article we are going to see how we can use the Databricks component in the Azure Synapse Analytics integration tool (Azure Data Factory).

Requirements:

1- You need an Azure Databricks workspace.
2- A Spark cluster in Azure Databricks.
3- Create a service principal.
4- Configure the Azure integration runtime and your virtual network.
5- Create an Azure Databricks linked service.

So let’s see how it works.

First, I’m going to create a notebook in my Azure Databricks workspace, DatabricksWS.


I took some code from this tutorial:

https://docs.microsoft.com/en-us/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse

Now we have a notebook set up in the Databricks workspace named DatabricksWS.
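Below is a minimal Python sketch of the kind of notebook the linked tutorial builds: read some sample data and load it into a dedicated SQL pool through the Databricks Synapse connector. Every name in it (storage account, container, secret scope, JDBC URL, table) is a placeholder for illustration, not a value from the original walkthrough.

```python
# Minimal sketch, loosely adapted from the Microsoft tutorial linked above.
# `spark` and `dbutils` are predefined in a Databricks notebook.
# All account, secret, server, and table names below are placeholders.

blob_storage = "mystorageaccount.blob.core.windows.net"   # placeholder staging account
blob_container = "temp"                                   # placeholder container

# The connector stages data in Blob storage, so give Spark the account key
spark.conf.set(
    f"fs.azure.account.key.{blob_storage}",
    dbutils.secrets.get(scope="my-scope", key="storage-key"),  # placeholder secret
)

# Read a sample dataset that ships with Databricks workspaces
df = spark.read.json("/databricks-datasets/structured-streaming/events/")

# Load it into a dedicated SQL pool (formerly SQL Data Warehouse)
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://myserver.database.windows.net:1433;"
                  "database=mydwh;user=myuser;password=<password>")  # placeholder
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.events")
   .option("tempDir", f"wasbs://{blob_container}@{blob_storage}/tempdir")
   .save())
```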

The next step is to set up the Azure Databricks component from the Azure Synapse Analytics integration layer, but before that we need a linked service for Azure Databricks.

To do that, launch Azure Synapse Analytics Studio from your Synapse workspace.


As you can see below, Azure Synapse Analytics has three layers: Ingest, for moving data from point A to point B; Explore and Analyse, for building analytical queries; and Visualize, with Power BI.

On your left, in the Manage layer, click Linked services and then New.
1- Give a name to your linked service
2- Enable AutoResolveIntegrationRuntime
3- Your Azure subscription
4- Databricks workspace URL (you can find it on the Databricks cluster's Advanced Options tab)
5- Authentication type (you can use an access token from your cluster, or one stored in Azure Key Vault)
6- Select New job cluster (if you have an instance pool you can use it, or an existing interactive cluster)
7- Cluster version (the cluster you created from your Azure Databricks workspace)
8- Cluster node type
9- Python version
10- Select a fixed number of workers or the autoscaling option
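For reference, the choices in this list end up in the linked service's JSON definition. Here is a rough sketch of that definition, written as an annotated Python dict; every value is a placeholder, and the exact property set may vary with the service version.

```python
# Sketch of an AzureDatabricks linked service definition; all values are
# placeholders. The numbered comments map to the dialog fields listed above.
linked_service = {
    "name": "AzureDatabricksLS",                                # 1- name (placeholder)
    "properties": {
        "type": "AzureDatabricks",
        "connectVia": {                                         # 2- integration runtime
            "referenceName": "AutoResolveIntegrationRuntime",
            "type": "IntegrationRuntimeReference",
        },
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",  # 4- workspace URL
            "accessToken": {"type": "SecureString", "value": "<token>"},     # 5- authentication
            "newClusterVersion": "7.3.x-scala2.12",             # 7- cluster version (placeholder)
            "newClusterNodeType": "Standard_DS3_v2",            # 8- node type (placeholder)
            "newClusterSparkEnvVars": {                         # 9- Python version
                "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
            },
            "newClusterNumOfWorker": "1:4",                     # 10- autoscaling range ("2" for fixed)
        },
    },
}
```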

If you have additional information, such as parameters to pass to your notebook, you can set them up with the parameters option.
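Inside the notebook itself, a base parameter passed from the pipeline shows up as a widget. A tiny sketch, with a hypothetical parameter named table_name:

```python
# Read a base parameter passed in by the Synapse pipeline.
# "table_name" is a hypothetical parameter used for illustration.
table_name = dbutils.widgets.get("table_name")
print(f"Target table: {table_name}")
```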

Test connection and click Save.

Finally, in the Linked services list, we can find the new linked service.

We can open it and configure roles such as the service principal account.
Now we can go ahead and create a new pipeline from the integration layer.
On your left, in the Integrate layer, you can access the integration components.

As you can see, I already have two pipelines (similar to Microsoft SSIS packages).

From the Activities tab, under the Databricks section, select the Notebook component and drag and drop it onto the canvas.

I renamed it Databricks.

The next step, after renaming your component, is to go to the Azure Databricks tab. From there you need to select your linked service, previously set up in the Azure Synapse Manage layer.
Enable interactive authoring.

Next, on the Settings tab, click the Browse button to find and select your notebook, then click OK.
At this point you are good to go.
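To tie the two tabs together, here is a rough sketch of what the configured activity looks like in the pipeline's JSON, again written as an annotated Python dict with placeholder names:

```python
# Sketch of the Notebook activity inside the pipeline definition;
# the linked service name, notebook path, and parameters are placeholders.
notebook_activity = {
    "name": "Databricks",                              # the name we gave the component
    "type": "DatabricksNotebook",
    "linkedServiceName": {
        "referenceName": "AzureDatabricksLS",          # linked service from the Manage layer
        "type": "LinkedServiceReference",
    },
    "typeProperties": {
        "notebookPath": "/Users/me@example.com/my-notebook",   # selected via Browse
        "baseParameters": {"table_name": "dbo.events"},        # optional, hypothetical
    },
}
```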

- Linked service is set up and running


- Azure Databricks component is also OK

Let’s validate our pipeline before launching the process.


Then click Debug to run the pipeline.

You can monitor the execution details from your Databricks workspace by using the link shown in the activity output.
The Databricks task finished successfully.
