
Guide to Setting Up and Executing an Ingestion Task from Azure SQL to Snowflake
Table of Contents
1. Prepare the environment
2. Prepare the data source
2.1. Source preparation (Azure SQL Database)
2.2. Target preparation (Snowflake Database)
2.2.1. Create user
2.2.2. Configure Keypair
3. Create Connection on IDMC cloud
3.1. Azure SQL
3.2. Snowflake
4. Create Database Ingestion Task
1. Prepare the environment
1.1. Network Requirements
● Allow the following domains through your organization's firewall:
○ *.informatica.com
○ *.informaticacloud.com
● Allow the IP addresses and other domains that match the POD of your organization in
Informatica. You can look them up here: https://docs.informatica.com/cloud-common-services/pod-availability-and-networking/current-version.html
● Ensure network connectivity to the databases you intend to use

1.2. Enable Data Ingestion


In the Admin console, go to Runtime Environments → select the Secure Agent that you intend
to use for running database ingestion → Enable or Disable Services, Connectors.
● In the Services tab, check the Mass Ingestion - Databases option
● In the Connectors tab, select Snowflake Data Cloud

2. Prepare the data source


2.1. Source preparation (Azure SQL Database)
Change data capture is supported with CDC tables only. The database user must have at least
SELECT permission on the source tables and the CDC tables, and SQL Server CDC must be
enabled on the source tables. For incremental load jobs that use log-based CDC with transaction
logs, ensure that the database user you specify in the SQL Server source connection has the
db_owner role and the VIEW ANY DEFINITION privilege. To grant these privileges, use SQL
statements such as the following, adjusted for your SQL Server source type:
USE master;

CREATE DATABASE <database>;

CREATE LOGIN <login_name> WITH PASSWORD = '<password>';

CREATE USER <user> FOR LOGIN <login_name>;

-- Privileges required to read the transaction log and server metadata
GRANT SELECT ON master.sys.fn_dblog TO <user>;

GRANT VIEW SERVER STATE TO <login_name>;

GRANT VIEW ANY DEFINITION TO <login_name>;

-- Switch to the source database and grant the db_owner role
USE <db>;

CREATE USER <user> FOR LOGIN <login_name>;

EXEC sp_addrolemember 'db_owner', '<user>';

-- Enable CDC at the database level
EXEC sys.sp_cdc_enable_db;
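
SQL Server CDC must also be enabled on each source table. The CDC Script option in the
ingestion task (see section 4) can generate and execute this for you; if you prefer to enable it
manually, a statement such as the following works. The schema and table names (dbo.orders)
are placeholders for illustration only:

-- Enable CDC on one source table (run in the source database)
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'orders',
    @role_name = NULL;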

2.2. Target preparation (Snowflake Database)


2.2.1. Create user
● Create a Data Ingestion and Replication user. Use the following SQL statement:

create user {idmc_user} password = 'Xxxx@xxx';

● Create a new user role and grant it to the Data Ingestion and Replication user. Use the
following SQL statements:

create role INFACMI;


grant role INFACMI to user {idmc_user};

● Grant usage on the Snowflake virtual warehouse to the new role. Use the following
SQL statement:
grant usage on warehouse {warehouse_name} to role INFACMI;

● Grant usage on the Snowflake database to the new role. Use the following SQL
statement:

grant usage on database {database_name} to role INFACMI;

● Create a new schema. Use the following SQL statements:

use database {database_name};


create schema {schema_name};
● Grant create stream, create view, create table, and usage privileges on the new Snowflake
schema to the new role. Use the following SQL statement:
grant create stream, create view, create table, usage on schema {database_name}.{schema_name} to role INFACMI;

● Set the default role for the newly created user. Use the following SQL statement:
alter user {idmc_user} set default_role=INFACMI;
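
As an optional sanity check (not part of the original guide), you can confirm from a Snowflake
worksheet that the role, its grants, and the default role are in place:

-- Review the privileges granted to the new role and user
show grants to role INFACMI;

show grants to user {idmc_user};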

2.2.2. Configure Keypair


- To start, open a terminal window and generate a private key. You can generate either an
encrypted or an unencrypted version of the private key. To generate an unencrypted version,
use the following command:

openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

To generate an encrypted version, use the following command, which omits -nocrypt:

openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 des3 -inform PEM -out rsa_key.p8

Either command generates a private key in PEM format, for example:

-----BEGIN ENCRYPTED PRIVATE KEY-----


MIIE6T...
-----END ENCRYPTED PRIVATE KEY-----

From the command line, generate the public key by referencing the private key. The following
command assumes the private key is encrypted and contained in the file named rsa_key.p8.

openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

The command generates the public key in PEM format.


-----BEGIN PUBLIC KEY-----
MIIBIj...
-----END PUBLIC KEY-----
Execute an ALTER USER command to assign the public key to the Snowflake user. Copy only the
key body, excluding the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- delimiters.

ALTER USER {idmc_user} SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';
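
Optionally (this check is not part of the original guide), verify that the key was registered by
describing the user and inspecting the RSA_PUBLIC_KEY_FP property:

-- The RSA_PUBLIC_KEY_FP row shows a SHA256 fingerprint once the key is assigned
desc user {idmc_user};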


3. Create Connection on IDMC cloud
3.1. Azure SQL
● Log in to the IICS console and create a new connection

● Select the connection type as SQL Server and fill in the database information:

Where:
- Connection Name: Name of the connection.
- Runtime Environment: The name of the runtime environment where you want to run
the tasks. Specify a Secure Agent, Hosted Agent, or serverless runtime environment.
- SQL Server Version: Microsoft SQL Server database version.
- Authentication Type: SQL Server authentication.
- Domain: The domain name of the Windows user (used only with Windows authentication).
- User Name: User name for the database login. The user name can't contain a
semicolon.
To connect to Microsoft Azure SQL Database, specify the user name in the following
format: username@host
- Password: Password for the database login. The password can't contain a semicolon.
- Host: Name of the machine hosting the database server. To connect to Microsoft Azure
SQL Database, specify the fully qualified host name.
- Instance Name: Instance name of the Microsoft SQL Server database.
- Database Name: Database name for the Microsoft SQL Server connection.
- Schema: Schema used for the connection.

● After filling in the information, click on 'Test Connection'
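
Optionally, before testing the connection, you can confirm (this check is not part of the original
guide) that the login you entered has the privileges described in section 2.1. Run the following
against the source database while connected as that login:

-- Returns 1 if the login's database user is a member of db_owner
SELECT IS_ROLEMEMBER('db_owner') AS is_db_owner;

-- Returns 1 if CDC is enabled on the current (source) database
SELECT is_cdc_enabled FROM sys.databases WHERE name = DB_NAME();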

3.2. Snowflake
● Log in to the IICS console and create a new connection
● Select the connection type as Snowflake Data Cloud and fill in the database information

Where:
- Connection Name: Name of the connection.
- Runtime Environment: The name of the runtime environment where you want to run
tasks. Select a Secure Agent, Hosted Agent, or serverless runtime environment.
- Account: The name of the Snowflake account. For example, if your Snowflake URL is
https://app.snowflake.com/us-east-2.aws/<123abc>/dashboard, your account name is
123abc.us-east-2.aws. If you are unsure, see the example query after this list.
- Warehouse: The Snowflake warehouse name.
- Private Key File: Path to the private key file, including the private key file name, that
the Secure Agent uses to access Snowflake. Use the private key generated in section 2.2.2.
- Additional JDBC URL Parameters: Add the database and schema values to use when you
connect to Snowflake, for example:
db=mydb&schema=public

- Private Key File Password: Password for the private key file. Required only if you
generated an encrypted private key.
● After filling in the information, click on 'Test Connection'.
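
If you are unsure of the account and region values, one way (not part of the original guide) to
look them up is to run the following in a Snowflake worksheet:

-- Returns the account locator and region, for example ABC12345 and AWS_US_EAST_2
select current_account(), current_region();
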
4. Create Database Ingestion Task
● In the Informatica console, navigate to the Data Integration page.

● Click New → Data Ingestion and Replication → Database Ingestion and Replication Task → Create.

● Configure the task information:


Where:
- Name: Name of the database ingestion task.
- Runtime Environment: The name of the runtime environment where you want to run
tasks.
- Load Type: The load type determines the type of operation to use when the replication
task replicates data from the source to the target. An initial load task propagates a
point-in-time snapshot of data from a source to a target. An incremental load task propagates
data changes as well as column-level schema changes from a source to a target
continuously and in near real time; you can configure the start point from which to
capture changes. A combined initial and incremental load task performs an initial load of
point-in-time data to a target and then automatically switches to propagating data
changes and column-level schema changes to the target.
● In Step 2, select the tables to ingest from the data source:
● Select CDC Script. If the table has a primary key, you can choose Enable CDC for primary
key columns; if not, select Enable CDC for all columns → Execute → Next.

● In Step 3, fill in the target information → Next


● If you are not using a key pair to log in to Snowflake, uncheck the box.

● In Step 4, select the Replication options → Save.


● Proceed with the deployment.
● Go to My Jobs, find the task (search by name) → Run.
● Select the log and download it to review the job details.

● View the ingestion information for the ingested tables in the Object Detail view.
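
Once the job reports rows processed, a quick spot check on the Snowflake side (not part of the
original guide) is to query one of the replicated tables in the target schema; <table_name> is a
placeholder for one of the tables you selected in Step 2:

-- Row count of a replicated table in the target schema
select count(*) from {database_name}.{schema_name}.<table_name>;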
