Snowflake Data Cloud Connector
This software and documentation are provided only under a separate license agreement containing restrictions on use and disclosure. No part of this document may be
reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC.
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial
computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such,
the use, duplication, disclosure, modification, and adaptation is subject to the restrictions and license terms set forth in the applicable Government contract, and, to the
extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License.
Informatica, Informatica Cloud, Informatica Intelligent Cloud Services, PowerCenter, PowerExchange, and the Informatica logo are trademarks or registered trademarks
of Informatica LLC in the United States and many jurisdictions throughout the world. A current list of Informatica trademarks is available on the web at https://www.informatica.com/trademarks.html. Other company and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties. Required third party notices are included with the product.
The information in this documentation is subject to change without notice. If you find any problems in this documentation, report them to us at
[email protected].
Informatica products are warranted according to the terms and conditions of the agreements under which they are provided. INFORMATICA PROVIDES THE
INFORMATION IN THIS DOCUMENT "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT.
Table of Contents
Chapter 6: Targets for Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Target properties for Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Target objects and operations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Specify a target. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Override the update operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Optimize the .csv file size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Configure load properties in mappings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Capturing changed data from CDC sources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Configuring a mapping task to read from a CDC source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Step 1. Configure the source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Step 2. Configure the target. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Step 3. Configure the mapping task. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Viewing job statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Rules and guidelines for Snowflake Data Cloud target transformations. . . . . . . . . . . . . . . . . . . 38
Functions with Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Operators with Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Variables with Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Transformations with Snowflake Data Cloud. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Aggregator transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Expression transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Lookup transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Router transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Sequence Generator transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
SQL transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Union transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Update Strategy transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Snowflake Data Cloud sources, targets, and lookups. . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Amazon S3 V2 source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Google Cloud Storage V2 source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Microsoft Azure Data Lake Storage Gen2 source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Creating temporary view for source overrides. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Configuring target copy command options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Verify the pushdown query in the session log. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Preface
Use Snowflake Data Cloud Connector to learn how to read from or write to Snowflake. Learn to create a
connection, develop and run mappings, mapping tasks, and data transfer tasks in Data Integration. Learn how
to push down the transformation logic for processing to the Snowflake database.
Informatica Resources
Informatica provides you with a range of product resources through the Informatica Network and other online
portals. Use the resources to get the most from your Informatica products and solutions and to learn from
other Informatica users and subject matter experts.
Informatica Documentation
Use the Informatica Documentation Portal to explore an extensive library of documentation for current and
recent product releases. To explore the Documentation Portal, visit https://docs.informatica.com.
If you have questions, comments, or ideas about the product documentation, contact the Informatica
Documentation team at [email protected].
You can collaborate with other users and subject matter experts in the Informatica Intelligent Cloud Services Community:
https://network.informatica.com/community/informatica-network/products/cloud-integration
Developers can learn more and share tips at the Cloud Developer community:
https://network.informatica.com/community/informatica-network/products/cloud-integration/cloud-developers
Visit the Informatica Marketplace to try and buy Data Integration Connectors and other solutions: https://marketplace.informatica.com/
To search the Knowledge Base, visit https://search.informatica.com. If you have questions, comments, or
ideas about the Knowledge Base, contact the Informatica Knowledge Base team at
[email protected].
Subscribe to the Informatica Intelligent Cloud Services Trust Center to receive upgrade, maintenance, and
incident notifications. The Informatica Intelligent Cloud Services Status page displays the production status
of all the Informatica cloud products. All maintenance updates are posted to this page, and during an outage,
it will have the most current information. To ensure you are notified of updates and outages, you can
subscribe to receive updates for a single component or all Informatica Intelligent Cloud Services
components. Subscribing to all components is the best way to be certain you never miss an update.
For online support, click Submit Support Request in Informatica Intelligent Cloud Services. You can also use
Online Support to log a case. Online Support requires a login. You can request a login at
https://network.informatica.com/welcome.
The telephone numbers for Informatica Global Customer Support are available from the Informatica web site
at https://www.informatica.com/services-and-training/support-services/contact-us.html.
Chapter 1
Introduction to Snowflake Data Cloud Connector
You can use Snowflake Data Cloud Connector to read data from or write data to Snowflake. You can also read data from other applications, databases, and flat files and write the data to Snowflake.
You can also use Snowflake Data Cloud Connector to read data from and write data to Snowflake that is
enabled for staging data in Azure, Amazon, Google Cloud Platform, or Snowflake GovCloud.
When you use Snowflake Data Cloud Connector, you can create a Snowflake Data Cloud connection and use
the connection in Data Integration mappings and tasks.
When you run a Snowflake Data Cloud mapping or task, the Secure Agent writes data to Snowflake based on
the workflow and Snowflake Data Cloud connection configuration.
When you use Snowflake Data Cloud Connector, you can include Data Integration assets such as mappings, mapping tasks, and data transfer tasks.
For more information about configuring assets and transformations, see Mappings, Transformations, and
Tasks in the Data Integration documentation.
Chapter 2
Verify permissions
Permissions define the level of access for the operations that you can perform in Snowflake.
The following table lists the permissions that you require in the Snowflake account:
Object Permissions
Database Usage
Schema Usage, Create Table, Create View, Create Procedure, Create Sequence
Table All
Sequence All
For more information, see Access Control Privileges in the Snowflake documentation.
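For reference, the following is a minimal sketch of grants that satisfy these requirements, assuming a hypothetical role INFA_ROLE, database MYDB, and schema MYSCHEMA. Adapt the names and the scope of the grants to your environment:
-- Allow the role to use the database and to work in the schema
GRANT USAGE ON DATABASE MYDB TO ROLE INFA_ROLE;
GRANT USAGE, CREATE TABLE, CREATE VIEW, CREATE PROCEDURE, CREATE SEQUENCE ON SCHEMA MYDB.MYSCHEMA TO ROLE INFA_ROLE;
-- Allow full access to the existing tables and sequences in the schema
GRANT ALL ON ALL TABLES IN SCHEMA MYDB.MYSCHEMA TO ROLE INFA_ROLE;
GRANT ALL ON ALL SEQUENCES IN SCHEMA MYDB.MYSCHEMA TO ROLE INFA_ROLE;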
Prepare for authentication
You can configure standard, KeyPair, and authorization code authentication types to access Snowflake.
Before you configure authentication, complete the prerequisites. Get the account name, warehouse, and role
details from the Snowflake account.
Keep the following details handy based on the authentication type that you want to use:
• Standard authentication. Requires the Snowflake account user name and password.
• KeyPair authentication. Requires the public and private key pair, along with the Snowflake account name.
• Authorization code authentication. Requires at a minimum the client ID, authorization URL, access token
URL, and the access token. You must first create an authorization integration in Snowflake and then get
the authorization details.
When you use authorization code authentication, you can also use external OAuth such as PingFederate.
If the access token expires, the Informatica redirect URL, which is outside the customer firewall, connects to the endpoint and retrieves a new access token.
For more information about how to get the authorization details, see the Snowflake documentation.
For more information about configuring a key pair authentication for Snowflake, see the Snowflake
documentation.
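As an illustration only, after you generate the key pair, you typically assign the public key to the Snowflake user that the connection uses. The user name below is hypothetical and the key value is truncated:
-- Assign the public key (without the header and footer lines) to the Snowflake user
ALTER USER INFA_USER SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';
-- Verify the key fingerprint that Snowflake computed for the user
DESC USER INFA_USER;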
You can configure the Secure Agent to use the proxy server on Windows and Linux. You can use the
unauthenticated or authenticated proxy server.
To configure the proxy settings for the Secure Agent, use one of the following methods:
• Configure the Secure Agent through the Secure Agent Manager on Windows or shell command on Linux.
For instructions, see the topic "Configure the proxy settings on Windows" or "Configure the proxy settings
on Linux," in Getting Started in the Data Integration documentation.
• Configure the JVM options for the DTM in the Secure Agent properties. For instructions, see the
Knowledge Base article 000185646.
• Configure the proxy server properties in the additional JDBC URL parameters in the Snowflake connection.
For more information, see “Additional JDBC connection parameters” on page 16.
The AWS or Azure Private Link setup ensures that the connection to Snowflake uses the AWS or Azure
internal network and does not take place over the public Internet.
To connect to the Snowflake account over the private AWS network, see AWS Private Link and Snowflake.
To connect to the Snowflake account over the private Azure network, see Azure Private Link and Snowflake.
You can use the following authentication types to connect to Snowflake:
• Standard. Uses Snowflake account user name and password credentials to connect to Snowflake.
• Authorization Code. Uses the OAuth 2.0 protocol with Authorization Code grant type to connect to
Snowflake. Authorization Code allows authorized access to Snowflake without sharing or storing your
login credentials.
• KeyPair. Uses the private key file and private key file password, along with the existing Snowflake account
user name to connect to Snowflake.
You create a Snowflake Data Cloud connection on the Connections page. You can then use the connection when you read data from or write data to Snowflake.
Standard authentication
When you set up a Snowflake Data Cloud connection, configure the connection properties.
The following table describes the Snowflake Data Cloud connection properties for the Standard
authentication mode:
Property Description
Runtime Environment The name of the runtime environment where you want to run the tasks. You can specify a Secure Agent or a Hosted Agent.
Authentication The authentication method that the connector must use to log in to Snowflake.
Select Standard. Default is Standard.
Authorization code authentication
The following table describes the Snowflake Data Cloud connection properties for the Authorization Code authentication mode:
Property Description
Runtime Environment The name of the runtime environment where you want to run the tasks. You can specify a Secure Agent or a Hosted Agent.
Authentication The authentication method that Snowflake Data Cloud Connector must use to log in to Snowflake.
Select AuthorizationCode.
Authorization URL The Snowflake server endpoint that is used to authorize the user request.
The authorization URL is https://<account name>.snowflakecomputing.com/oauth/authorize, where <account name> specifies the full name of your account provided by Snowflake.
For example, https://<abc>.snowflakecomputing.com/oauth/authorize
Note: If the account name contains underscores, use the alias name.
You can also use the Authorization Code grant type that supports the authorization server in a
Virtual Private Cloud network.
Access Token URL The Snowflake access token endpoint that is used to exchange the authorization code for an access token.
The access token URL is https://<account name>.snowflakecomputing.com/oauth/token-request,
where <account name> specifies the full name of your account provided by Snowflake.
For example, https://<abc>.snowflakecomputing.com/oauth/token-request
Note: If the account name contains underscores, use the alias name.
Client ID Client ID of your application that Snowflake provides during the registration process.
Scope Determines the access control if the API endpoint has defined custom scopes.
Enter space-separated scope attributes.
For example, specify session:role:CQA_GCP as the scope to override the value of the default
user role. The value must be one of the roles assigned in Security Integration.
Access Token Parameters Additional parameters to use with the access token URL. Define the parameters in the JSON format.
For example, define the following parameters:
[{"Name":"code_verifier","Value":"5PMddu6Zcg6Tc4sbg"}]
Generate Token Generates the access token and refresh token based on the OAuth attributes you specified.
KeyPair authentication
The following table describes the Snowflake Data Cloud connection properties for the KeyPair authentication mode:
Connection property Description
Runtime Environment The name of the runtime environment where you want to run the tasks. You can specify a Secure Agent or a Hosted Agent.
Private Key File Path to the private key file, including the private key file name, that the Secure Agent uses to
access Snowflake.
Note: Verify that the keystore is FIPS-certified.
Additional JDBC connection parameters
You can configure the following properties as additional JDBC URL parameters:
• To override the database and schema name used to create temporary tables in Snowflake, enter the
database and schema name in the following format:
ProcessConnDB=<DB name>&ProcessConnSchema=<schema_name>
• To view only the specified database and schema while importing a Snowflake table, enter the database
and schema name in the following format:
db=<database_name>&schema=<schema_name>
• To read UDF string and numeric data from Snowflake, enter the database and schema where the UDF is
created in Snowflake in the following format:
db=<database_name>&schema=<schema_name>
• To access Snowflake through Okta SSO authentication, enter the web-based IdP implementing SAML 2.0
protocol in the following format:
authenticator=https://<Your_Okta_Account_Name>.okta.com
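For example, to view only a specific database and schema and to authenticate through Okta SSO, you might combine the parameters with an ampersand. The database, schema, and Okta account names are placeholders:
db=MYDB&schema=MYSCHEMA&authenticator=https://<Your_Okta_Account_Name>.okta.com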
A mapping defines reusable data flow logic that you can use in mapping tasks.
When you create a mapping, you define the Source, Target, and Lookup transformations to represent a
Snowflake Data Cloud object. Use the Mapping Designer in Data Integration to add the Source, Target, or
Lookup transformations in the mapping canvas and configure the Snowflake Data Cloud source, target, and
lookup properties.
You can add the following transformations and define the transformation details in a mapping:
Source transformation
Add a Source transformation to read data from Snowflake or other data sources. You can read from a
single or from multiple Snowflake tables. You can also use a query or parameter as the source type to
read from Snowflake.
You can use one or more Source transformations in a mapping. When you use a single transformation to
read from multiple tables, you can use a join. If you use two Source transformations in a mapping, use a
Joiner transformation to join the data.
Target transformation
Add a Target transformation to write data to Snowflake. You can use one or more Target
transformations in a mapping. When you configure a Target transformation, you can use a single
Snowflake target object. You can use a new or an existing Snowflake target object. You can also use
parameters to define the target connection and target object when you run the mapping task.
Lookup transformation
Add a Lookup transformation to retrieve data based on a specified lookup condition. The lookup object
can be a single object or query. You can also parameterize the lookup objects and connections.
For more information about the objects and the properties that you can configure for each of these transformations in a mapping, see Chapter 5, "Sources for Snowflake Data Cloud" on page 24, Chapter 6, "Targets for Snowflake Data Cloud" on page 28, and Chapter 7, "Lookups for Snowflake Data Cloud" on page 39.
You can also add other transformations to the mapping to transform the data. For more information about
configuring transformations, see Transformations in the Data Integration documentation.
SQL transformation
You can configure an SQL transformation in a Snowflake Data Cloud mapping to process SQL queries and
stored procedures in Snowflake.
When you add an SQL transformation to the mapping, on the SQL tab, you define the database connection
and the type of SQL that the transformation processes.
You can choose to use a parameterized connection in a SQL transformation. You can also override the values
defined in a parameter file at runtime. To use a parameterized connection in an SQL transformation, first
create an SQL transformation in a mapping that uses a valid connection. Then, parameterize the connection
in the SQL transformation. You can also use an SQL transformation to read from Java or SQL user-defined
functions (UDF) in Snowflake.
You can use an SQL transformation to process the following types of SQL statements:
Stored procedure
You can configure an SQL transformation to call a stored procedure in Snowflake. The stored procedure
name is case-sensitive. You can select the stored procedure from the database, or enter the exact name
of the stored procedure to call in the SQL transformation. The stored procedure must exist in the
Snowflake database before you create the SQL transformation.
When you specify the Snowflake database, schema, and procedure name in the advanced SQL
properties, the agent considers the properties specified in the stored procedure first, followed by the
advanced source properties, then the additional JDBC URL parameters in the connection, and finally the
source object metadata.
If you add a new stored procedure to the database when the mapping is open, the new stored procedure
does not appear in the list of available stored procedures. To refresh the list, close and reopen the
mapping.
SQL Query
You can configure an SQL transformation to process an entered query that you define in the SQL editor.
Do not use more than one SQL query in an SQL transformation.
The SQL transformation processes the query and returns the rows. The SQL transformation also returns
any errors that occur from the underlying database or if there is an error in the user syntax.
For more information about SQL queries and stored procedures, see Transformations in the Data Integration
documentation.
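For illustration, the following is a minimal sketch of a Snowflake stored procedure that an SQL transformation could call, followed by an entered query that the transformation could process. The database, schema, table, column, and procedure names are hypothetical:
-- A simple SQL stored procedure that updates stock and returns a status
CREATE OR REPLACE PROCEDURE MYDB.MYSCHEMA.UPDATE_STOCK(PRODUCT_ID NUMBER, QTY NUMBER)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
  UPDATE MYDB.MYSCHEMA.PRODUCTDET SET INSTOCK = INSTOCK - :QTY WHERE PRODUCTID = :PRODUCT_ID;
  RETURN 'updated';
END;
$$;

-- An entered query that the SQL transformation can process
SELECT PRODUCTID, INSTOCK FROM MYDB.MYSCHEMA.PRODUCTDET WHERE INSTOCK > 0;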
Consider the following rules and guidelines when you configure an SQL transformation:
• Mappings fail when the schema override that you specify results in multiple stored procedures that have the same name and number of arguments.
• Mappings fail when the stored procedure contains Unicode characters.
• If NULL is returned from a stored procedure, a warning appears in the user interface and the session log.
However, the mapping continues to process the rows.
• The runtime processing ignores the following properties in an SQL transformation:
- In-out and input parameters
- Advanced properties
- Auto commit
- Transformation scope
- Stop on error
A schema change includes one or more of the following changes to the data object: fields added, deleted, or renamed, and fields updated for data type, precision, or scale.
Configure schema change handling on the Schedule page when you configure the task. You can configure
asynchronous or dynamic schema change handling.
When you configure dynamic schema change handling, you can choose from the following options to refresh
the schema:
Alter and apply changes
Data Integration applies the following changes from the source schema to the target schema:
• New fields. Alters the target schema and adds the new fields from the source.
• Renamed fields. Adds renamed fields as new columns in the target.
• Data type and precision updates. Applies these changes to the target. Updates to the scale are not
applicable.
• Deleted fields. Ignores deleted fields.
Don't apply DDL changes
Data Integration does not apply the schema changes to the target.
Drop current and recreate
Drops the existing target table and then recreates the target table at runtime using all the incoming metadata fields from the source.
Consider the following rules when you enable dynamic schema change handling:
• Do not include an override for the target or source from the advanced properties. If the mapping is
enabled for dynamic schema handling and contains a source or target override, the mapping fails with the
following error: Exception: class java.lang.NullPointerException occurred in writeMetadata
However, you can specify an SQL override in mappings enabled for dynamic schema handling.
Mapping example
An enterprise application uses the Oracle database to store product transaction details. You want to check
the availability of stocks based on completed and pending transactions from the source data. You want to
integrate the available stocks and transaction details to Snowflake for further analysis.
Mapping
The following image shows the Snowflake Data Cloud mapping for this use case:
You create a mapping to read the product transaction details from an Oracle source and apply a lookup
condition on the PRODUCTDET table in Snowflake, which stores details of the product and its availability.
Based on the availability and requirement, you write the transactions to the PENDINGTRANSACTION and
COMPLETEDTRANSACTION tables in Snowflake and update the INSTOCK field in the PRODUCTDET table
for the completed transactions.
You use the following transformations in the Snowflake Data Cloud mapping:
Source transformation
To read the product transaction details from an Oracle source, include an Oracle connection in the
Source transformation to connect to Oracle. The source object for the mapping task is an OracleSrc
table from Oracle.
The following image shows the transaction details stored in the OracleSrc table that you want to read:
Lookup transformation
The lookup object for the mapping task is the PRODUCTDET table in Snowflake, which stores details of the product and its availability. Apply the lookup condition on the PRODUCTDET table based on the product ID.
The following image shows the data stored in the PRODUCTDET table:
Expression transformation
Router transformation
The Router transformation filters data based on the availability of stocks and redirects completed
transactions, pending transactions, and product details to the appropriate target tables.
Target transformation
The mapping task has the following target objects to write the completed transactions, pending
transactions, and product details:
COMPLETEDTRANSACTION
The following image shows the data stored in the COMPLETEDTRANSACTION table:
PENDINGTRANSACTION
PRODUCTDET
The PRODUCTDET table includes the PRODUCTID, INSTOCK, PRODUCTDET, and PRICE fields. Based
on the completed transactions, the INSTOCK field is updated.
The following image shows the data stored in the PRODUCTDET table:
When you run the mapping, the agent reads the transaction details from the source, fetches fields from the lookup, and, based on the conditions applied, writes the available quantity and transaction details to the target tables.
Chapter 5
Sources for Snowflake Data Cloud
The following table describes the Snowflake Data Cloud source properties that you can configure in a Source
transformation:
Property Description
Parameter A parameter file where you define values that you want to update without having to edit the task.
Select an existing parameter for the source object, or click New Parameter to define a new parameter
for the source object.
The Parameter property appears only if you select parameter as the source type.
If you want to overwrite the parameter at runtime, select the Allow parameter to be overridden at run
time option when you create a parameter. When the task runs, it uses the parameters from the file that
you specify in the task advanced session properties.
The following table describes the advanced properties that you can configure in a Source transformation:
Advanced Property Description
Role Overrides the Snowflake role assigned to user you specified in the connection.
Warehouse Overrides the warehouse name you specify in the connection.
Even though you provide an incorrect warehouse name in the connection properties, the connection is successful. However, before you run the mapping, ensure that you specify the correct warehouse name in the mapping properties.
Pre SQL The pre-SQL command to run on the Snowflake source table before the agent reads the data.
For example, if you want to update records in the database before you read the records from the
table, specify a pre-SQL statement.
The query must include a fully qualified table name. You can specify multiple pre-SQL commands,
each separated with a semicolon.
Post SQL The post-SQL command to run on the Snowflake table after the agent completes the read
operation.
For example, if you want to delete some records after the latest records are loaded, specify a post-
SQL statement.
The query must include a fully qualified table name. You can specify multiple post-SQL commands,
each separated with a semicolon.
Table Name Overrides the table name of the imported Snowflake source table.
SQL Override The SQL statement to override the default query used to read data from the Snowflake source.
Tracing Level Determines the amount of detail that appears in the log file. You can select Terse, Normal, Verbose
Initialization, or Verbose Data. Default value is Normal.
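For example, the following are sketches of pre-SQL and post-SQL commands for the source, each with a fully qualified table name and separated with a semicolon where more than one command is used. The database, schema, and table names are placeholders:
-- Pre SQL: update records before the agent reads from the table
UPDATE MYDB.MYSCHEMA.ORDERS SET STATUS = 'READY' WHERE STATUS IS NULL;
-- Post SQL: delete processed records and record the run after the read completes
DELETE FROM MYDB.MYSCHEMA.ORDERS WHERE STATUS = 'PROCESSED'; INSERT INTO MYDB.MYSCHEMA.AUDIT_LOG (RUN_TIME) SELECT CURRENT_TIMESTAMP;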
You can pass the db=<dbname>&schema=<schemaname> values in the Additional JDBC URL Parameters field in the Snowflake Data Cloud connection.
To read from multiple tables using a single Source transformation, select Multiple Objects as the source type
and then configure a join to combine the tables. You can either add related objects with PK-FK relationships
that are already defined or you can define a relationship condition to join the tables. To set your own
conditions to define the relationship between the tables, select Advanced Relationship from the Related
Objects Actions menu, and then define the relationship. When you configure a join expression, select the
fields and define a join query syntax. You must specify only the condition and not the type of join in the query.
The condition you specify in the text box for the expression is appended to the join condition.
When you specify a join condition in the advanced relationship to join the tables, you cannot override the
database and schema names from the connection. You need to manually change the database and schema
name in the advanced relationship condition. If the condition includes columns with a fully qualified name
such as db.schema.tablename, do not configure an override. Delete the fully qualified database and schema
names from the advanced relationship condition and then run the mapping.
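For example, if you join hypothetical ORDERS and CUSTOMERS tables on the customer ID, the condition that you append in the advanced relationship might look like the following sketch. Specify only the condition, not the join type, and remove the database and schema qualifiers if you configure an override:
ORDERS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID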
Consider the following rules and guidelines when you read from multiple Snowflake objects:
• A mapping fails when you read data from multiple tables joined using related objects and the table and column names are case sensitive.
• A mapping configured with a join for one or more tables that have the same column names fails.
• A mapping that reads from multiple Snowflake objects that do not belong to the same database and
schema fails.
• A mapping configured with a join that reads from multiple tables fails if you specify an override to the
table, schema, or database in the Snowflake source advanced properties.
• Do not configure an override for the database and schema names in the connection.
• The following operations are not applicable for the query source type:
- Filter and sort options.
• You cannot read or write Snowflake table and column names that contain double quotes. The mapping
fails with the following error: SQL compilation error
• You cannot use system variables in filters.
Overriding SQL
When you override the object, database, schema, or role, you must follow some guidelines.
If you specify an SQL override query to override the custom query used for importing the metadata from
Snowflake tables, specify a fully qualified table name.
When you specify both an SQL override and overrides to the database, schema, warehouse, or role from the
advanced source properties, consider the following guidelines for the override combinations:
To override the SQL and the role, you can perform one of the following tasks:
• Grant the overriding role with the same access permissions as the role used for the Snowflake object that
you selected in the mapping.
• Specify an override to the table from the advanced properties. The specified override table is used in the
design time and the SQL override takes precedence at runtime.
Specify a valid table name that corresponds to the overriding database and schema name.
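For example, a hedged sketch of an SQL override query that uses a fully qualified table name. The database, schema, table, and column names are placeholders:
SELECT ORDER_ID, ORDER_DATE, AMOUNT FROM MYDB.MYSCHEMA.ORDERS WHERE ORDER_DATE >= '2023-01-01';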
Chapter 6
Targets for Snowflake Data Cloud
The following table describes the Snowflake Data Cloud target properties that you can configure in a Target transformation:
Property Description
Parameter A parameter file where you define values that you want to update without having to edit the task.
Select an existing parameter for the target object or click New Parameter to define a new
parameter for the target object.
The Parameter property appears only if you select parameter as the target type.
If you want to overwrite the target object at runtime, select the Allow parameter to be overridden
at run time option. When the task runs, it uses the parameters from the file that you specify in the
task advanced session properties.
Object The target object for the task. Select the target object.
You can either select an existing table or create a new table. You can write data by selecting an
existing table or creating a new table in the target by using the Create New at Runtime option.
Create New at Runtime Creates a Snowflake target table at runtime based on the table type and the path you specify.
To create a target table at runtime, provide the following parameters:
- Optional. Specify the table type as table.
- In the Path field, specify the Snowflake database name and schema in the following format:
<database name>/<schema>
The agent creates the target table based on the object name and the path you specify.
Note: You can edit the metadata of the source fields before creating the target.
Operation The target operation. Select Insert, Update, Upsert, Delete, or Data Driven.
Update columns The temporary key column to update data to or delete data from a Snowflake target.
Data Driven Condition Enables you to define expressions that flag rows for an insert, update, upsert, or delete operation.
The following table describes the advanced properties that you can configure in a Target transformation:
Advanced Property Description
UpdateMode Loads data to the target based on the mode you specify.
Applicable when you select the Update operation or the Data Driven operation.
Select from one of the following modes:
- Update As Update. Updates all rows flagged for update if the entries exist.
- Update Else Insert. The agent first updates all rows flagged for update if the entries exist in the
target. If the entries do not exist, the agent inserts the entries.
Database Overrides the database that you used to import the object.
Schema Overrides the schema that you used to import the object.
Role Overrides the Snowflake role assigned to the user specified in the connection.
Pre SQL The pre-SQL command to run before the agent writes to Snowflake.
For example, if you want to assign sequence object to a primary key field of the target table
before you write data to the table, specify a pre-SQL statement.
You can specify multiple pre-SQL commands, each separated with a semicolon.
Post SQL The post-SQL command to run after the agent completes the write operation.
For example, if you want to alter the table created by using create target option and assign
constraints to the table before you write data to the table, specify a post-SQL statement.
You can specify multiple post-SQL commands, each separated with a semicolon.
Truncate Target Table Truncates the database target table before inserting new rows. Select one of the following options:
- True. Truncates the target table before inserting all rows.
- False. Inserts new rows without truncating the target table.
Default is false.
Table Name Overrides the table name of the Snowflake target table.
Update Override Overrides the default update query that the agent generates for the update operation with the update query that you specify.
Forward Rejected Rows Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the agent forwards rejected rows to the next transformation.
When you choose the target type, you can select the operation to insert, update, upsert, or delete data in a
Snowflake target. You can also use the data driven operation to define expressions that flag rows for an
insert, update, delete, or reject operation.
Update columns
You need to specify the temporary key column to update data to or delete data from a Snowflake target. If
the Snowflake target does not include a primary key column, click Add to add a temporary key. You can
select multiple columns.
If the records from the source tables contain duplicate primary keys, perform one of the following tasks in
mappings to update or delete records in Snowflake:
• Before you import the target table, define multiple primary keys in the target table.
• Define more than one custom key for the target object using the Update Columns option in the advanced
target properties.
Specify a target
You can use an existing target or create a new target to write data to Snowflake. If you choose to create a
new target, the agent creates the target when it runs the task.
Ensure that the path for the database and schema that you specify for the target object is in uppercase letters. Specify the Snowflake database name and schema in the following format: <database name>/<schema name>. If you do not enter the path, the Secure Agent considers the schema and database name that you specified in the Additional JDBC URL Parameters field of the Snowflake Data Cloud connection properties.
The following image shows an example of a target object that is configured to be created at runtime:
• If the Secure Agent is installed on Windows, you cannot write data to a Snowflake target table when the
table names contain the following special characters: /\:*?"<>|
Override the update operation
When you configure an update override, the Secure Agent uses the query that you specify, stages the data in
files, and then loads that data into a temporary table using the Snowflake's loader copy command. The data
from the temporary table is then loaded to the Snowflake target table. The syntax that you specify for the
update query must be supported by Snowflake.
The Secure Agent replaces :TU. with a temporary table name while running the mapping and does not
validate the update query.
When you configure an update override in a mapping to write to Snowflake, consider the following rules:
• Ensure that the column names for :TU match the target table column names.
• Ensure that the column names are fully qualified names.
• Specify the update query with a valid SQL syntax because Snowflake Data Cloud Connector replaces :TU
with a temporary table name and does not validate the update query.
• Do not change the order of the columns in the mappings when you configure the update override option.
• The update query in the mapping must not contain fields that are not connected to the target.
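The following is a minimal sketch of the general shape of an update override query that uses :TU and fully qualified column names. The table and column names are placeholders; verify the exact syntax against your target, because the agent does not validate the query:
UPDATE MYDB.MYSCHEMA.PRODUCTDET SET INSTOCK = :TU.INSTOCK WHERE MYDB.MYSCHEMA.PRODUCTDET.PRODUCTID = :TU.PRODUCTID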
Configure load properties in mappings
The following table lists some of the additional runtime parameters that you can specify to load data to
Snowflake:
Property Description
copyEmptyFieldAsEmpty The COPY command option to set incoming empty fields as null.
Type is Boolean.
onError=ABORT_STATEMENT&oneBatch=true Loads the entire data in a single batch and stops the task if an error occurs. Simultaneously, validates the user-specified reject file path and writes the error records to this file and to the session log.
Type is onError - String or oneBatch - Boolean.
When you set the values in the additional runtime parameters field, every configured partition initializes a
new loader instance and the configured values apply similarly across all the partitions.
If you specify both the options as true, Snowflake considers the compressDataBeforePut option.
Specify the copyEmptyFieldAsEmpty Boolean option and set the value to true or false based on your
requirement.
Consider the following scenarios before you configure the copyEmptyFieldAsEmpty Boolean parameter:
• If you do not configure this parameter, Null values are received as NULL, and empty values are received as
Empty. This is the default behavior.
• If you set the parameter copyEmptyFieldAsEmpty=false, Null values are received as Null and empty values are received as Null.
• If you set the parameter copyEmptyFieldAsEmpty=true, Null values are received as empty, while empty
values are received as empty.
While loading in a single batch, if an error occurs, the Secure Agent checks for the specified reject file name,
runs the COPY command, validates the reject file, and then passes the file name to capture the errors, if any.
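For example, a sketch of the additional runtime parameters field that combines these options, with ampersand-separated values:
copyEmptyFieldAsEmpty=true&onError=ABORT_STATEMENT&oneBatch=true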
Capturing changed data from CDC sources
When you configure a mapping, add the CDC sources and then run the associated mapping task to write the
changed data to Snowflake. If you define a column as required in the Snowflake target table, map a column in
the CDC source to the required column in the Snowflake target in the mapping before you run the task.
When the mapping task processes the changed data from a CDC source, Snowflake Data Cloud Connector
creates a state table in Snowflake. When the changed data is received from the CDC source, Snowflake Data
Cloud Connector uploads the changed data to the staging table. Then, it generates a Job_Id and writes the
Job_Id to the state table along with the restart information.
The connector then merges the stage table with the actual target table in Snowflake. Each time you run the
mapping task, Snowflake Data Cloud Connector creates the state table, if it does not exist, to store the state
information.
Snowflake Data Cloud Connector uses the following naming convention for the tables:
Mapping
You can use a Snowflake target in a CDC mapping to write data from CDC sources. You cannot use a
Snowflake source or lookup in a CDC mapping.
The optimization property that you set for staging data in the DTM of the Secure Agent is not applicable.
If you run a mapping with both CDC and staging property enabled, the mapping runs successfully.
However, staging optimization is disabled and you can view a message logged in the session logs.
When you run a CDC mapping to write data from a CDC source to a Snowflake target created at runtime,
you might encounter the following error:
Error occured while initializing CCI State Handler
com.informatica.cci.runtime.internal.utils.impl.CExceptionImpl: Internal error: Recovery
Init failed
To avoid this error, you must have the grant permissions to create a table in Snowflake.
Add the CDC source in the mapping, and then run the associated mapping task to write the changed data to
the Snowflake target. You can also configure multiple pipelines in a single mapping to write the captured
changed data to a Snowflake target.
When you configure a mapping to write changed data from a CDC source to Snowflake, you can configure the
following advanced properties in the Snowflake Target transformation:
• Database
• Schema
• Warehouse
• Role
• Pre SQL
• Post SQL
• Truncate target table
• Table name
• Update override
Step 1. Configure the source
1. In the Source transformation, specify a name and description in the general properties.
2. In the Source tab, select any configured CDC connection and specify the required source properties.
Step 2. Configure the target
You can only configure a single Snowflake Data Cloud target transformation in a mapping to write changed
data from a CDC source.
1. On the Target tab, perform the following steps to configure the target properties:
a. In the Connection field, select the Snowflake Data Cloud connection.
b. In the Target Type field, select the type of the target object.
c. In the Object field, select the required target object.
d. In the Operation field, select Insert or Data Driven.
Note: Update, upsert, and delete target operations are not applicable. Ensure that the target tables
do not have a primary key defined in Snowflake.
e. If you select Data Driven Condition, specify the DD_INSERT condition.
Note: Ensure that the target tables do not have a primary key defined in Snowflake.
f. Keep the Update Column field empty.
It is recommended that the source object contains a primary key.
g. Configure the applicable advanced target properties for the CDC mode.
2. On the Field Mapping tab, map the incoming fields to the target fields. You can manually map an
incoming field to a target field or automatically map fields based on the field names.
If you define a column as required in the Snowflake target table, map a column in the CDC source to the
required column in the Snowflake target in the mapping.
Viewing job statistics
You can view the statistics of a Snowflake Data Cloud job on the My Jobs page. A job can have one of the following statuses:
• Success. The Secure Agent applied all rows of insert, update, and delete operations.
• Warning. The Secure Agent rejected one or more rows. The Rows Processed field in the My Jobs page
reflects the total number of rows that the Secure Agent processed.
• Failed. The job did not complete because it encountered errors.
The following image shows the My Jobs page with the details of the status and the number of processed rows for a Snowflake Data Cloud job:
To view how many among the processed rows were a success and how many resulted in an error, select the
specific instance name and view the Results section. You can view the number of success rows and error
rows.
The following image shows the details of the Snowflake Data Cloud task:
You can download the session log to get details of the number of output rows, affected rows, applied rows,
and rejected rows.
You might also encounter the following scenarios of target statistics for Snowflake Data Cloud write
operations:
Constraint violation
In insert, update, or delete operation scenarios where the Secure Agent rejects rows due to a constraint
violation, a warning appears in the Job Properties page. You can download the session log to view the
target statistics.
In update or delete operation scenarios where the Secure Agent does not find a match for some records,
that number does not reflect in the My Jobs page and the session log. For example, if there are 5 input
rows and the Secure Agent updates only 4 target rows, the status of the number of processed rows still
reflects as 5. This issue occurs when Snowflake does not return an error message for rejected rows.
Non-unique match
In update or delete operation scenarios where the Secure Agent updates or deletes more rows because of a non-unique match, the actual number of updated or deleted records does not reflect in either the My Jobs page or the session log. For example, if there were 5 input records and the Secure Agent updated 10 target rows, the My Jobs page reflects only 5 processed rows.
The number of success rows for the target object in the Job Properties page is not updated and remains
the same as the number of incoming rows. For example, while writing 5 records to the target, if two
records are rejected, the number of success rows still reflects as 5.
• If some records are rejected while writing large amounts of data to Snowflake, the rejected file might not
display some of the rejected records even though the statistics of rejected records appear correctly in the
session logs.
Chapter 7
Lookups for Snowflake Data Cloud
You can look up Snowflake data values based on a condition that you configure. In the Lookup
transformation, select the lookup connection and object. Then, define the lookup condition and the outcome
for multiple matches.
The mapping queries the lookup source based on the lookup fields and the defined lookup condition. The
lookup operation returns the result to the Lookup transformation, which then passes the results downstream.
• Connected. You can use a cached or uncached connected lookup for mappings. You can also use a
dynamic lookup cache to keep the lookup cache synchronized with the target.
• Unconnected. You can use a cached lookup. You need to supply input values for an unconnected Lookup transformation from a :LKP expression in another transformation, such as an Expression transformation.
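For example, a sketch of a :LKP expression that an Expression transformation might use to call an unconnected lookup. The lookup transformation name and input field are hypothetical:
:LKP.lkp_PRODUCTDET(PRODUCTID)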
For more information about Lookup transformation, see Transformations in the Data Integration
documentation.
Property Description
Source Type Type of the source object. Select Single Object, or Parameter.
Parameter A parameter file where you define values that you want to update without having to edit the task.
Select an existing parameter for the lookup object or click New Parameter to define a new parameter
for the lookup object.
The Parameter property appears only if you select parameter as the lookup type.
If you want to overwrite the parameter at runtime, select the Allow parameter to be overridden at run
time option.
When the task runs, it uses the parameters from the file that you specify in the task advanced
session properties.
The following table describes the Snowflake Data Cloud lookup object advanced properties that you can
configure in a Lookup transformation:
Advanced Property Description
Role Overrides the Snowflake role assigned to the user specified in the connection.
Parameterization
When you parameterize lookup objects and connections, you must follow some rules to override the object
name.
If you enable the Allow parameter to be overridden at run time option in a transformation, you cannot
override the object name using the fully qualified name such as db.schema.tablename.
You must pass the db=<dbname>&schema=<schemaname> values in the Additional JDBC URL Parameters field in
the Snowflake Data Cloud connection.
• If the On Multiple Matches option is set to Report error, the task does not fail.
• When you configure a lookup override without caching, the return first row and return last row multiple
matches options are not applicable.
• Mappings configured for an uncached lookup and multiple matches fail when the Snowflake source contains case-sensitive table and column names.
When you select Lookup Caching Enabled, Data Integration queries the lookup source once and caches the
values for use during the session, which can improve performance. You can specify the directory to store the
cached lookup. You can configure dynamic and persistent lookup caches.
For information about lookup caching, see the chapter "Lookup transformations" in Transformations in the
Data Integration documentation.
• When you configure dynamic lookup cache, set the On Multiple Matches property to Report Error. To reset
the property, change the dynamic lookup to a static lookup, change the property, and then change the
static lookup to a dynamic lookup. Ensure that all the lookup fields are mapped.
• A lookup condition in dynamic lookups can use only an equal operator.
• Pushdown optimization is not applicable.
• You cannot parameterize a Lookup transformation enabled to use dynamic lookup cache.
Data Integration logs the uncached lookup queries for a connected and unconnected lookup in the session
log. Enable the Lookup Caching Enabled property in the mapping, and then enable the verbose mode in the
Snowflake Data Cloud mapping task.
When you run the mapping, Data Integration logs the uncached lookup queries in the session logs.
Pushdown optimization
You can use pushdown optimization to push the transformation logic to the Snowflake database.
When you run a task configured for pushdown optimization, Data Integration converts the transformation
logic to an SQL query or a Snowflake command. Data Integration sends the query to the database, and the
database runs the query. The amount of transformation logic that Data Integration pushes to the database
depends on the database, the transformation logic, and the mapping configuration. Data Integration
processes all transformation logic that it cannot push to a database.
Configure pushdown optimization for a mapping in the tasks properties. Full pushdown optimization is
enabled by default in mapping tasks.
The source or target database executes the SQL queries or Snowflake commands to process the
transformations. The amount of transformation logic you can push to the database depends on the database,
transformation logic, and mapping and session configuration. Data Integration processes all the
transformation logic that it cannot push to a database.
Full
Data Integration pushes down as much transformation logic as possible to process in the target
database.
For mappings that read from and write to Snowflake, Data Integration analyses all the transformations
from the source to the target. If all the transformations are compatible in the target, it pushes the entire
mapping logic to the target. If it cannot push the entire mapping logic to the target, Data Integration first
pushes as much transformation logic to the source database and then pushes as much transformation
logic as possible to the target database.
Note: If the source and target Snowflake accounts are separate and reside in different regions but are
hosted on the same cloud platform, you can configure full pushdown optimization. However, ensure that
the Snowflake account user and role of the target Snowflake account has access to the Snowflake
source account.
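As an illustration only, when the entire mapping logic is pushed down, the generated SQL typically has the shape of a single INSERT ... SELECT statement that Snowflake executes. The table and column names are placeholders, and the actual query that Data Integration generates depends on the mapping:
INSERT INTO MYDB.MYSCHEMA.SALES_SUMMARY (PRODUCT_ID, TOTAL_AMOUNT)
SELECT PRODUCT_ID, SUM(AMOUNT)
FROM MYDB.MYSCHEMA.SALES
GROUP BY PRODUCT_ID;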
Source
Data Integration pushes down as much of the transformation logic as possible to process in the source database.
The following table lists the pushdown optimization support for the source and target endpoint combinations:
Snowflake source and Snowflake target Reads from and writes to Snowflake tables using the Snowflake Data Cloud connection. You can also read from Snowflake external tables, views, and materialized views. Pushdown optimization type is Source or Full.
Amazon S3 source and Snowflake target Reads from Amazon S3 using an Amazon S3 V2 connection and writes to Snowflake using a Snowflake Data Cloud connection. Pushdown optimization type is Full.
Google Cloud Storage source and Snowflake target Reads from Google Cloud Storage using a Google Cloud Storage V2 connection and writes to Snowflake using a Snowflake Data Cloud connection. Pushdown optimization type is Full.
Microsoft Azure Data Lake Storage Gen2 source and Snowflake target Reads from Microsoft Azure Data Lake Storage Gen2 using a Microsoft Azure Data Lake Storage Gen2 V2 connection and writes to Snowflake using a Snowflake Data Cloud connection. Pushdown optimization type is Full.
You can use the Secure Agent or Hosted Agent to push mapping logic to the database.
Note: You can also configure pushdown for a mapping that uses a Snowflake ODBC connection to read from
and write to Snowflake. Informatica recommends that you use the Snowflake Data Cloud connection in
mappings to configure pushdown optimization. If you cannot push specific transformation logic using the
Snowflake Data Cloud connection, you can explore configuring pushdown optimization using the Snowflake
ODBC connection. The Snowflake ODBC connection uses the Snowflake ODBC 64-bit drivers on Windows and
Linux systems. For more information, see the How-To Library article,
Configuring pushdown optimization for Snowflake using the ODBC Connector.
Google Cloud Storage V2 source
The Storage Integration contains the details of the enabled Google Cloud Storage buckets from which you
want to read data. Snowflake Data Cloud Connector creates a temporary external stage that uses the Cloud
Storage Integration you created.
After you create the Cloud Storage Integration in Snowflake, specify the Cloud Storage Integration name in
the Snowflake Data Cloud connection properties. Specify the name in the Additional JDBC URL Parameters
field. The Storage Integration value is case-sensitive.
You can refer to the Snowflake documentation for the steps to create the Cloud Storage Integration and grant Snowflake access to the bucket.
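For reference, a minimal sketch of creating a Cloud Storage Integration for Google Cloud Storage in Snowflake. The integration name and bucket path are placeholders; see the Snowflake documentation for the complete procedure:
CREATE STORAGE INTEGRATION GCS_INT
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://my_bucket/my_path/');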
Microsoft Azure Data Lake Storage Gen2 source
The Storage Integration contains the details of the enabled Microsoft Azure Data Lake Storage Gen2
container from which you want to read data. The Snowflake Data Cloud Connector creates a temporary
external stage that uses the Cloud Storage Integration you created.
After you create the Cloud Storage Integration in Snowflake, specify the Cloud Storage Integration name in
the Snowflake Data Cloud connection properties. Specify the name in the Additional JDBC URL Parameters
field. The Storage Integration value is case-sensitive.
Configuring storage integration for Microsoft Azure Data Lake Storage Gen2
Create a storage integration to allow Snowflake to read data from the Microsoft Azure Data Lake Storage
Gen2 container.
You can refer to the Snowflake documentation to perform the following steps:
1. Run the DESCRIBE INTEGRATION command to retrieve the consent URL:
desc storage integration <integration_name>;
where integration_name is the name of the integration you created.
The URL in the AZURE_CONSENT_URL column has the following format:
https://login.microsoftonline.com/<tenant_id>/oauth2/authorize?client_id=<snowflake_application_id>
Copy the value in the AZURE_MULTI_TENANT_APP_NAME column. This is the name of the Snowflake
client application created for your account. You need this information to grant this application the
required permissions to get an access token for the storage locations.
2. In a web browser, navigate to the URL in the AZURE_CONSENT_URL column.
The page displays a Microsoft permissions request page.
3. Click Accept.
This allows the Azure service principal created for your Snowflake account to obtain an access token on
any resource inside your tenant. The access token is generated successfully only if you grant the service
principal the appropriate permissions on the container.
4. Log into the Microsoft Azure portal.
5. Navigate to Azure Services > Storage Accounts, and then click the name of the storage account to which you want to grant the Snowflake service principal access.
6. Click Access Control (IAM) > Add Role Assignment.
7. Select the required role to grant to the Snowflake service principal:
• Storage Blob Data Reader: Grants read access only. You can load data from files staged in the
storage account.
• Storage Blob Data Contributor: Grants read and write access. You can load data from or unload data
to files staged in the storage account.
8. Search for the Snowflake service principal.
This is the identity in the AZURE_MULTI_TENANT_APP_NAME property in the DESC STORAGE
INTEGRATION output in Step 1. It might take an hour or longer for Azure to create the Snowflake service
principal requested through the Microsoft request page. If the service principal is not available
immediately, it is recommended that you wait for an hour or two and then search again. If you delete the
service principal, the storage integration stops working.
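For reference, the storage integration that you describe in step 1 is typically created in Snowflake with statements similar to the following sketch. The integration name, tenant ID, storage account, and container are placeholders:
CREATE STORAGE INTEGRATION AZURE_INT
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant_id>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://<storage_account>.blob.core.windows.net/<container>/');
DESC STORAGE INTEGRATION AZURE_INT;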
After you select the required pushdown optimization options and run the preview, Data Integration creates
and runs a temporary pushdown preview mapping task. When the job completes, Data Integration displays
the SQL queries to be executed and any warnings in the Pushdown Optimization panel. The warning
messages help you understand which transformations in the configured mapping are not applicable for
pushdown optimization. If pushdown optimization fails, Data Integration lists any queries generated up to the
point of failure. You can edit the mapping and fix the required transformations before you run the mapping
for pushdown optimization.
You can also view the temporary job created under My Jobs and download the session log to view the
queries generated.
For more information about how to preview pushdown optimization, see the topic "Pushdown optimization
preview" in Mappings in the Data Integration documentation.
You can specify the optimization context type on the Schedule tab in the task properties. Based on the optimization context that you specify, Data Integration combines the queries issued for multiple targets and constructs a single query for pushdown optimization.
Enable this mode when you insert data from a Snowflake source to multiple Snowflake targets. Data
Integration combines the queries generated for each of the targets and issues a single query.
Enable this mode when you write to two Snowflake targets, where you use one target to insert data and
the other target to update data. Data Integration combines the queries for both the targets and issues a
Merge query.
Default is None.
The following image shows a mapping that writes slowly changing dimension data to a Snowflake target
table:
Add Lookup and Expression transformations to compare source data against the existing target data. You enter the lookup conditions and the source columns that you want Data Integration to compare against the existing target.
For each source row without a matching primary key in the target, the Expression transformation marks the row as new. For each source row with a matching primary key in the target, the Expression transformation compares user-defined source and target columns. If those columns do not match, the Expression transformation marks the row as changed.
The mapping then splits into two data flows.
The first data flow uses the Router transformation to pass only new rows to the Expression transformation.
The Expression transformation inserts new rows to the target. A Sequence Generator creates a primary key
for each row. The Expression transformation increases the increment between keys by 1,000 and creates a
version number of 0 for each new row.
In the second data flow, the Router transformation passes only changed rows to the Expression transformation. The Expression transformation inserts the changed rows into the target and increments both the key and the version number by one.
Restrictions
You cannot use a Filter transformation, a Joiner transformation, or a custom SQL query in an SCD Type 2 merge mapping.
To use cross-schema pushdown optimization, create two connections and specify the schema in each connection. Ensure that the schema in the source connection is different from the schema in the target connection and that both schemas belong to the same database.
You can configure cross-database pushdown optimization in the mapping task. Ensure that the Snowflake source and target transformations in the mapping use two different Snowflake Data Cloud connections or Snowflake ODBC connections.
You can set the task to fail or run without pushdown optimization. You can use this functionality only when
you configure full pushdown optimization for a task that runs using the Snowflake Data Cloud connection in
the Source and Target transformations.
1. In the Pushdown Optimization section on the Schedule tab, set the pushdown optimization value to Full
for the selected mapping.
2. To determine what action Data Integration must perform when pushdown optimization does not occur,
enable or disable the If pushdown optimization is not possible, cancel the task option based on your
requirement:
a. Enable to cancel the task when pushdown does not occur.
b. Disable to run the task without pushdown optimization.
Default is disabled.
Use the Clean Stop option on the My Jobs page in Data Integration and the All Jobs and Running Jobs page
in Monitor.
See the following exceptions before you clean stop a pushdown optimization task:
• When you clean stop a task enabled for source pushdown optimization that reads from or writes to Snowflake, and the target or source properties in the mapping contain pre-SQL or post-SQL statements, the job continues to run the target post-SQL query even though the SELECT query is terminated.
• When you run a mapping configured to create a new target at runtime and clean stop the job immediately,
Data Integration creates the target table even if the job is terminated.
When you run a task configured for pushdown optimization, the task converts the transformation logic to
Snowflake queries. The task sends the queries to Snowflake and the mapping logic is processed in the
Snowflake database.
You can configure pushdown optimization for a mapping in the following scenarios:
Snowflake to Snowflake
Read from and write to Snowflake using a Snowflake Data Cloud connection.
Amazon S3 to Snowflake
Read from Amazon S3 using an Amazon S3 V2 connection in the Source transformation and write to
Snowflake using a Snowflake Data Cloud connection in the Target transformation.
Microsoft Azure Data Lake Storage Gen2 to Snowflake
Read from Microsoft Azure Data Lake Storage Gen2 using a Microsoft Azure Data Lake Storage Gen2 connection in the Source transformation and write to Snowflake using a Snowflake Data Cloud connection in the Target transformation.
Google Cloud Storage to Snowflake
Read from Google Cloud Storage using a Google Cloud Storage connection in the Source transformation and write to Snowflake using a Snowflake Data Cloud connection in the Target transformation.
Example
You work for a healthcare solutions organization that provides healthcare technology to pharmacies and pharmacy chains. You enable pharmacies to process prescriptions, store and provide access to healthcare records, and improve patient outcomes. Your organization stores its data in Google Cloud Storage.
The management wants to create a patient-centric pharmacy management system. The organization plans to
leverage the warehouse infrastructure of Snowflake and load all its data to Snowflake so that they can make
operational, financial, and clinical decisions with ease.
To load data from a Google Cloud Storage object to Snowflake, use ETL and ELT with the required transformations that support the data warehouse model.
Use a Google Cloud Storage V2 connection to read data from a Google Cloud Storage bucket and a Snowflake Data Cloud connection to write to a Snowflake target. Configure full pushdown optimization in the mapping to optimize performance. The Google Cloud Storage source data is uploaded to the Snowflake stage using the PUT command. The Snowflake COPY command converts the transformations to the corresponding SQL functions and expressions while loading the data to Snowflake. Pushdown optimization enhances the performance of the task and reduces the cost involved.
When you use pushdown optimization, the Secure Agent converts the expression in the transformation by determining equivalent operators, variables, and functions in the database. If there is no equivalent operator, variable, or function, the Secure Agent processes the transformation logic.
The following table summarizes the functions that you can push to Snowflake using full pushdown optimization. The supported functions include ISNULL(), SIGN(), and STDDEV()*.
*You cannot pass the filter condition argument in the STDDEV() function.
Note: If you specify a function that is not supported for Snowflake full pushdown optimization, the task runs
either with partial pushdown optimization or without full pushdown optimization.
The following operators can be pushed to Snowflake:
+, -, *, /, %, ||, >, >=, <=, !=, AND, OR, NOT, IS NULL
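For example, a Filter transformation condition that uses only these operators, such as the following condition on hypothetical SALARY and DEPTID fields, can be converted to an equivalent WHERE clause and pushed to Snowflake:
SALARY >= 50000 AND DEPTID != 10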
You can use full pushdown to push the following transformations to Snowflake:
• Aggregator
• Expression
• Filter
• Joiner
• Lookup
• Normalizer
• Rank
• Router
• Sequence Generator
• SQL
• Sorter
• Union
• Update Strategy
Aggregator transformation
You can configure full pushdown optimization to push an Aggregator transformation to process in
Snowflake.
Aggregate calculations
You can perform the following aggregate calculations:
• AVG
• COUNT
• MAX
• MIN
• MEDIAN
• SUM
• VARIANCE
Incoming ports
When you configure an Aggregator transformation and an incoming port is not used in an aggregate function or in a group by field, the ANY_VALUE() function is used for that column in the generated query.
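As a sketch of this behavior, assume a hypothetical Aggregator transformation that groups by DEPTID, aggregates SALARY, and passes LOCATION through without an aggregate function. The pushed-down query might resemble the following; the exact generated SQL depends on the mapping:
SELECT "DEPTID", MAX("SALARY"), ANY_VALUE("LOCATION")
FROM "EMP"
GROUP BY "DEPTID";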
Expression transformation
You can configure full pushdown optimization to push an Expression transformation to process in Snowflake.
You can add an Expression transformation to each of the sources in the mapping, followed by a join downstream in the mapping. Additionally, you can add multiple Expression transformations that branch out from a transformation and then merge into a transformation downstream in the mapping.
When you configure an Expression transformation, consider the following rules to include variables in the
expression:
• You cannot use variables that use the value assigned while processing a previous row for calculations in the current row. If you do, the mapping runs without pushdown optimization.
• The variables can be nested, but you cannot refer to a variable before it is defined in the expression.
If the variables are not defined in that order, the mapping runs without pushdown optimization.
For example,
var: AGEPLUS2 = AGEPLUS1 + 1
var: AGEPLUS1 = AGE + 1
out: NEXTAGE = AGEPLUS2 + 1
Here, AGEPLUS1 is defined after it is referenced. AGEPLUS2 in the first variable refers to AGEPLUS1 and remains unresolved.
To resolve this, specify the variables in the following order:
var: AGEPLUS1 = AGE + 1
var: AGEPLUS2 = AGEPLUS1 + 1
out: NEXTAGE = AGEPLUS2 + 1
• The variables cannot have an expression that is cyclic or refers to itself:
For example,
var: AGEPLUS1 = AGEPLUS2 + 1
var: AGEPLUS2 = AGEPLUS1 + 1
out: NEXTAGE= AGEPLUS2
Here, AGEPLUS1 refers to AGEPLUS2 and remains unresolved.
Lookup transformation
You can configure full pushdown optimization to push a Lookup transformation to process in Snowflake. You
can push both a connected and an unconnected lookup.
When the mapping contains an unconnected lookup, you can also nest the unconnected lookup function with
other expression functions. For example, :LKP.U_LOOKUP(Upper(argument1), argument)
Lookup objects
Consider the following rules when you configure lookups:
• You can configure a lookup for Snowflake when the Source transformation uses the following sources:
- Amazon S3
- Snowflake source
Note: You can configure a lookup for an Amazon S3, Google Cloud Storage, or Microsoft Azure Data Lake
Storage Gen2 object only when the Source transformation uses the corresponding Amazon S3, Google Cloud
Storage, or Microsoft Azure Data Lake Storage Gen2 source.
• In an unconnected lookup, always set the Multiple Matches option to Report Error. When you look up data and the lookup condition finds multiple matches, all the matching rows are selected and the task runs with pushdown optimization. If you set Multiple Matches to any option other than Report Error, the mapping runs without pushdown optimization.
• In a connected lookup, you can set the Multiple Matches option to Return all rows or Report Error. When
you set the Multiple Matches option to Report Error, you can set the Lkp_apdo_allow_report_error custom
flag in the task advanced session properties to determine how Data Integration handles multiple matches:
- When you set the property to Yes and if there are multiple matches in the data, the multiple match policy
is ignored and the job runs successfully with pushdown optimization.
- When you do not set the property, and if there are multiple matches in the data, Data Integration
considers the policy and displays a warning message. Pushdown optimization is ignored and the task
fails.
FileName port
When you configure a lookup for an Amazon S3 source in a mapping that contains an Amazon S3 source and
Snowflake target, remove the filename port from both the Amazon S3 source and lookup object. The
FileName port is not applicable.
Router transformation
You can configure source pushdown optimization to push a Router transformation to the database for
processing.
When you configure a Router transformation, connect or map only one output group to the target
transformation.
Sequence Generator transformation
You can push a Sequence Generator transformation with the following restrictions:
SQL transformation
You can use an SQL transformation only to push certain functions and shared sequences.
Use only a SELECT statement to push a function. Specify the column name in the SELECT query or function. Do not push functions using statements such as "SELECT * FROM TABLE".
You can push the following functions; an example query follows the list:
• UUID_STRING
• RANDOM
• RANDSTR
• SIGN
• CURRENT_REGION
• CURRENT_ACCOUNT
• CURRENT_ROLE
• CURRENT_USER
• CURRENT_DATABASE
• CURRENT_SCHEMA
• DAYNAME
• SPLIT
• SPLIT_PART
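For example, an entered query that pushes one of these functions might look like the following sketch, where ORDER_DATE is a hypothetical column name and the exact form depends on the mapping configuration:
SELECT DAYNAME(ORDER_DATE)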
Get the shared sequence from Snowflake and define the sequence in an entered query in an SQL transformation.
Specify the shared sequence in the entered query using the following syntax:
SELECT <Snowflake_database_name>.<Snowflake_schema_name>.<sequence_name>.NEXTVAL
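For example, for a hypothetical shared sequence named ORDER_SEQ in database MYDB and schema MYSCHEMA, the entered query would be:
SELECT MYDB.MYSCHEMA.ORDER_SEQ.NEXTVAL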
• The Source transformation in the mapping must only include Snowflake source objects.
• A mapping runs without pushdown optimization when the source is Amazon S3, Google Cloud Storage, or
Microsoft Azure Data Lake Storage Gen2.
You can instead use the update and upsert operations in the Target transformation to write to Snowflake.
Features
You can configure pushdown optimization for a mapping that reads from the following sources and writes to
a Snowflake target:
• Snowflake source
• Amazon S3 source
• Google Cloud Storage source
• Microsoft Azure Data Lake Storage Gen2 source
When you configure a mapping, some parameters are not supported for a mapping enabled for pushdown
optimization. You can refer to the list of parameters that each source supports.
Source properties
You can configure the following properties in a Snowflake source transformation:
• Role override
• Table name override
• SQL override
• Tracing level
Target properties
You can add multiple Snowflake targets in a mapping. The target can be the same Snowflake target table added multiple times or different Snowflake target tables.
You can configure the following properties in a Snowflake Target transformation:
• Update Mode
• Batch row size
• Number of local staging files
• Rejected File Path
• Update Override
• Forward Rejected Rows
Lookup properties
When you enable pushdown optimization, you can configure the following properties for Snowflake
connected and unconnected lookups:
• Pre SQL
• Post SQL
• For a target created at runtime, ensure that the Snowflake source does not contain records with the Time
data type.
• When you configure filters, consider the following guidelines:
- If a mapping contains a Filter transformation and also a filter in the Source transformation, the mapping
consolidates the filter conditions from both these transformations to filter the records. However, it is
recommended that you use only one of these filters at a time in a mapping.
- You cannot use system variables in filters.
- You cannot apply a filter for query and multiple source objects.
- When you configure the IS_DATE function in an Expression transformation, specify the format for this function. Otherwise, the mapping populates incorrect data.
- When you configure two Sequence Generator transformations to write to two Snowflake targets, and the
sequence objects have the same sequence name in the custom properties, data populates incorrectly.
• For mappings that read from and write to Snowflake, consider the following guidelines:
- You cannot use a query to read from stored procedures.
- Even if you decrease the precision of the Snowflake String data type in a Source transformation to write to a Snowflake table, the mapping runs without truncating the data.
- You can configure pre-SQL and post-SQL in mappings enabled for source pushdown optimization that
read from and write to Snowflake. Pre-SQL and post-SQL are not applicable for mappings enabled with
full pushdown optimization.
- When you configure a mapping for source or partial pushdown optimization, do not connect the Source
transformation to more than one transformation in the mapping. However, in a mapping enabled with full
pushdown optimization, the Source transformation can branch out to multiple transformations in the
mapping pipeline.
- You can configure a custom query in a Source transformation to read from Java or SQL user-defined
functions (UDF) in Snowflake.
- When the mapping runs with full or source pushdown optimization, some of the queries in the session log are not aliased correctly. The aliases for simple queries appear correctly.
- A mapping fails to read data from multiple tables joined using related objects when the table and column names contain case-sensitive, special, or Unicode characters.
- A mapping that reads from multiple Snowflake objects that do not belong to the same database and
schema fails.
- When you use the IS_NUMBER function, the data populated for some values, such as inf, -inf, and NaN, in Snowflake differs with and without pushdown optimization applied.
- When you use the IS_NUMBER function in a transformation and the input data contains d or D, for
example, in formats such as +3.45d+32 or +3.45D-32, the function returns False or 0.
- When you use the IS_DATE function in a transformation, do not use the J, MM/DD/YYYY SSSSS,
MM/DD/Y, and MM/DD/RR formats.
- Mappings that read from or write to Snowflake with multibyte characters in the table or column names
might fail. Before you configure a mapping to read from or write data with multibyte characters, set the -
DdisablePDOAdvancedAliasing property in the JVM options in the Secure Agent properties.
- When you pass columns with Null values in a Normalizer transformation, Null values are not written to
the target.
Amazon S3 V2 source
The mapping supports the following properties for an Amazon S3 V2 connection:
• Access Key
• Secret Key
A mapping enabled for pushdown optimization that reads from an Amazon S3 V2 source and writes to a
Snowflake target has some restrictions.
Authentication
When you read multiple Avro files using an Amazon S3 connection enabled for IAM authentication, specify
the right access key and the secret key in the Amazon S3 connection. For more information, see the help for
Amazon S3 V2 Connector.
• To write data in file formats such as Avro, ORC, or Parquet from Amazon S3 to Snowflake, you must delete the FileName field.
• Mappings fail with a casting error when the table name contains Unicode characters.
Data types
A mapping has restrictions for certain data types.
For information on how to configure the supported properties, see the Amazon S3 V2 Connector
documentation.
For information on how to configure the supported properties, see the Google Cloud Storage V2 Connector
documentation.
The mapping supports the following properties for a Microsoft Azure Data Lake Storage Gen2 connection:
• Account Name
• File System Name
The mapping supports the following properties for a Microsoft Azure Data Lake Storage Gen2 source:
• Tracing Level
For information on how to configure the supported properties, see the Microsoft Azure Data Lake Storage Gen2 Connector documentation.
Before you configure a custom query as the source object or you configure an SQL override, perform the
following task:
In the Pushdown Optimization section on the Schedule tab, select Create Temporary View.
Note: If you do not set the Create Temporary View property, the mapping runs without pushdown
optimization.
Specify the options in the Additional Write Runtime Parameters field in the Snowflake Data Cloud advanced
target properties of the Target transformation.
When you specify multiple copy command options, separate each option with an ampersand &.
For more information about the supported copy command options, see the Snowflake documentation at the
following website: https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
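For example, to continue loading when errors are encountered and to remove staged files after loading, you might specify a value similar to the following, where ON_ERROR and PURGE are standard Snowflake COPY command options:
ON_ERROR=CONTINUE&PURGE=TRUE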
Check the queries in the session logs to verify if the mapping applied pushdown optimization.
For example, you can check the session log for the query that is generated for a mapping enabled with full pushdown optimization.
The session log also provides the pushdown status and any error details that appear with the query. You can check these details to troubleshoot the error.
When you do not enable pushdown optimization in a mapping, separate select and insert statements are generated for the read and write operations:
READER_1_1_1> SNOWFLAKECLOUDDATAWAREHOUSE_1000 [2020-09-10 14:09:29.4781] [INFO]
The Snowflake Connector uses the following SQL query to read data: SELECT "DEPTID", "DEPTNAME" FROM "DEPT" WHERE ("DEPT"."DEPTID" >= 103) ORDER BY "DEPT"."DEPTID" DESC
• Snowflake native data types appear in the source and target transformations when you choose to edit
metadata for the fields.
• Transformation data types. Set of data types that appear in the transformations. These are internal data
types based on ANSI SQL-92 generic data types, which the Secure Agent uses to move data across
platforms. They appear in all transformations in a mapping.
When the Secure Agent reads source data, it converts the native data types to the comparable transformation
data types before transforming the data. When the Secure Agent writes to a target, it converts the
transformation data types to the comparable native data types.
The following table describes the Snowflake Data Cloud data types and the corresponding transformation data types:
• Float (Double, Double precision, Real, Float, Float4, Float8). Transformation data type: double. Floating point numbers with double-precision (64 bit). Maximum value: 1.7976931348623158e+307. Minimum value: -1.79769313486231E+307.
• Number (Decimal, Numeric). Transformation data type: decimal. Number with 28-bit precision and scale.
• NUMBER (Int, Integer, Bigint, Smallint, Tinyint, Byteint). Transformation data type: decimal. Number with 28-bit precision and scale as 0. Maximum value: 9.99999999999999E+27. Minimum value: -9.99999999999999E+26.
Array, object, and variant from the Snowflake source are mapped to the String data type in Cloud Data
Integration. While writing to the target, these strings can be written as Array, Object, or Variant columns to the
Snowflake target. The strings that you write to Snowflake must be in a serialization format, just as they
appear after the read operation.
When you use the Create New at Runtime option to write the Variant data type from a source to the
Snowflake target, Data Integration writes Variant as Varchar to the target. You can edit the field mapping to
map Varchar to Variant before you write to the target. In a completely parameterized mapping, you cannot edit the target metadata from the default Varchar data type to Variant.
The following table lists the semi-structured data types that you can read from Snowflake and the
corresponding transformation data types that these map to in Cloud Data Integration:
Note: The default size to read or write semi-structured data types is set to 65536 bytes. To increase the limit,
add the following parameter and set the required value in the Additional JDBC URL Parameters field of the
Snowflake Data Cloud connection properties: semiStructuredDTPrecision=<size>
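For example, to raise the limit to 16 MB, which is the maximum size of a Snowflake VARIANT value, you might add an entry similar to the following; the value shown is only an illustration:
semiStructuredDTPrecision=16777216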
Mappings
Consider the following rules and guidelines for mappings:
• You can read or write data of Binary data type that is in Hexadecimal format.
• The agent reads or writes the maximum float value 1.7976931348623158e+308 as infinity.
• You can use the following formats to specify filter values of the Datetime data type:
- YYYY-MM-DD HH24:MI:SS
- YYYY/MM/DD HH24:MI:SS
- MM/DD/YYYY HH24:MI:SS
• If a Snowflake Data Cloud lookup object contains fields with the String data type of maximum or default precision and the row size exceeds the maximum row size, the task fails.
• The performance of a write operation slows down if the data contains Date fields.
• A task that captures changed data from a CDC source fails when the Snowflake target contains a
repeated column of the Record data type.
• When you handle dynamic schemas in mappings, the following updates are not applicable:
- Schema updates to the Timestamp and Date data types.
- Schema updates that involve a decrease in the precision of the Varchar data type.
You can enable bulk processing of records, set logging for large records, configure a local staging directory,
optimize data staging, and improve memory requirements for read operations.
This update does not apply for mappings configured with pushdown optimization using the Snowflake ODBC
connection.
1. In Administrator, select the Secure Agent listed on the Runtime Environments tab.
2. Click Edit.
3. In the System Configuration Details section, select Data Integration Server as the service and DTM as the type.
4. Edit the JVMOption1 property, and enter -Xmx256m.
5. Click Save.
Perform the following steps to configure bulk processing before you run a mapping:
1. In Administrator, select the Secure Agent listed on the Runtime Environments tab.
2. Click Edit.
3. In the System Configuration Details section, select Data Integration Server as the service and DTM as
the type.
4. Edit the JVM option, and enter -DENABLE_WRITER_BULK_PROCESSING=true.
5. Click Save.
Note: This update does not apply for mapping tasks configured with pushdown optimization using the
Snowflake ODBC connection or the Snowflake Data Cloud connection.
If you do not set this logging property, the following errors might appear in the session logs:
net.snowflake.client.jdbc.internal.apache.http.impl.execchain.RetryExec execute
To configure a different directory for the local staging files, perform the following steps:
If you do not set the staging property, Data Integration performs staging without the optimized settings,
which might impact the performance of the task.
• If you run a mapping enabled for pushdown optimization, the mapping runs without pushdown
optimization.
• If the data contains timestamp data types with time zone, the job runs without staging the data in the
local flat file.
• If the mapping contains Oracle CDC as the source and Snowflake as the target, the job runs without
staging the data in the local flat file.
• You can determine the behavior of writing data with empty and null values to the target when you enable the staging property and set the copyEmptyFieldAsEmpty option in the Additional Write Runtime Parameters field in the Target transformation properties.
The table describes the behavior of empty and null values when you set these properties:
Additional Write Runtime Parameters Enable staging optimization Disable staging optimization
Perform the following tasks to set the staging property for the Tomcat in the Secure Agent properties:
When you run the mapping, the flat file is created in the following directory in your machine: C:\Windows
\Temp\snowflake\stage\<Snowflake_Target.txt>
• Read operation: If the staging is done through the flat file successfully, Data Integration logs the following
message in the session log: Staging mode is enabled to read data.
• Write operation: You can check the session logs. If the staging is done through the flat file successfully,
Data Integration logs the following message in the session log: The INFA_DTM_STAGING is successfully
enabled to use the flat file to create local staging files.