Cloud Data Integration Snowflake Cloud Data Warehouse V2 Connector Guide
This software and documentation are provided only under a separate license agreement containing restrictions on use and disclosure. No part of this document may be
reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC.
U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial
computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such,
the use, duplication, disclosure, modification, and adaptation is subject to the restrictions and license terms set forth in the applicable Government contract, and, to the
extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License.
Informatica, Informatica Cloud, Informatica Intelligent Cloud Services, PowerCenter, PowerExchange, and the Informatica logo are trademarks or registered trademarks
of Informatica LLC in the United States and many jurisdictions throughout the world. A current list of Informatica trademarks is available on the web at https://
www.informatica.com/trademarks.html. Other company and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties. Required third party notices are included with the product.
The information in this documentation is subject to change without notice. If you find any problems in this documentation, report them to us at
[email protected].
Informatica products are warranted according to the terms and conditions of the agreements under which they are provided. INFORMATICA PROVIDES THE
INFORMATION IN THIS DOCUMENT "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT.
Preface
The Cloud Data Integration Snowflake Cloud Data Warehouse V2 Connector Guide contains information about
how to set up and use Snowflake Cloud Data Warehouse V2 Connector. The guide explains how organization
administrators and business users can use Snowflake Cloud Data Warehouse V2 Connector to read data
from or write data to Snowflake Cloud Data Warehouse.
Informatica Resources
Informatica provides you with a range of product resources through the Informatica Network and other online
portals. Use the resources to get the most from your Informatica products and solutions and to learn from
other Informatica users and subject matter experts.
Informatica Documentation
Use the Informatica Documentation Portal to explore an extensive library of documentation for current and
recent product releases. To explore the Documentation Portal, visit https://docs.informatica.com.
If you have questions, comments, or ideas about the product documentation, contact the Informatica
Documentation team at [email protected].
To find resources on using Data Integration, access the Informatica Intelligent Cloud Services community at:
https://network.informatica.com/community/informatica-network/products/cloud-integration
To find resources on using Application Integration (the Informatica Cloud Real Time service), access the
community at:
https://network.informatica.com/community/informatica-network/products/cloud-integration/cloud-application-integration/content
Developers can learn more and share tips at the Cloud Developer community:
https://network.informatica.com/community/informatica-network/products/cloud-integration/cloud-developers
Visit the Informatica Marketplace to try and buy Data Integration Connectors, templates, and mapplets:
https://marketplace.informatica.com/community/collections/cloud_integration
To search the Knowledge Base, visit https://search.informatica.com. If you have questions, comments, or
ideas about the Knowledge Base, contact the Informatica Knowledge Base team at
[email protected].
Subscribe to the Informatica Intelligent Cloud Services Trust Center to receive upgrade, maintenance, and
incident notifications. The Informatica Intelligent Cloud Services Status page displays the production status
of all the Informatica cloud products. All maintenance updates are posted to this page, and during an outage,
it will have the most current information. To ensure you are notified of updates and outages, you can
subscribe to receive updates for a single component or all Informatica Intelligent Cloud Services
components. Subscribing to all components is the best way to be certain you never miss an update.
To subscribe, go to the Informatica Intelligent Cloud Services Status page and click SUBSCRIBE TO
UPDATES. You can then choose to receive notifications sent as emails, SMS text messages, webhooks, RSS
feeds, or any combination of the four.
For online support, click Submit Support Request in Informatica Intelligent Cloud Services. You can also use
Online Support to log a case. Online Support requires a login. You can request a login at
https://network.informatica.com/welcome.
The telephone numbers for Informatica Global Customer Support are available from the Informatica web site
at https://www.informatica.com/services-and-training/support-services/contact-us.html.
Chapter 1
Introduction to Snowflake Cloud Data Warehouse V2 Connector
You can create a Snowflake Cloud Data Warehouse V2 connection and use the connection in mass ingestion
tasks, mappings, and mapping tasks. When you run a Snowflake Cloud Data Warehouse V2 mapping or
mapping task, the Secure Agent writes data to Snowflake based on the workflow and Snowflake Cloud Data
Warehouse V2 connection configuration. When you run a mass ingestion task, the Secure Agent transfers
files from any source that the mass ingestion task supports to a Snowflake Cloud Data Warehouse target.
Informatica recommends that you use Snowflake Connector if you want to create a mapping task to read
data from and write data to Snowflake Cloud Data Warehouse. For more information about using Snowflake
Connector, see the Snowflake Connector Guide.
The following table provides the list of tasks and object types supported by Snowflake Cloud Data
Warehouse V2 Connector:
You create a Snowflake Cloud Data Warehouse V2 connection on the Connections page. You can then use
the connection in the Mapping Designer when you create a mapping or in the Mass Ingestion Designer when
you create a mass ingestion task.
Snowflake Cloud Data Warehouse V2 connection properties
When you set up a Snowflake Cloud Data Warehouse V2 connection, you must configure the connection
properties.
The following table describes the Snowflake Cloud Data Warehouse V2 connection properties:
- Runtime Environment: The name of the runtime environment where you want to run the tasks. Snowflake Cloud Data Warehouse V2 also supports the Hosted Agent.
- Username: The user name to connect to the Snowflake Cloud Data Warehouse account.
- Password: The password to connect to the Snowflake Cloud Data Warehouse account.
You can configure partitioning to optimize the mapping performance at run time when you read data from
Snowflake Cloud Data Warehouse. The partition type controls how the agent distributes data among
partitions at partition points. You can define the partition type as key range partitioning. With partitioning, the
agent distributes rows of source data based on the number of threads that you define as partitions.
The following table describes the Snowflake Cloud Data Warehouse V2 source properties that you can
configure in a Source transformation:
- Source Type: Type of the source object. Select Single Object, Multiple Objects, Query, or Parameter. Note: When you use a custom SQL query to import Snowflake Cloud Data Warehouse tables, the Secure Agent fetches the metadata using separate metadata calls.
- Object: The source object for the task. Select the source object for a single source. When you select the multiple source option, you can add source objects and configure the relationship between them.
- Filter: Filters records based on the filter condition. Configure a simple filter.
- Sort: Sorts records based on the conditions you specify. You can specify the following sort conditions:
  - Not parameterized. Select the fields and type of sorting to use.
  - Parameterized. Use a parameter to specify the sort option.
The following table describes the advanced properties that you can configure in a Source transformation:
- Warehouse: Overrides the Snowflake Cloud Data Warehouse name specified in the connection.
- Role: Overrides the Snowflake Cloud Data Warehouse role assigned to the user specified in the connection.
- Pre SQL: The pre-SQL command to run on the Snowflake Cloud Data Warehouse source table before the Secure Agent reads the data. For example, if you want to update records in the database before you read the records from the table, specify a pre-SQL statement.
- Post SQL: The post-SQL command to run on the Snowflake Cloud Data Warehouse table after the Secure Agent completes the read operation. For example, if you want to delete some records after the latest records are loaded, specify a post-SQL statement.
- Table Name: Overrides the table name of the imported Snowflake Cloud Data Warehouse source table.
- SQL Override: The SQL statement to override the default query used to read data from the Snowflake Cloud Data Warehouse source. See the sample statements after this list.
- Tracing Level: Determines the amount of detail that appears in the log file. You can select Terse, Normal, Verbose Initialization, or Verbose Data. Default value is Normal.
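For illustration, the Pre SQL, SQL Override, and Post SQL properties might be set to statements similar to the following sketch. The MYDB.MYSCHEMA.ORDERS table and its columns are hypothetical placeholders, and the SQL override uses a fully qualified table name as the rules and guidelines require:
Pre SQL: UPDATE MYDB.MYSCHEMA.ORDERS SET STATUS = 'READY' WHERE STATUS IS NULL
SQL Override: SELECT ORDER_ID, STATUS, AMOUNT FROM MYDB.MYSCHEMA.ORDERS WHERE AMOUNT > 0
Post SQL: DELETE FROM MYDB.MYSCHEMA.ORDERS WHERE STATUS = 'OBSOLETE'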
With key range partitioning, the agent distributes rows of source data based on the fields that you define as partition keys. The agent compares the field value to the range values for each
partition and sends rows to the appropriate partitions.
Use key range partitioning for columns that have an even distribution of data values. Otherwise, the partitions
might have unequal size. For example, a column might have 10 rows between key values 1 and 1000 and the
column might have 999 rows between key values 1001 and 2000. If the mapping includes multiple sources,
use the same number of key ranges for each source.
When you define key range partitioning for a column, the agent reads the rows that are within the specified
partition range. For example, if you configure two partitions for a column with the ranges as 10 through 20
and 30 through 40, the agent does not read the rows 20 through 30 because these rows are not within the
specified partition range.
You can configure a partition key for fields of the following data types:
• Integer
• String
• Any type of number data type. However, you cannot use decimals in key range values.
• Datetime. Use the following format to specify the date and time: YYYY-MM-DD HH24:MI:SS. For example,
1971-01-01 12:30:30
Note: If you specify the date and time in any other format, the task fails.
You can write data to an existing table or create a table in the target by using the Create Target option.
You can configure partitioning to optimize the mapping performance at run time when you write data to
Snowflake Cloud Data Warehouse V2 targets. The partition type controls how the agent distributes data
among partitions at partition points. You can define the partition type as passthrough partitioning. With
partitioning, the agent distributes rows of target data based on the number of threads that you define as partitions.
The following table describes the Snowflake Cloud Data Warehouse V2 target properties that you can
configure in a Target transformation:
- Object: The target object for the task. Select the target object. You can either select an existing table or create a new table.
- Operation: The target operation. Select Insert, Update, Upsert, Delete, or Data Driven.
- Update Columns: The temporary key column to update data to or delete data from a Snowflake Cloud Data Warehouse target. If you perform an update, update else insert, or delete operation and the Snowflake Cloud Data Warehouse V2 target does not include a primary key column, click Add to add a temporary key. You can select multiple columns.
The following table describes the advanced properties that you can configure in a Target transformation:
- Warehouse: Overrides the Snowflake Cloud Data Warehouse name specified in the connection.
- Role: Overrides the Snowflake Cloud Data Warehouse role assigned to the user specified in the connection.
- Pre SQL: The pre-SQL command to run before the Secure Agent writes to Snowflake Cloud Data Warehouse. For example, if you want to assign a sequence object to a primary key field of the target table before you write data to the table, specify a pre-SQL statement.
- Post SQL: The post-SQL command to run after the Secure Agent completes the write operation. For example, if you want to alter the table created by using the Create Target option and assign constraints to the table, specify a post-SQL statement. See the sample statements after this list.
- Batch Row Size: Number of rows that the agent writes in a batch to the Snowflake Cloud Data Warehouse target.
- Number of local staging files: Enter the number of local staging files. The agent writes data to the target after the specified number of local staging files are created.
- Truncate Target Table: Truncates the database target table before inserting new rows. Select one of the following options:
  - True. Truncates the target table before inserting all rows.
  - False. Inserts new rows without truncating the target table.
  Default is false.
- Table Name: Overrides the table name of the Snowflake Cloud Data Warehouse target table.
- Rejected File Path: The file name and path of the file on the Secure Agent machine where the Secure Agent writes records that are rejected while writing to the target. For example, \rejectedfiles\reject7
- Forward Rejected Rows: Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the agent forwards rejected rows to the next transformation.
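For illustration, the Pre SQL and Post SQL properties for a target might be set to statements similar to the following sketch. The sequence, table, and column names are hypothetical placeholders, and the sketch assumes that the MYDB.MYSCHEMA.SALES_SEQ sequence already exists in Snowflake:
Pre SQL: ALTER TABLE MYDB.MYSCHEMA.SALES_TGT ALTER COLUMN SALES_ID SET DEFAULT MYDB.MYSCHEMA.SALES_SEQ.NEXTVAL
Post SQL: ALTER TABLE MYDB.MYSCHEMA.SALES_TGT ADD CONSTRAINT PK_SALES PRIMARY KEY (SALES_ID)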
When you configure a lookup in Snowflake Cloud Data Warehouse V2, you select the lookup connection and
lookup object. You also define the behavior when a lookup condition returns more than one match.
The following table describes the Snowflake Cloud Data Warehouse V2 lookup object properties that you can
configure in a Lookup transformation:
- Source Type: Type of the source object. Select Single Object or Parameter.
- Multiple Matches: Behavior when the lookup condition returns multiple matches. Select Return any row, Return all rows, or Report error.
- Role: Overrides the Snowflake role assigned to the user specified in the connection.
1. To select the Snowflake object in the Source or Target transformation, click Select in the Object field.
2. In the Select Source Object dialog box, select the database and schema, and then select the tables that
you want to read from or write to.
The following image shows the employee details tables from a Snowflake database and schema:
You can select the required table in the Source or Target transformation.
Snowflake Cloud Data Warehouse V2 mapping example
An enterprise application uses an Oracle database to store the product transaction details. You use the
Snowflake data warehouse to analyze the completed transactions, the pending transactions, and the availability of
stock. You read the product transaction details from an Oracle source and apply a lookup condition on the
PRODUCTDET table in Snowflake Cloud Data Warehouse, which stores the details of each product and its availability.
Based on availability and requirements, you write the transactions to the PENDINGTRANSACTION and
COMPLETEDTRANSACTION tables in Snowflake Cloud Data Warehouse and update the INSTOCK field in the
PRODUCTDET table based on the completed transactions. You use the following objects in the Snowflake
Cloud Data Warehouse V2 mapping:
Source Object
The source object for the mapping task is the OracleSrc table in Oracle. Use an Oracle connection to
connect to Oracle and read data from the OracleSrc object.
The following image shows the transaction details stored in the OracleSrc table:
Lookup Object
The lookup object for the mapping task is the PRODUCTDET table in Snowflake Cloud Data Warehouse,
which stores the details of each product and its availability.
The following image shows the data stored in the PRODUCTDET table:
Target Object
PENDINGTRANSACTION
The following image shows the data stored in the PENDINGTRANSACTION table:
PRODUCTDET
The PRODUCTDET table includes the PRODUCTID, INSTOCK, PRODUCTDET, and PRICE fields. Based
on the completed transactions, the INSTOCK field is updated.
The following image shows the data stored in the PRODUCTDET table:
Mapping
The following image shows the Snowflake Cloud Data Warehouse V2 mapping:
When you run the mapping, the agent reads the transaction details from the source, fetches fields from the
lookup, and, based on the conditions applied, writes the available quantity and transaction details to the target
tables.
Rules and guidelines for Snowflake Cloud Data Warehouse V2 objects
Consider the following rules and guidelines for Snowflake Cloud Data Warehouse V2 objects used as
sources, targets, and lookups in mappings:
• You can read or write data of Binary data type, which is in Hexadecimal format.
• You cannot write semi-structured data to the target. For example, XML, JSON, AVRO, or PARQUET data.
• You cannot specify more than one Pre-SQL or Post-SQL query in the source or target transformation.
• The agent reads or writes the maximum float value, which is 1.7976931348623158e+308, as infinity.
• If a Snowflake Cloud Data Warehouse V2 lookup object contains fields with String data type of maximum
or default precision and the row size exceeds the maximum row size, the task fails.
• You can use the following formats to specify filter values of Datetime data type:
- YYYY-MM-DD HH24:MI:SS
- YYYY/MM/DD HH24:MI:SS
- MM/DD/YYYY HH24:MI:SS
• When you provide a warehouse name in both the connection properties and the mapping properties, the
warehouse name in the mapping overrides the warehouse name in the connection. Even if you provide an
incorrect warehouse name in the connection properties, the connection is successful. However, before you
run the mapping, ensure that you specify the correct warehouse name in the mapping properties.
• When you use a SQL override query to override the custom query used for importing the metadata from
Snowflake Cloud Data Warehouse tables, you must specify a fully qualified table name.
• You can read from or write to Snowflake Cloud Data Warehouse tables whose table names or field names
contain uppercase, lowercase, or mixed-case letters, numbers, and special characters. You cannot write
data when the Snowflake Cloud Data Warehouse table contains field names with the # and @ characters.
If the Secure Agent is installed on Windows, you cannot write data to the Snowflake Cloud Data Warehouse
target table when the table names contain the following special characters: /\:*?"<>|
• When you use the Create Target option to create a table in Snowflake Cloud Data Warehouse, special
characters in the column name are replaced by the _ character.
• Ensure that the table name that you specify in the query to read from Snowflake Cloud Data Warehouse
is a fully qualified table name. See the example after this list.
• Snowflake Cloud Data Warehouse V2 Connector does not support the following features when you use the
Query source type option:
- Filter and sort options.
- Source partitioning.
• Before you import the target table, define multiple primary keys in the target table.
• Define more than one custom key for the target object using the Update Columns option in the advanced
target properties.
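For illustration, a SQL override or custom query might use a fully qualified table name as in the following sketch. The database, schema, and column names are hypothetical placeholders:
SELECT PRODUCTID, INSTOCK, PRICE FROM MYDB.MYSCHEMA.PRODUCTDET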
The following image shows the My Jobs page that shows the details of the state and the number of
processed rows of a Snowflake Cloud Data Warehouse V2 job:
To view how many of the processed rows succeeded and how many resulted in errors, select the specific
instance name and view the Results section. You can view the number of success rows and error rows.
The following image shows the details of the Snowflake Cloud Data Warehouse V2 task:
You can also download the session log to get details of the number of output rows, affected rows, applied
rows, and rejected rows.
You might also encounter the following scenarios of target statistics for Snowflake Cloud Data Warehouse
V2 write operations:
• In insert, update, or delete operation scenarios where the Secure Agent rejects rows due to a constraint
violation, a warning appears in the Job Properties page. Download the session log to view the target
statistics.
• In update or delete operation scenarios where the Secure Agent does not find a match for some records,
that number is not reflected in the My Jobs page or the session log. For example, if there are 5 input
rows and the Secure Agent updates only 4 target rows, the number of processed rows still
appears as 5. This issue occurs when Snowflake Cloud Data Warehouse V2 does not return an error
message for rejected rows.
• In update or delete operation scenarios where the Secure Agent updates or deletes more rows because of
a non-unique match, the actual number of updated or deleted records is not reflected in either the My
Jobs page or the session log. For example, if there are 5 input records and the Secure Agent
updates 10 target rows, the My Jobs page reflects only 5 processed rows.
• The number of success rows for the target object in the Job Properties page is not updated and remains
the same as the number of incoming rows. For example, while writing 5 records to the target, if two
records are rejected, the number of success rows still appears as 5.
To configure a different directory for the local staging files, perform the following steps:
Create a Snowflake Cloud Data Warehouse V2 connection and use the connection to perform a mass
ingestion task. When you create a mass ingestion task, select the target connection and specify which files
you want to move from the source to the Snowflake Cloud Data Warehouse target.
Before you create a mass ingestion task, verify the following prerequisites:
• Source and target connections exist, based on the sources from which you want to transfer files and the
targets to which you want to transfer files.
The following table describes the Snowflake Cloud Data Warehouse V2 target properties that you can
configure in a mass ingestion task:
The following table describes the Snowflake Cloud Data Warehouse V2 advanced target properties that you
can configure in a mass ingestion task:
- Warehouse: Overrides the Snowflake Cloud Data Warehouse name specified in the Snowflake Cloud Data Warehouse V2 connection.
- Target Table Name: The table name of the Snowflake Cloud Data Warehouse target table.
- Role: Overrides the Snowflake Cloud Data Warehouse user role specified in the connection.
- Pre SQL: SQL statement to run on the target before the start of the write operation.
- Post SQL: SQL statement to run on the target table after the write operation completes.
- Truncate Target Table: Truncates the database target table before inserting new rows. Select one of the following options:
  - True. Truncates the target table before inserting all rows.
  - False. Inserts new rows without truncating the target table.
  Default is false.
- File Format and Copy Options: The copy option and the file format to load the data to Snowflake Cloud Data Warehouse. The copy option specifies the action that the task performs when an error is encountered while loading data from a file. You can specify the following copy option to abort the COPY statement if any error is encountered:
  ON_ERROR = ABORT_STATEMENT
  When you load files, you can specify the file format as CSV and define the rules for the data files. The task uses the specified file format and rules while bulk loading data into Snowflake Cloud Data Warehouse tables. Specify the following format:
  file_format = TYPE = {CSV} [formatTypeOptions]
- External Stage: Specifies the external stage directory to use for loading files into Snowflake Cloud Data Warehouse tables. Ensure that the source folder path you specify is the same as the folder path provided in the URL of the external stage for the specific connection type in Snowflake Cloud Data Warehouse. Applicable when the source for mass ingestion is Microsoft Azure Blob Storage or Amazon S3. The external stage is mandatory when you use the Microsoft Azure Blob Storage V3 connection type, but is optional for Amazon S3 V2. If you do not specify an external stage for Amazon S3 V2, Snowflake Cloud Data Warehouse creates an external stage by default.
- File Compression: Determines whether files are compressed before they are transferred to the target directory. The following options are available:
  - None. Files are not compressed.
  - GZIP. Files are compressed using GZIP compression.
  Applicable for all sources that the mass ingestion task supports except Microsoft Azure Blob Storage V3 and Amazon S3 V2.
Select a Snowflake Cloud Data Warehouse V2 connection in a mass ingestion task and then specify the copy
option and the file format in the target options to determine how to load the files to a Snowflake Cloud Data
Warehouse target table.
The copy option specifies the action that the task performs when an error is encountered while loading data
from a file.
You can specify the following copy option to abort the COPY statement if any error is encountered:
ON_ERROR = ABORT_STATEMENT
Note: The mass ingestion task for Snowflake Cloud Data Warehouse V2 is certified for only the
ABORT_STATEMENT for ON_ERROR copy option.
When you load files, you can specify the file format as CSV and define the rules for the data files. The task
uses the specified file format and rules while bulk loading data into Snowflake Cloud Data Warehouse tables.
Specify the following format:
file_format = TYPE = {CSV} [formatTypeOptions]
The following list describes some of the format type options:
• RECORD_DELIMITER = '<character>' | NONE. Single character string that separates records in an input
file.
• FIELD_DELIMITER = '<character>' | NONE. Specifies the single character string that separates fields
in an input file.
• FILE_EXTENSION = '<string>' | NONE. Specifies the extension for files unloaded to a stage.
• SKIP_HEADER = <integer>. Number of lines at the start of the file to skip.
• DATE_FORMAT = '<string>' | AUTO. Defines the format of date values in the data files or table.
• TIME_FORMAT = '<string>' | AUTO. Defines the format of time values in the data files or table.
• TIMESTAMP_FORMAT = '<string>' | AUTO. Defines the format of timestamp values in the data files or
table.
For example, you want to load files from Amazon S3 to Snowflake Cloud Data Warehouse and create a CSV
file format that uses the pipe character as the field delimiter and skips the header line in the data files.
Specify the following file format: file_format = (type = csv field_delimiter = '|' skip_header = 1)
You can specify both the copy options and file format by using the following character: &&
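For illustration, the copy option and the file format might be combined with the && separator as in the following sketch. The delimiter and header values are illustrative:
file_format = (type = csv field_delimiter = '|' skip_header = 1) && ON_ERROR = ABORT_STATEMENT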
For more information about the various file formats that you can specify and the copy option, see the
Snowflake Cloud Data Warehouse documentation at the following website:
https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-table.html#copy-options-copyoptions
External stage
When you configure a mass ingestion task to load files from a Microsoft Azure Blob Storage or Amazon S3
source to the Snowflake Cloud Data Warehouse tables, specify the external staging directory to use in
Snowflake.
You must specify the external stage name for the specific connection type that you want to use in the Target
Options section in the mass ingestion task.
The external stage field value is mandatory when you run a mass ingestion task to load files from Microsoft
Azure Blob Storage to Snowflake Cloud Data Warehouse where the connection type in the source is Microsoft
Azure Blob Storage V2. When the source connection type is Amazon S3 V2, and you do not specify an
external stage for Amazon S3 V2 in the Snowflake Cloud Data Warehouse target options, Snowflake creates
an external stage directory by default.
Ensure that the source directory path in the Source Options of the mass ingestion task is the same as the
directory path provided in the URL of the external stage created for the Microsoft Azure Blob Storage V2 or
Amazon S3 V2 connection in Snowflake Cloud Data Warehouse.
For example, an external stage for Microsoft Azure Blob Storage created using an Azure account name and a
blob container with a folder path has the following stage URL: 'azure://<Blob SAS Token URL>/<blob container>/<folder path>'
The following image shows the stage name and the stage URL for a Microsoft Azure Blob Storage V2
connection in Snowflake Cloud Data Warehouse:
When you create a mass ingestion job, in the Folder Path field in the Source Options of the Microsoft Azure
Blob Storage V2 source, specify the following <Blob Container>/<folder path> path from the stage URL:
/snowflakemi/MI
The following image shows the specified source folder path in the Source Options section:
In the Target Options for Snowflake Cloud Data Warehouse V2, specify the following name of the created
external stage: MFT_BLOB1
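For reference, an external stage such as MFT_BLOB1 might be created in Snowflake with a statement similar to the following sketch. The account host, container, folder path, and SAS token are placeholder values:
CREATE STAGE MFT_BLOB1
URL = 'azure://<account>.blob.core.windows.net/snowflakemi/MI'
CREDENTIALS = (AZURE_SAS_TOKEN = '<sas_token>');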
3. In the Definition tab, configure the following properties:
- Task Name: Name of the mass ingestion task. The names of mass ingestion tasks must be unique within the organization. Task names can contain alphanumeric characters, spaces, and underscores. Names must begin with an alphabetic character or underscore. Task names are not case sensitive.
- Runtime Environment: Runtime environment that contains the Secure Agent used to run the task. The Mass Ingestion application must run on the Secure Agent.
4. Click Next.
The Source tab appears.
5. On the Source Details page, select a connection from the list of configured connections in the Connection
Type field.
You can select one of the following sources that the mass ingestion task supports:
• Local folder
• Advanced FTP V2
• Advanced FTPS V2
• Advanced SFTP V2
• Amazon S3 V2
• Google Cloud Storage V2
• Microsoft Azure Blob Storage V3
• Hadoop Files V2
• Microsoft Azure Data Lake Store V3
• Azure Data Lake Store Gen2
6. Click View to view the connection details.
7. Click Test to test the connection in the View Connection dialog.
8. Click Next.
The Target tab appears.
9. In the Target Details section, select the Connection Type as Snowflake Cloud Data Warehouse V2 and
configure the Snowflake Cloud Data Warehouse target properties.
10. Click View to view the connection details.
11. Click Test to test the connection in the View Connection dialog.
12. Click Next.
The Schedule tab appears where you can select whether to run the task on a schedule or without a
schedule.
13. Click Run this task on schedule to run a task on a schedule and select the schedule you want to use.
If you want to remove a task from a schedule, click Do not run this task on a schedule.
14. Click Finish to save and close the task wizard.
1. To run a mass ingestion task manually, on the Explore page, navigate to the task. In the row that
contains the task, click Actions and select Run.
Alternatively, you can run the task manually from the Task Details page. To access the Task Details
page, click Actions and select View. In the Task Details page, select Run.
2. To run a mass ingestion task on a schedule, edit the task in the mass ingestion task wizard to associate
the task with a schedule.
Chapter 5
Snowflake pushdown
optimization
This chapter includes the following topics:
• Pushdown optimization, 31
• Pushdown optimization functions, 31
• Configuring a Snowflake ODBC connection, 34
• Create an ODBC connection, 38
• Cross-schema pushdown optimization, 39
• Rules and guidelines for functions in pushdown optimization, 40
• Troubleshooting, 41
Pushdown optimization
When you use a Snowflake ODBC connection and select the ODBC subtype as Snowflake, you can configure
pushdown optimization in a mapping to push transformation logic to the Snowflake Cloud Data Warehouse
source or target database. The ODBC connection must use the Snowflake ODBC driver.
When you run a task configured for pushdown optimization, the task converts the transformation logic to an
SQL query. The task sends the query to the database, and the database executes the query. Use pushdown
optimization to improve the performance of the task.
You can configure full and source pushdown optimization in a Snowflake Cloud Data Warehouse mapping.
You can push functions to the Snowflake Cloud Data Warehouse database by using source-side or full pushdown optimization. Columns marked with a dash (-)
symbol indicate that the function cannot be pushed to the database.
The following table lists the pushdown operators that can be used in a Snowflake Cloud Data Warehouse
database:
Operator Pushdown
+ Supported
- Supported
* Supported
/ Supported
% Supported
|| Supported
> Supported
= Supported
>= Supported
<= Supported
!= Supported
AND Supported
OR Supported
NOT Supported
^= Supported
After you create a Snowflake ODBC connection, add the Pushdown Optimization property on the Advanced
Session Properties tab when you create a mapping task, and select Full or To Source in the Session Property
Value field. You cannot configure target-side pushdown optimization by using the Snowflake ODBC driver. To
verify that pushdown optimization has taken place, check the session log for the job. In Monitor,
view the log for jobs.
Snowflake supports Snowflake ODBC drivers on Windows and Linux systems. You must install the Snowflake
ODBC 64-bit driver based on your system requirement.
1. Download the Snowflake ODBC driver from your Snowflake Cloud Data Warehouse account.
You must download the Snowflake ODBC 64-bit driver.
2. Install the Snowflake ODBC driver on the machine where the Secure Agent is installed.
3. Open the folder in which the ODBC data source file is installed.
4. Run the odbcad32.exe file.
The ODBC Data Source Administrator dialog box appears.
5. Click System DSN.
6. Click Add.
The Create New Data Source dialog appears. The following image shows the Create New Data Source
dialog where you can select the Snowflake Cloud Data Warehouse data source:
Property Description
Tracing (0-6) Determines the amount of detail that appears in the log file. You can specify the following values:
- 0. Disable tracing.
- 1. Fatal error tracing.
- 2. Error tracing.
- 3. Warning tracing.
- 4. Info tracing.
- 5. Debug tracing.
- 6. Detailed tracing.
After you configure the Snowflake ODBC connection, you must create an ODBC connection to connect to
Snowflake Cloud Data Warehouse.
1. Download the Snowflake ODBC driver from your Snowflake Cloud Data Warehouse account.
You must download the Snowflake ODBC 64-bit driver.
2. Install the Snowflake ODBC driver on the machine where the Secure Agent is installed.
3. Configure the odbc.ini file properties in the following format:
[ODBC Data Sources]
dsn_name=driver_name
[dsn_name]
Driver=path/driver_file
Description=
Server=domain_name
role=role
4. Specify the following properties in the odbc.ini file:
Property Description
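For illustration, a completed odbc.ini entry might look like the following sketch. The DSN name, driver path, server host, and role are placeholder values, and SnowflakeDSIIDriver is the driver name that the Snowflake ODBC driver installation typically registers:
[ODBC Data Sources]
snowflake_dsn=SnowflakeDSIIDriver
[snowflake_dsn]
Driver=/usr/lib64/snowflake/odbc/lib/libSnowflake.so
Description=Snowflake ODBC data source
Server=<account_name>.snowflakecomputing.com
role=SYSADMIN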
After you configure the Snowflake ODBC connection, you must create an ODBC connection to connect to
Snowflake Cloud Data Warehouse.
Perform the following steps to create a Snowflake ODBC connection on the Connections page:
Property Description
4. Configure the following connection details in the ODBC Connection Properties section:
- Runtime Environment: Runtime environment that contains the Secure Agent you can use to access the system.
- User Name: User name to log in to the Snowflake Cloud Data Warehouse database.
- Data Source Name: The name of the ODBC data source that you created for the Snowflake Cloud Data Warehouse database.
- Code Page: The code page of the database server or flat file defined in the connection.
- ODBC Subtype: Enter the value of the ODBC Subtype field as Snowflake.
- Driver Manager for Linux: The driver that the Snowflake ODBC driver manager sends database calls to.
To use cross-schema pushdown optimization, create two Snowflake Cloud Data Warehouse ODBC
connections and specify the schema in each connection. Ensure that the schema in the source connection is
different from the schema in the target connection, but both the schemas must belong to the same database.
When you configure pushdown optimization for the mapping task, enable cross-schema pushdown
optimization in the advanced session properties. By default, the check box is selected.
1. Create the following two Snowflake Cloud Data Warehouse ODBC connections, each defined with a
different schema:
a. Create an sf_odbc1 Snowflake Cloud Data Warehouse ODBC connection and specify CQA_SCHEMA1
schema in the connection properties.
b. Create sf_odbc2 Snowflake Cloud Data Warehouse ODBC connection and specify CQA_SCHEMA2
schema in the connection properties.
2. Create a Snowflake Cloud Data Warehouse mapping, m_sf_pdo_acrossSchema. Perform the following
tasks:
a. Add a Source transformation and include a Snowflake Cloud Data Warehouse source object and
connection sf_odbc1 to read data using CQA_SCHEMA1.
b. Add a Target transformation and include a Snowflake Cloud Data Warehouse target object and
connection sf_odbc2 to write data using CQA_SCHEMA2.
3. Create a Snowflake Cloud Data Warehouse mapping task, and perform the following tasks:
a. Select the configured Snowflake Cloud Data Warehouse mapping, m_sf_pdo_acrossSchema.
b. In the Advanced Options on the Schedule tab, add Pushdown Optimization and set the value to Full.
c. Select Enable cross-schema pushdown optimization.
The following image shows the configured Enable cross-schema pushdown optimization property:
• To push the TRUNC(DATE) function to the Snowflake database, you must define the date and format
arguments.
• The Snowflake aggregate functions accept only one argument, which is a field set for the aggregate
function. The agent ignores any filter condition defined in the argument. In addition, ensure that all fields
mapped to the target are listed in the GROUP BY clause.
• To push the TO_CHAR() function to the Snowflake database, you must define the date and format
arguments. You can use the following format arguments:
- DDD
- HH
- MI
- MM
- SS
- YYYY
Troubleshooting
When you select the truncate table option for a Snowflake target that contains special characters and enable pushdown
optimization, the mapping fails.
You must add the custom property AddQuotesAlways and set the value to Yes for the Data Integration
Server in the Secure Agent properties.
The following image shows the custom property that you must configure:
You can then run the Snowflake ODBC mapping with the truncate table option and with pushdown
optimization enabled.
Chapter 6
Data type reference
• Native data types. Snowflake Cloud Data Warehouse V2 native data types appear in the source and target transformations
when you choose to edit metadata for the fields.
• Transformation data types. Set of data types that appear in the transformations. These are internal data
types based on ANSI SQL-92 generic data types, which the Secure Agent uses to move data across
platforms. They appear in all transformations in a mapping.
When the Secure Agent reads source data, it converts the native data types to the comparable transformation
data types before transforming the data. When the Secure Agent writes to a target, it converts the
transformation data types to the comparable native data types.
The following list shows each Snowflake Cloud Data Warehouse V2 data type, the corresponding transformation data type, and the range and description:
- FLOAT (DOUBLE, DOUBLE PRECISION, REAL, FLOAT, FLOAT4, FLOAT8): transformation data type double. Floating point numbers with double precision (64 bit). Maximum value: 1.7976931348623158e+307. Minimum value: -1.79769313486231E+307.
- NUMBER (DECIMAL, NUMERIC): transformation data type decimal. Number with 28-bit precision and scale.
- NUMBER (INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT): transformation data type decimal. Number with 28-bit precision and scale as 0. Maximum value: 9.99999999999999E+27. Minimum value: -9.99999999999999E+26.
Index

C
Cloud Application Integration community
  URL 5
Cloud Developer community
  URL 5
connections
  Snowflake V2 10
connector overview 7

D
Data Integration community
  URL 5
data types
  native data types 42
  overview 42
  transformation data types 42

F
filter 11

L
local staging files
  directory configuration 21
  JVM option 21
lookup
  database 15
  multiple matches 15
  role 15
  schema 15
  warehouse 15

M
maintenance outages 6
mappings
  database 11, 13
  example 17
  lookup overview 15
  lookup properties 15
  Post-SQL 11, 13
  Pre-SQL 11, 13
  role 11, 13
  schema 11, 13
  source properties 11
  target properties 13
  warehouse 11, 13
mass ingestion task
  example 28
  overview 22
  running 30
  viewing details 30
mass ingestion tasks
  prerequisites 22

N
not parameterized sort 11

P
parameterized sort 11
partitioning
  configuring key range partitioning 13
pushdown optimization
  full pushdown 31
  GET_DATE_PART() 40
  ODBC subtype 38
  pushdown functions 31
  pushdown operators 31
  REPLACECHR() 40
  REPLACESTR() 40
  rules and guidelines 40
  source pushdown 31
  SYSDATE() 40
  SYSTIMESTAMP() 40
  target pushdown 31
  TO_BIGINT() 40
  TO_CHAR() 40
  TO_INTEGER() 40
  TRUNC(DATE) 40

S
Snowflake Cloud Data Warehouse V2
  connection properties 10
  connections 9
  connector 7
  lookup 7
  mapping example 17
  object types 7
  source 7
  target 7
  task types 7
Snowflake Cloud Data Warehouse V2 connector
  datetime format 19
  rules and guidelines 19
Snowflake Cloud Data Warehouse V2 targets
  mass ingestion task 23
  properties 23
Snowflake ODBC connection
  configuration on linux 37
  configuration on windows 34
  odbc.ini file 37
  system DSN 34
Snowflake ODBC driver 34
sort 11
status
  Informatica Intelligent Cloud Services 6
system status 6

T
trust site
  description 6

U
upgrade notifications 6

W
web site 5