Partitioning in Informatica Cloud (IICS) - ThinkETL

The document discusses partitioning in Informatica Cloud, including how it works, the types of partitioning supported, and guidelines for using partitioning. Partitioning allows parallel processing of data through separate pipelines, improving performance. Key range partitioning distributes data based on a partitioning key, while fixed partitioning is for non-relational sources. Higher numbers of partitions and records generally yield better performance gains.

Uploaded by

MILIND

12/25/23, 10:30 PM Overview of Partitioning in Informatica Cloud (IICS) - ThinkETL

ThinkETL


Overview of Partitioning in Informatica Cloud (IICS)


February 20, 2021

https://thinketl.com/overview-of-partitioning-in-informatica-cloud-iics/#:~:text=Partitioning is nothing but enabling,partitioning occurs throughout t… 1/14



Contents
1. Introduction
2. How Partitioning works in Informatica Cloud?
3. Types of Partitions supported in Informatica Cloud
3.1 Key Range Partitioning
3.2 Fixed Partitioning
4. Guidelines to Informatica Cloud Partitioning
5. Informatica Cloud Partitioning vs Informatica Powercenter Partitioning
6. Conclusion

1. Introduction
Informatica offers several performance tuning and optimization techniques. Partitioning is one such option in Informatica Cloud that lets you optimize the performance of mapping tasks.

In this article, let us look at the partitioning options available in Informatica Cloud Data Integration, their advantages and limitations, and how they differ from the partitioning available in Powercenter.

2. How Partitioning works in Informatica Cloud?


The Data Transformation Manager (DTM) process is the operating system process that the Data Integration Service creates to carry
out session tasks by creating and managing threads.

For each Flow Run Order in a mapping, several threads are created depending on the design of the mapping and the transformations
used.

The following shows a simple mapping in Informatica Cloud with no partitioning enabled.

With no partitioning enabled, the mapping is considered to run on a single partition. In this case, the DTM process creates one reader thread, one transformation thread, and one writer thread to process the data.


The reader thread controls how the Data Integration Service process extracts source data.
The transformation thread controls how the Data Integration Service process processes the data.
The writer thread controls how the Data Integration Service process loads data into the target.

With partitioning enabled, you can select the number of partitions for the mapping. The DTM process then creates a reader thread, a transformation thread and a writer thread for each partition, so the data is processed concurrently and the execution time of the task is reduced.

Partitioning is nothing but enabling the parallel processing of the data through separate pipelines.
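To make the thread model above more concrete, here is a minimal Python sketch of one reader, one transformation and one writer thread per partition, connected by queues. It only illustrates the idea and is not how the DTM process is implemented. The function names, the upper-casing "transformation" and the in-memory list used as a target are all assumptions made for the example.

```python
import queue
import threading

SENTINEL = object()  # marks the end of one partition's data


def reader(rows, out_q):
    # Reader thread: extracts the rows of one partition.
    for row in rows:
        out_q.put(row)
    out_q.put(SENTINEL)


def transformer(in_q, out_q):
    # Transformation thread: a stand-in transformation (upper-casing).
    while (row := in_q.get()) is not SENTINEL:
        out_q.put(row.upper())
    out_q.put(SENTINEL)


def writer(in_q, target, lock):
    # Writer thread: loads rows into the shared (hypothetical) target.
    while (row := in_q.get()) is not SENTINEL:
        with lock:
            target.append(row)


def run_pipeline(partitions):
    """Run one reader/transformer/writer thread trio per partition."""
    target, lock, threads = [], threading.Lock(), []
    for rows in partitions:
        q1, q2 = queue.Queue(), queue.Queue()
        threads += [
            threading.Thread(target=reader, args=(rows, q1)),
            threading.Thread(target=transformer, args=(q1, q2)),
            threading.Thread(target=writer, args=(q2, target, lock)),
        ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return target


if __name__ == "__main__":
    # Three partitions, so nine threads in total.
    print(sorted(run_pipeline([["a", "b"], ["c"], ["d", "e"]])))
```

With a single partition the sketch starts three threads. With three partitions it starts nine, and the order in which rows reach the target is no longer guaranteed.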

Partitions are enabled by configuring the Source transformation in the Mapping Designer. When partitions are configured in the Source transformation, partitioning occurs throughout the mapping.

3. Types of Partitions supported in Informatica Cloud


There are two major partitioning methods supported in Informatica Cloud Data Integration.

1. Key Range Partitioning


2. Fixed Partitioning

The Key Range partitioning method is supported for relational sources and Fixed partitioning is supported for non-relational sources
such as flat file sources.

3.1 Key Range Partitioning


Key Range Partitioning distributes the data into multiple partitions based on the selected partitioning key and the ranges of values defined for it. You must select a field as the partitioning key and define the start and end range values for each partition.

The data types supported by Key Range Partitioning are

String
Number
Date/time

To enable the key range partitioning

Navigate to Partitions tab of a relational source in Source transformation.


Select Key Range as Partitioning type from drop down. The default value will be None.
Once key range partitioning type is selected, you will be prompted to select a source field as Partitioning Key.
After selecting the Partitioning Key, select the required number of partitions and specify the start and end range values for each
partition accordingly.

In the example below, which uses test data of one million employee records, the field EMPLOYEE_ID is selected as the Partitioning Key. Three partitions are created, and the start and end range values of each partition are as shown below.

Use a blank value for the start range to indicate the minimum value. Use a blank value for the end
range to indicate the maximum value.
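The range values are entered by hand in the Partitions tab. As a rough aid for choosing them, the Python sketch below splits a numeric key span into contiguous ranges. The helper name key_ranges is hypothetical and None stands in for a blank value. It assumes the start value is inclusive and the end value exclusive, which is what the >= and < comparisons in the queries generated for this example suggest.

```python
def key_ranges(min_key, max_key, partitions):
    """Split [min_key, max_key] into contiguous (start, end) ranges.

    Assumption: start is inclusive and end is exclusive.
    None represents a blank value (open-ended range).
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    total = max_key - min_key + 1
    step = -(-total // partitions)  # ceiling division
    # Interior boundaries, shared by adjacent partitions so no key is skipped.
    bounds = [min_key + i * step for i in range(1, partitions)]
    starts = [None] + bounds
    ends = bounds + [None]
    return list(zip(starts, ends))


if __name__ == "__main__":
    for start, end in key_ranges(1, 1_000_000, 3):
        print(start, end)
```

Because each interior boundary is used both as the end of one range and the start of the next, every key value falls into exactly one partition. The sketch does not guard against asking for more partitions than there are key values.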

Without partitioning enabled, all 1 million records are read and processed in a single pipeline.

With partitioning enabled, three different pipelines are created and the data is read and processed concurrently. Informatica creates three different queries, one per partition, based on the value ranges specified:

SELECT HR.EMPLOYEE.EMPLOYEE_ID, HR.EMPLOYEE.FIRST_NAME, HR.EMPLOYEE.LAST_NAME, HR.EMPLOYEE.EMAIL,
HR.EMPLOYEE.PHONE_NUMBER, HR.EMPLOYEE.HIRE_DATE, HR.EMPLOYEE.JOB_ID, HR.EMPLOYEE.SALARY,
HR.EMPLOYEE.MANAGER_ID, HR.EMPLOYEE.DEPARTMENT_ID
FROM HR.EMPLOYEE
WHERE HR.EMPLOYEE.EMPLOYEE_ID < 333333

SELECT HR.EMPLOYEE.EMPLOYEE_ID, HR.EMPLOYEE.FIRST_NAME, HR.EMPLOYEE.LAST_NAME, HR.EMPLOYEE.EMAIL,
HR.EMPLOYEE.PHONE_NUMBER, HR.EMPLOYEE.HIRE_DATE, HR.EMPLOYEE.JOB_ID, HR.EMPLOYEE.SALARY,
HR.EMPLOYEE.MANAGER_ID, HR.EMPLOYEE.DEPARTMENT_ID
FROM HR.EMPLOYEE
WHERE HR.EMPLOYEE.EMPLOYEE_ID >= 333334 AND HR.EMPLOYEE.EMPLOYEE_ID < 666666

SELECT HR.EMPLOYEE.EMPLOYEE_ID, HR.EMPLOYEE.FIRST_NAME, HR.EMPLOYEE.LAST_NAME, HR.EMPLOYEE.EMAIL,
HR.EMPLOYEE.PHONE_NUMBER, HR.EMPLOYEE.HIRE_DATE, HR.EMPLOYEE.JOB_ID, HR.EMPLOYEE.SALARY,
HR.EMPLOYEE.MANAGER_ID, HR.EMPLOYEE.DEPARTMENT_ID
FROM HR.EMPLOYEE
WHERE HR.EMPLOYEE.EMPLOYEE_ID >= 666667
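The way each range turns into a WHERE clause can be sketched in Python as follows. This only mirrors the pattern visible in the three queries above and is not Informatica's actual query generator. range_predicate and find_gaps are hypothetical helper names, and None again stands in for a blank range value.

```python
def range_predicate(column, start, end):
    """Build the filter for one partition: start inclusive, end exclusive."""
    parts = []
    if start is not None:
        parts.append(f"{column} >= {start}")
    if end is not None:
        parts.append(f"{column} < {end}")
    return " AND ".join(parts)


def find_gaps(ranges):
    """Return (end, next_start) pairs where adjacent ranges do not meet."""
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
        if prev_end != next_start
    ]


# The ranges used in this article's example.
ARTICLE_RANGES = [(None, 333333), (333334, 666666), (666667, None)]

if __name__ == "__main__":
    for start, end in ARTICLE_RANGES:
        print(range_predicate("HR.EMPLOYEE.EMPLOYEE_ID", start, end))
    print(find_gaps(ARTICLE_RANGES))
```

Read literally, the three WHERE clauses above match no row whose EMPLOYEE_ID is exactly 333333 or 666666, because the end of one range and the start of the next differ by one. If the end value is indeed exclusive, it is safer for adjacent partitions to share the same boundary value (for example, end 333334 and next start 333334), and worth comparing source and target row counts after the first run.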

The final source/target results are as below

In my testing, loading a million records from an Oracle database into a flat file took 28 seconds without partitioning. With partitioning enabled, it took 20 seconds. If the mapping also involved transformation logic, the difference would be larger, because that logic would also be processed concurrently in three different pipelines.

In general, the more records there are to process, the greater the performance gain that can be obtained from partitioning.

3.2 Fixed Partitioning


Fixed Partitioning can be enabled for sources that are not relational or that do not support key range partitioning.

To enable Fixed partitioning

Navigate to Partitions tab of a non-relational source in Source transformation.


Select Fixed as Partitioning type from drop down. The default value will be None.
Once the Fixed partitioning type is selected, you will be prompted to select the number of partitions in the mapping.
Enter the number of desired partitions.

In the example below, three partitions are created on a flat file source with one million records.


In my testing, it took 3 minutes 25 seconds to load a million records from a flat file into an Oracle table without partitioning. With partitioning enabled, it took 1 minute 39 seconds to load the data.

The difference in processing time is clearly evident in this case.
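To put the two test results side by side, the small Python sketch below computes the speedup ratio and the percentage of run time saved from the timings quoted in this article. The helper name speedup is hypothetical, and the numbers are single runs from the author's environment, so they are indicative only.

```python
def speedup(before_s, after_s):
    """Return (speedup ratio, percent of run time saved)."""
    ratio = before_s / after_s
    saved_pct = (before_s - after_s) / before_s * 100
    return ratio, saved_pct


if __name__ == "__main__":
    # Key range test: Oracle to flat file, 28 s vs 20 s.
    print(speedup(28, 20))
    # Fixed partitioning test: flat file to Oracle, 3 min 25 s vs 1 min 39 s.
    print(speedup(3 * 60 + 25, 1 * 60 + 39))
```

With three partitions the theoretical best case would be a ratio of about 3, so both results fall short of it. That is expected, since a run also includes work that partitioning does not parallelize.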

4. Guidelines to Informatica Cloud Partitioning


The maximum number of partitions that can be specified is 64.
Consider the number of records to be passed through the mapping when determining an appropriate number of partitions.
For a small number of records, partitioning might not be advantageous.
If the mapping includes multiple sources, specify the same number of partitions for each source so that data is processed in consistent pipelines.
Partitioning is not supported when the mapping uses a parameterized source or source query.
Partitioning is not supported when the mapping includes a Hierarchy Parser or Web Services transformation.
When a Sorter transformation is in a mapping with partitioning enabled, the task sorts data in each partition separately.
Sequence numbers generated by Normalizer and Sequence Generator transformations might not be sequential for a partitioned source, but they are unique.
Parameters cannot be used for key range values.
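The Sorter guideline above is easy to overlook, so here is a small Python sketch of what "sorts data in each partition separately" implies. It is an illustration only: the sample values are made up, and how the engine combines partition outputs at the target is not described in this article, so plain concatenation is assumed here.

```python
import heapq


def sort_per_partition(partitions):
    """Sort each partition independently, as a partitioned Sorter would."""
    return [sorted(p) for p in partitions]


if __name__ == "__main__":
    partitions = [[5, 1, 9], [4, 8, 2], [7, 3, 6]]
    sorted_parts = sort_per_partition(partitions)

    # Assumed for illustration: outputs are simply appended one after another.
    concatenated = [x for part in sorted_parts for x in part]
    print(concatenated)

    # A global order needs an extra merge step over the sorted partitions.
    print(list(heapq.merge(*sorted_parts)))
```

Each partition comes out ordered, but the combined output is not ordered overall unless the key ranges themselves are ordered and the outputs are combined in that order, or a separate merge step is applied.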

5. Informatica Cloud Partitioning vs Informatica Powercenter Partitioning


Informatica Powercenter offers a wider range of partitioning options than Informatica Cloud Data Integration. The partitioning types supported in Powercenter are:
Database partitioning
Round-robin
Pass-through
Hash auto-keys
Hash user keys
Key range
Partition points mark the boundaries between threads in a pipeline and divide the pipeline into stages. Informatica Powercenter supports partition points at various transformations in the mapping. When you add partition points, you increase the number of transformation threads, which can increase session performance. In Informatica Cloud, however, you can use only the default partition points created by the Data Integration Service.
When you create a partition point at a transformation, the Workflow Manager in Powercenter sets the default partition type, which you can change depending on the transformation type. In Informatica Cloud, partitioning can be configured only at the Source transformation.
Informatica Powercenter visually shows the default partition points and pipeline stages of a mapping in the Workflow Manager. Informatica Cloud provides no visual representation of the default partition points.

6. Conclusion
Partitioning is a great way to improve task performance when working with large amounts of data. At present, Informatica Cloud offers only two partitioning types, and only on the Source transformation. This makes it easy to work with, as you need not worry about selecting an optimal partitioning type from a broad list at various partition points. Just make sure that a suitable number of partitions is selected for the amount of data you are processing, and you are good to go.

Informatica Cloud still has a lot of catching up to do in terms of partitioning options compared to Powercenter, and I hope to see more of them added in future releases.


8 thoughts on “Overview of Partitioning in Informatica Cloud (IICS)”

Omkar
February 26, 2021 at 5:46 pm

Hello Thanks for the explanation.


One doubt: when we want to upload a big file (> 1 GB) to Amazon S3, the file can be uploaded in multiple parts into the Amazon S3 bucket using the Amazon S3 V2 connector. The Amazon S3 connector works on ‘pass-through partitioning’ and allows a file to be uploaded in multiple parts when using Amazon S3 as a target.
But is that possible when we are using custom queries at the source level? For example, I am using a source query to fetch the details from the table and loading the .csv and .control files to S3 from that mapping.


ThinkETL
February 28, 2021 at 7:10 pm

Currently Partitioning is not supported when using custom query in source transformation in IICS.


AK
April 17, 2021 at 1:16 am

Hi, I have a scenario like this:

I am reading data from 3 different regions (MySQL is the source database), merging them through a Union, and finally loading into an Oracle target. In this case, do I need to give the same number of partitions across all 3 source regions?


ThinkETL
April 18, 2021 at 1:29 am

Yes, if the mapping includes multiple sources, specify the same number of partitions for each source so that data is
processed in consistent pipelines.


MN
February 25, 2022 at 6:42 pm

Hi,

Is there a possibility for dynamic partitioning in Informatica cloud?


ThinkETL
March 14, 2022 at 12:01 am

No


MN
February 25, 2022 at 6:44 pm


Also, can you please share an example on how to use Key Range partitioning using string data type?


ThinkETL
March 14, 2022 at 12:01 am

It seems to be not supported currently. Alternatively you can assign a unique number for each key value of the string field on
which you want to partition and use that value while defining the Key Range partitioning.

Copyright © 2023 ThinkETL
