
IICS

The main engine for Cloud Data Integration (CDI) is the Data Integration Server; for Cloud Application Integration (CAI) it is the Process Server.

For Informatica PowerCenter we did not need outbound network connectivity, but for IICS we do, because the Secure Agent has to communicate with the IICS services running on the cloud.

On-premise server --> the Informatica server sits on top of a Unix server. Informatica PowerCenter is maintained by the licensing company itself, and any issue with PowerCenter is raised as a vendor request to Informatica.

It has the Designer, Workflow Manager, Workflow Monitor and Repository Manager clients.

On premise, a repository database (RDBMS) is maintained to store the metadata.

You pay for an enterprise license, irrespective of usage.

Informatica Cloud --> the metadata is maintained by the Informatica organization; IICS stores the metadata on Informatica's side.

It can connect to any data source, which is the main difference.

The Linux server it runs on is maintained by the organization.

Web-based cloud integration; no installation is needed.

On the cloud, the metadata is managed by Informatica.

Pay for what we use

The Informatica Cloud Secure Agent is a lightweight program that runs all tasks and enables
secure communication across the firewall between your organization and Informatica Intelligent
Cloud Services. When the Secure Agent runs a task, it connects to the Informatica Cloud
hosting facility to access task information. It connects directly and securely to sources and
targets, transfers data between them, orchestrates the flow of tasks, runs processes, and
performs any additional task requirement.

If the Secure Agent loses connectivity to Informatica Intelligent Cloud Services, it tries to
reestablish connectivity to continue the task. If it cannot reestablish connectivity, the task fails.
The Dev/QA Secure Agents are on premise, on the same machine/server.

The Prod Secure Agent is on a different server.

Business data won't be stored on the Informatica cloud or maintained by Informatica.

IICS is an Integration Platform as a Service (iPaaS).

Informatica enterprise iPaaS includes multiple cloud data management products designed to
accelerate productivity and improve speed and scale

Cloud data integration,cloud application integration,cloud data quality,mdm


Practical

For developing a mapping you first need to check in Administrator whether all services are up and running, mainly the Data Integration Server.

Add connections --> in IICS Administrator go to Connections, choose New Connection and add the connection details.

Here you need to mention the connection name, a description, and the connection type (which platform the connector is for, e.g. MySQL, Oracle, DB2, ...).

Runtime Environment: if your database is on your own machine, select the on-premise Secure Agent; otherwise the Informatica Hosted Agent.

In the username field give the database username, followed by the password in the next field.

Then the host, port and service name.

The main thing is the code page --> UTF-8, or another code page according to the values present in the data.

New --> used for creating any Informatica asset.

My Jobs --> for checking which tasks are running, which are completed, and any other status.

Depending on the requirement you develop any of the above.


Flat File Connector creation

If the file is present locally then use this, or else FTP/SFTP, and mention the path.

In IICS we don't have a Source Qualifier (the Source transformation takes its place) and no Update Strategy transformation (the same can be done in the Target transformation).

IMP

Informatica PowerCenter vs IICS:

Mapping --> m_ | Mapping --> m_

Session --> s_m_ | Mapping task --> mtt_, or mapping configuration task --> mct_, or mapping task --> mt_. If we want to use a sequence generator and need its cache, then a mapping task is a must.

Workflow --> wf_, worklets | Taskflow --> tf_, no worklets; inside one taskflow you can add another taskflow, which acts as a worklet (in PowerCenter terms).

Reusable sessions, reusable transformations | No reusable sessions, no reusable transformations; you need to use a mapplet even for a single reusable transformation.

In PowerCenter we need a workflow and a session to run the mapping | In IICS a mapping can be executed, a mapping task can be executed, and a taskflow can be executed.

Stored Procedure transformation separate from the SQL transformation | In the SQL transformation you get stored procedure support. When do we use the SQL transformation --> in case the project has PL/SQL functions, then to invoke the stored procedure we use this transformation.

SEQUENCE GENERATOR

1st day, primary key SEQ --> 1,2,3,4,5,6

2nd day it should continue from the previous run --> 7,8,9,10,11

So, to store the values of the 1st day in memory, we need to use a mapping task (in real time it is standard practice that each mapping is associated with a mapping task).

IICS Data Integration --> Bundles (used for the application integration part)

Interview Question

Can we run an Informatica PowerCenter workflow in IICS?

Yes, we can, under Tasks --> PowerCenter Task (import the PowerCenter workflow so you can run it as a Cloud Data Integration task). The main condition / only disadvantage is that the workflow should contain only one session, and you cannot edit anything in the imported task.

The IICS Migration Factory is most useful in the above case; learn it.

UNICODE CODE PAGE


While creating the flat-file connector you need to set the code page to UTF-8.

The main thing is that characters such as $, @, ó need to be written to the target exactly as they are present in the source data files.

To make these characters fit in the Name column, the important thing is to increase the precision and change the data type, e.g. Varchar(20) --> Nvarchar(100),

because we don't know how much space these Unicode characters may take on the database side.

Scheduling

Blackout period -->A blackout period prevents all scheduled tasks and linear taskflows from
running

You need to mention the start and end with timezone

The scheduling can be applied only on taskflow and mapping task

It will run according to schedule created in Administration-->schedule-->new schedule

Based on Repeats under Scheduling options it will give more flexibility of which time,day and
hour

Delimiter
If the data itself contains commas, ask the source team to change the delimiter to something different, such as double quotes (") or pipe (|).

Example:

Data incoming

101,Navin , kumar ,Vallamdas,20

In this, 'Navin , kumar' is one actual data value; it is not separate values.

So instead of the comma we can use another delimiter to avoid issues while loading the target:

101|Navin , kumar|Vallamdas|20

Now it will consider 'Navin , kumar' as a single data value.

In the Source transformation in Informatica:

Native Type column --> source-side data type

Type --> Informatica data type

Session logs are stored on the cloud; they do not get overwritten.

In PowerCenter we had the Debugger to view how the data is flowing and to find errors; in IICS we have Preview Data instead, for the same purpose.

Create mapping task --> 1. Definition: mention the mapping details.

2. Schedule --> you can mention the schedule you created.

3. Email Notification --> for getting an email on certain job outcomes:

a.Failure email notification

b.Warning email notification

c.Success Email notification

Normally we configure the failure email so that on failure it notifies the prod support team.

4.Advanced Options

Pre-processing commands --> e.g. unzip the archived file before the load.

Post-processing commands --> e.g. zip the file afterwards.

Maximum number of log files --> 10 means the last 10 log files are saved.

In PowerCenter there was only 1 log file; it used to overwrite the file.

If any services are only partially running, the test connection will not succeed, so make sure all services are up and running.

Indirect file loading (PC) --> File List (IICS)

In an IICS file list we need to mention all of our files in config.txt.

Config.txt will contain data

SRC_FF_EMPLOY1.csv

SRC_FF_EMPLOY2.csv

D:\Downloads\SRC_FF_EMPLOY3.csv

Informatica reads the config.txt file and goes to the file paths mentioned there, if a file is at a different location than the config file.

For this, all three files mentioned in the config should have the same schema (the same flat-file structure).

Important Realtime Project:


In case your source system places 2 files on the 1st day and 10 files on the 2nd day, the file count changes dynamically from day to day. If the config file contains a file name and that file isn't present at that location while the mapping runs, the mapping fails.

To avoid this we need to generate the config file dynamically: we can use Unix/bash pre-processing commands (executed before the session starts) in the mapping task to generate the config file. This must be done before reading the data from the source through that config file.

If the column names are different, the field mapping needs to be done manually;

Smart Map may go wrong on lookup-related fields.

FLAT FILE FIXED WIDTH

Means each column has a fixed length of values.

Src file as below.

In this file each field occupies a specific number of positions.

Step 1: Create Flat file width component


Step 2: Mention all the details, such as the sample flat-file connection and the sample object (the exact file). Under Configure Format Options you select the column spacing, and under Edit Columns you can set the column names.

BUT the only issue is that if you load the data and then select it in the database, you will notice that it also contains spaces.

To avoid that, we can use an Expression transformation between source and target and apply RTRIM and LTRIM on the columns you feel would have spaces.
Mapping source part

Checking dependencies in IICS

Select the object, click the three dots, and you get the Show Dependencies option.

In IICS there is no concept of object shortcuts.


Generating the target as a CSV file by creating the file at runtime

Use a dynamic file name, which means the file name is built from the expression you mention, in this case T_EMPLOYMAIN||TO_CHAR(SYSDATE,'MMDDYYYY')||'.csv'

We use TO_CHAR here to avoid file-naming errors.

Anyway, in a real-time project all files will be on a server reached over Secure File Transfer Protocol (SFTP), that is, on a Unix server.

When the source team sends data, the file will have a header and a footer; the footer carries the number of records in the file so that we can validate the file.

If you don't want a footer and header in the target: unless we issue a command, they won't be created.

But if they are present in the source data and you want to remove the header and footer, you need to process the file with a Unix command task.

WORKING WITH XML AND JSON FILE

Hierarchical Schema-->

Extensible Markup Language --> XML, which has tags and text data.

JSON (JavaScript Object Notation) has key-value pairs as data.

A CSV file follows the relational model and uses a delimiter to separate the column values.

But XML and JSON files have a hierarchical format with parent-child relationships.
1.Hierarchical Schema --> To define Structure(XML,JSON)

2.Hierarchy Parser --> To read data from json or xml convert to relational output

3.Hierarchy Builder --> To create hierarchical file output

The Hierarchy Parser transformation converts hierarchical input into relational output. The transformation processes XML or JSON input from the upstream transformation and provides relational output to the downstream transformation.

To parse complex hierarchical structures, consider using the Structure Parser transformation for more comprehensive handling of hierarchical file inputs.

You can configure a hierarchical schema that defines the expected hierarchy of the output data from a sample file or schema file. The Hierarchy Parser transformation converts hierarchical input based on the hierarchical schema that you associate with the transformation. You can use an existing hierarchical schema or configure one.

To use the hierarchy parser transformation in mapping,perform the following steps:

Create hierarchical schema.

Add Hierarchy parser transformation to the mapping

Associate a hierarchical schema with the hierarchy parser transformation

Configure the field mapping to select which schema elements provide relational outputs

Input file (contains the actual location of the file, similar to the config file in a file list) --> Hierarchy Parser --> relational model

Hierarchical schema (the structure of the file, defined up front).

Based on the hierarchical schema, the Hierarchy Parser converts the data into relational form.

The Hierarchy Builder creates an XML file when given relational data as input.
Step 1: Hierarchical schema creation --> New --> Components --> Hierarchical Schema

The schema root is the XML file's main root element.

Step 2: Mapping
In the source you mention the details as below; mention input.txt as the source object.

You need to select File, as it is an XML file, and select the schema you created.

In the field mapping there is the root data of the XML file.

As soon as we generate a file at runtime, an extra output file is also generated; we can stop generating that by using scripts.

FOR JSON DATA

It is the same as for XML files.

IN CASE A COLUMN HAS DATA SUCH AS '$12,000.00' AND YOU NEED TO REMOVE THE COMMA AND DOLLAR SIGN, USE AN EXPRESSION TRANSFORMATION AND CREATE AN EXPRESSION FOR THAT COLUMN:

o_price --> replacechr(0,replacechr(0,price,',',''),'$','')

The 0 in the first argument position means the match is case-insensitive.


Hierarchical Builder

Create a hierarchical schema.

For the sample file we don't even need values; we only need the structure, for reference.

Mapping with the source as a DB and the target as a dynamic XML file,

with the target using a dynamic file name.

No field mapping is available at the target.

EXCEL TO DB FILE LOADING

In this case we load an xlsx file into a database.

Two main components for it:

1. Intelligent Structure Model - we pass the Excel data to it for reference.

It is a visual representation for accessing the data present in the file.

2. Structure Parser --> the Structure Parser transformation transforms your input data into a user-defined structured format based on an intelligent structure model. You can use the Structure Parser transformation to analyze data such as log files, clickstreams, XML or JSON files, Word tables, and other unstructured or semi-structured formats.
ISM --> under Components.

The display will have more dropdowns if there is more than one sheet present in the Excel file.

Mapping

After creating the ISM model you need to set up the source file:

in Source select the CSV file and Single Object;

it contains the Filepath,

i.e. the actual path of the Excel file.

Then take the Structure Parser and select the ISM model in it; only after selecting it is the option enabled for connecting the source to the Structure Parser.

In the Fields option of the Structure Parser you need to map the source filepath field to the filepath field of the Structure Parser.
For the source you give the config.txt, which contains the file path of the xlsx file.

In the Structure Parser you mention the ISM model you created.

You need to select all the elements you want in the output.

Note: if you have more than one sheet in the Excel file, you need to create a corresponding number of targets in the mapping.

If there is a date column in the source, you need to add an Expression transformation between the Structure Parser and the target to make the date format compatible.

For example, the one below.

Then map the fields accordingly and run the mapping.

Note : Go through oracle database collect stats, performance tuning

SOURCE TRANSFORMATION—IT is similar to SQ transformation(PC)

It has a filter condition; we prefer to filter the data at the source itself so that performance is good. Example: you have 5 years of data in the source and you need to run ETL on only 1 year of data; in this case it is very helpful from a performance perspective.

In the filter option we just mention the filter condition; Informatica converts/adds it into the full SQL query.

In the Target transformation we have the option of creating the table automatically at runtime.

If we select a source table we get the filter and sort options, among others, in the Source transformation.
If we go for a SQL override, the filter and sort options get disabled.

If we want to execute SQL queries we can use the SQL override.

We can select Source Type as Query; it just fetches the data as mentioned in the query.

With source type Query we do not need to mention any particular table and we can write a join SQL query to be processed as the source itself.
We can create the target table at runtime; you need to select the connection so that the table gets created in that respective database schema --> Informatica decides which data types the target fields should have depending on the target connection type; if the target is Teradata, Snowflake, etc., it changes the target data types according to that platform.
filter option two types:

Sort option

We can add more columns to the sort: if the first names are the same we can then sort by last name (multi-column sorting).
We can set the source type to Multiple Objects, where more than one related table (object) is used as the source.

If no relationship is defined between the objects (tables), we need to create a custom relationship and add them.
FILTER TRANSFORMATION

--Active and Connected

The Filter transformation is used to filter out records anywhere in the pipeline.

Source filter --> used to filter records at the source; if the source is a flat file we cannot use this filter feature of the Source transformation.

Active: the number of output records can differ from the input --> 50 records passed as input, 30 records coming out.

Passive: e.g. an expression doing a trim on values --> 50 records passed as input and the same 50 records coming out.

Properties:

Filter condition: advanced filter condition.

TRUE (default) --> it passes all records.

FALSE --> it blocks all records; for example when we want to check the connection in production without loading any data to the target.

Condition --> passes only the records that satisfy it.

In the filter condition we write expressions, whereas in the source filter we write SQL conditions:

COMMISSION_PCT IS NOT NULL -----> source filter (SQ)

NOT ISNULL(COMMISSION_PCT) --> Filter

JOB_ID LIKE '%REP%' ---> source filter (SQ)

INSTR(JOB_ID,'REP') --> Filter

DEPARTMENT_ID=30 OR DEPARTMENT_ID=40 OR DEPARTMENT_ID=20

Instead use --> IN(DEPARTMENT_ID,30,40,20)


Salary column: 10000, 50000, NULL, 6000, 12000 --> with the filter NOT ISNULL(SALARY), the NULL row is dropped.

I have a flat file that has records for all countries, and I want to load only the India data to the target:

Source + SQ + filter (country = 'India' AND salary > 5000)

If the data contains mixed case (India, india, INDIA):

Source + SQ + filter (LOWER(country) = 'india' AND salary > 5000)

IN --> IN(department_id,40,50)

LIKE --> INSTR()

NOTE:

If we get dates in different formats in one source file, treat the column as a string and make the required changes to it,

or else convert everything to one format and then proceed.

Expression Transformation(Main bussiness Logic)

Passive, connected.

In an Expression transformation you cannot filter out any records.

You need to learn the SQL single-row functions, which are what get used in Informatica Expression transformations.

When we need some intermediate logic in IICS we use a variable port; it is not passed as output, and if we want it as output we need to use the variable port's value in another output port.

We can create an output field and define its expression with an IIF statement.
Scenario 1: source and target in the same database.

The main thing is that after loading the data we need to check whether the data was loaded correctly.

To do so:

target query

MINUS

source query, transformed according to the target rules.

The output of the MINUS should be zero rows (empty).

If the source and target are both in the same Oracle database we can go with the queries above.
If the query output is empty, no values, the data loaded without any error; this is the unit-testing approach.
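For example, a minimal sketch of such a unit-test query, assuming an Oracle source table EMPLOYEES and a target T_EMPLOYEES where the mapping concatenates the names into a full_name column (table, column and rule are illustrative assumptions):

-- Target minus transformed source: should return zero rows if the load is correct
SELECT employee_id, full_name, salary
  FROM t_employees
MINUS
SELECT employee_id,
       first_name || ' ' || last_name AS full_name,  -- source query transformed
       salary                                        -- according to the target rules
  FROM employees;

Running the same statement with the two SELECT blocks swapped additionally catches rows that are missing from the target.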

Scenario 2: source and target are in different database services.

We need to create another table in the target without any transformation, i.e. a 1-to-1 mapping,
and then run the MINUS query to check whether the data load is correct.

You get no output if the transformed data is correct;

if it is wrong, you will see rows when running the MINUS query.

Unit testing is the developer testing how the logic works;

when the QA team does the testing, it is different from developer testing.

Another way is exporting the source data to Excel and the target data to Excel and comparing both to validate whether the data loaded correctly.

QuerySurge tool: a data-validation tool.


Above are a few example fields of an Expression transformation:

v_seq generates a sequence number.

o_commission_pct checks for NULL; if NULL it is replaced with 0, otherwise the value stays as is.

o_phn_number adds '+1-' in front of the value and removes the '.'.

o_hiredate outputs whether the hire year was a leap year or not.

o_increment_salary first checks if the salary is less than 10,000; if so, the salary is increased by 20% (i.e. salary * 1.2), otherwise, if that condition isn't satisfied, the salary is increased by 10% (i.e. salary * 1.1).
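For reference, the same derivations can be cross-checked with plain SQL (Oracle syntax); a sketch assuming the standard EMPLOYEES columns, with the leap-year test simplified to the divisible-by-4 rule:

SELECT ROW_NUMBER() OVER (ORDER BY employee_id)         AS v_seq,
       NVL(commission_pct, 0)                           AS o_commission_pct,
       '+1-' || REPLACE(phone_number, '.', '')          AS o_phn_number,
       CASE WHEN MOD(EXTRACT(YEAR FROM hire_date), 4) = 0
            THEN 'LEAP YEAR' ELSE 'NOT LEAP YEAR' END   AS o_hiredate,
       CASE WHEN salary < 10000 THEN salary * 1.2       -- +20%
            ELSE salary * 1.1 END                       AS o_increment_salary
  FROM employees;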

IF YOU HAVE SPACES IN THE VALUES THEN YOU CAN GO WITH THE BELOW ONE:

INDIA --> expression --> INSTR(TRIM(UPPER(country)),'INDIA')

EXPRESSION MACROS

For example, you have data as below,

and in case there are spaces in the values, you will be applying trim functions to the columns to remove the extra spaces.

So in Informatica PowerCenter you need to create a separate output column for every column the logic is applied to,
but in IICS we can use the MACROS option.

Three types of Macros

Vertical

Horizontal

Hybrid Combination of Vertical+Horizontal

Vertical Macros

A vertical macro expands an expression vertically, i.e. it generates the same expression/condition on multiple incoming fields.

The macro is applied in a vertical manner, meaning we can apply the same logic to multiple columns.

For example, we want to trim and replace the '$' in all columns: by using a vertical macro we write the logic once and apply it to multiple columns, instead of writing it for each column separately.

INPUT MACRO FIELD --> where you mention all the columns that are input to the macro.

%...% --> this notation indicates a macro field.

In its configuration you mention all the columns you need as input to the macro.

You need to create an output macro field, where you mention the data type and an approximate length/precision.
Then you configure the expression to be applied to each column; the only difference is that the column name is replaced with the input macro field name.

If we validate the expression containing the macro field, it won't validate:

RTRIM(LTRIM(REPLACECHR(0,%in_PORT%,'$','')))

If you want to validate the expression, replace the input macro field with a column name and it will validate:

RTRIM(LTRIM(REPLACECHR(0,FIRST_NAME,'$','')))
AT TARGET SIDE:

You need to create a parameter value, then create a mapping task and do the field mapping of the macro fields in the MCT.
You need to map the macro columns, which will have the suffix _out, to the target columns.

Horizontal Macros
Use a horizontal macro to generate a single complex expression that includes a set of incoming fields or a set of constants.

In a horizontal macro, a macro input field can represent a set of incoming fields or a set of constants.

In a horizontal macro the expression represents calculations that you want to perform with incoming fields or constants.

The expression must include a horizontal expansion function.

A horizontal macro produces one result, so a transformation output field passes the result to the rest of the mapping. You configure the horizontal macro expression in the transformation output field.

The results of the expression pass to the downstream transformation with the default field rule. You do not need an additional field rule to include the results of a horizontal macro in the mapping.

To write the results of a horizontal macro to the target, connect the transformation output field to a target field in the Target transformation.

FLAG: %OPR_SUM[IIF(ISNULL(%in_PORT%),1,0)]%

At runtime the application expands the expression horizontally as follows, to include the fields that the macro input field represents:

IIF(ISNULL(First_Name),1,0) + IIF(ISNULL(Last_Name),1,0) + IIF(ISNULL(Phone_Number),1,0) + IIF(ISNULL(Job_ID),1,0)

SNOWFLAKE CONNECTION

Creating a Snowflake connection --> IICS > Administrator.

Under Add-On Connectors search for the Snowflake connector --> start the free trial, then go to Connections --> New Connection and select the connector type as below,

and enter all the Snowflake details such as username, password and account link.


Mapping part
JOINER TRANSFORMATION

This is used for heterogeneous sources, and in case you need to join two data pipelines at any point in the mapping.

IF YOU HAVE TWO PIPELINES AND THE DATA YOU GET IS ACTIVE AND PASSIVE, ONLY THEN CAN YOU JOIN DIRECTLY.

BUT IN CASE YOU ARE GETTING ACTIVE AND ACTIVE DATA, YOU NEED TO USE A SORTER TRANSFORMATION BEFORE THE JOINER AND TICK SORTED INPUT.

Source transformation --> for joining objects from the same source; this can be applied at the source only.
Always take the bigger table (by number of columns and number of records) as the detail table.

Always take the small table as the master so that there is less cache to store.

A smaller cache table improves performance.

4 types of joins (in the Venn diagrams, the left circle is the detail table and the right circle is the master table):

1. Normal Join (inner join)

2. Master Outer (left outer join) ==> all records from the detail and only matching records from the master

3. Detail Outer (right outer join) ==> all records from the master table and only matching records from the detail

4. Full Outer (all records)
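For comparison, the four join types map to SQL as below; a sketch assuming a detail table d and a master table m joined on a common id column:

SELECT * FROM d INNER JOIN m ON d.id = m.id;        -- Normal join
SELECT * FROM d LEFT  OUTER JOIN m ON d.id = m.id;  -- Master outer: all detail rows
SELECT * FROM d RIGHT OUTER JOIN m ON d.id = m.id;  -- Detail outer: all master rows
SELECT * FROM d FULL  OUTER JOIN m ON d.id = m.id;  -- Full outer: all rows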

The join condition is a common column between the two sources; you need to have a common column.

For N heterogeneous sources you need N-1 Joiners in the mapping.

Example: 2 heterogeneous sources --> 1 Joiner needed in the mapping.

A Joiner takes only 2 inputs; if there are more sources you need to chain more Joiners.

In case you need to filter out data, apply it with the source filter, as it reads less data from the source; and if it is the detail source, there is also less cache to store.

mapping

Joiner transformation condition of joining


SORTER TRANSFORMATION

It is an active and connected transformation.

It has a Distinct option in Properties, due to which it is an active transformation.

IN CASE YOU WANT TO REMOVE FULL-ROW DUPLICATES ANYWHERE IN THE PIPELINE, YOU CAN USE A SORTER.

It is similar to the SQL ORDER BY clause.

ASCII VALUES

A-->Z (65-90)

a-->z(97-122)

You need to select which column to sort by and the sort direction, ascending or descending.
Case Sensitive means that, if turned on, with values such as

Arun

Baba

arun

it sorts the uppercase values first and then the lowercase ones.

Distinct can be enabled if we want to remove full-row duplicates.

By default NULL is treated as high on the Oracle DB side; in case you need to change that, you need to enable the corresponding option.

Aggregate Function IN SQL

Aggregator Transformation

Active and Connected

Min()

Max()
Sum()

Avg()

Count()

Group by

First it sorts the data and then performs the aggregation on top of it.

To improve the performance of the Aggregator we can give it sorted data as input.

IN SQL

Select employee_id,first_name,sum(salary) from employees;

The above statement can only be executed in SQL by adding GROUP BY on the non-aggregated columns.

BUT IN INFORMATICA
the Aggregator allows you to get data for non-aggregated fields without applying a group by on the non-aggregated columns.
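For example, the valid SQL form needs every non-aggregated column in the GROUP BY clause:

-- Runs in the database only with GROUP BY on the non-aggregated columns
SELECT employee_id, first_name, SUM(salary)
  FROM employees
 GROUP BY employee_id, first_name;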

And in case you are using GROUP BY, you need a SORTER BEFORE the AGGREGATOR: the column you put in GROUP BY must be the same column used as the sort key in the Sorter transformation.

In case you enable Sorted Input and you don't have a group by mentioned in the Aggregator transformation, that is fine.

If there are multiple rows and you apply the SUM aggregate on salary, it returns the last record with SUM(SALARY) in the salary column.

In case you enable GROUP BY on DEPARTMENT_ID and the aggregate is applied

ON SALARY,
for each department_id it takes the last record of that group and returns that record's values for the other columns.

Sorter --> Aggregator gives better performance in case the input has many records.
In case you enable Sorted Input but don't actually provide sorted data, it throws the error below.

DATA CACHE ---> when applying SUM and other aggregates, it stores all the data values and then aggregates using the whole set of records.

INDEX CACHE ---> it builds an index based on the sorting/group-by keys.

ROUTER TRANSFORMATION

If you want to split one data pipeline into many data pipelines, you can go for the Router transformation.

It filters data with conditions and creates, by default, one group that receives the data not satisfied by any of the filter conditions mentioned.
Example: you have source data with multiple modes of payment; you can use a Router transformation to split the data by payment mode, such as UPI, CARD, NETBANKING.

You can split a single data pipeline using a Router (single to many --> Router),

and to merge multiple data pipelines into a single one you need a Union (many to single --> Union).

In case the value of a column is 'Grocery' and you put this in the filter, it may sometimes have matching issues; instead we can UPPER the value and check whether it matches.

RANK TRANSFORMATIONS

SALARY   RANK()   DENSE_RANK()   ROW_NUMBER()
7000     1        1              1
6000     2        2              2
4500     3        3              3
4000     4        4              4
4000     4        4              5
3200     6        5              6
3000     7        6              7

Reference how the rank works
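The table above corresponds to the SQL analytic functions below (Oracle syntax), which is a quick way to verify the difference between the three:

SELECT salary,
       RANK()       OVER (ORDER BY salary DESC) AS rnk,        -- gaps after ties (4, 4, 6)
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk,  -- no gaps (4, 4, 5)
       ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num     -- unique numbering
  FROM employees;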

The Rank transformation takes time to complete, as it needs to compare each record and rank it based on the record values.

Note: performance can be improved by using a Sorter before the Rank transformation.

Group By isn't mandatory for the Rank transformation.

Best example: you have 100 students in a college and you rank these students based on CGPA.

But in case you want to know which year each student belongs to, we group them by year and apply the rank to the students based on CGPA within each year.

NOTE: using Informatica's Rank transformation you can only do RANK(); you cannot perform DENSE_RANK().

For performing a dense rank you need to use an Expression transformation.

You need to select the column to rank by; Rank Order means lowest-to-highest values for Bottom and highest-to-lowest values for Top.
Number of rows --> in case there is no group by and you give 100, it passes 100 records.

In case you apply a group by on some column, for example department_id, and give Number of rows = 2, it passes only 2 rows per department_id; these may have the same rank or different ranks, e.g. 1,1 or 1,2.

You can parameterize the number of rows you want to rank: 100 rows, fewer, or more.

The example below uses number of rows = 2, group by department_id, and rank by salary.

You can apply a group by by mentioning the column name.

Transformation Scope:

the above part is about how the cache is stored and related settings.

It is available only for 3 transformations, i.e. Rank, Sorter and Aggregator.

LOOKUP TRANSFORMATION

Lookup is always left outer join

CONNECTED LOOKUP

It is similar to Informatica PC but with a few small changes.

A lookup looks up another table to fetch records, similar to a join, but with a lookup you can also go with non-equi joins.

The source is Mart_data, which has product_id and transaction details such as price, quantity, and so on.

To get the product details, like product_name and product_description from the Product_data table,

we perform a lookup on the product table with the condition

Product_data.Product_id = Mart_data.Product_id

Connected: this connected lookup is part of the pipeline and is connected to other transformations.

Lookup SQL Override: we can use this in case we need results based on a few extra conditions on the lookup table, such as only active records.

Lookup Source Filter: here you can mention the filter conditions to apply on the lookup table.
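For example, a lookup SQL override restricted to active records might look like the sketch below; the PRODUCTS columns and the active_flag are assumptions, and the override must not end with a semicolon:

SELECT product_id, product_name, product_description
  FROM products
 WHERE active_flag = 'Y'   -- same effect as a Lookup Source Filter of: active_flag = 'Y'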

Informatica creates the lookup cache the very first time, so that on each run it does not go to the lookup table for every comparison; instead it checks the lookup cache it created.

By default the lookup creates a static cache.

In the image below we can see a field-name issue. In PowerCenter, when you drag and drop columns and the transformation already has a column with that name, it adds 1, 2, ... numbering to the duplicate column names.

But in IICS we need to give the column names a prefix or suffix.

In case we give 'emp', for example, then all source fields get 'emp' added to the column name, that is, emp_first_name.
In the Return Fields image below you can select which columns are needed and remove unnecessary columns of the lookup table, to avoid building cache on them.

Mapping

SOURCE TABLE (LEFT TABLE) AND LOOKUP TABLE (RIGHT TABLE)

LEFT OUTER JOIN MEANS THE SOURCE LEFT OUTER JOINED WITH THE LOOKUP TABLE.

AN INNER JOIN IS ACHIEVED BY USING THE FILTER CONDITION NOT ISNULL(COLUMN_NAME_OF_LOOKUP_TABLE).

CONNECTED LOOKUP WITH 2 TABLES: DEPARTMENT TABLE AND LOCATION TABLE


UNCONNECTED LOOKUP - you can get only 1 return port, and it is also a left outer join.

:LKP - with this prefix Informatica understands that the call relates to an unconnected lookup transformation.

Create an output/return port only for the column whose data you need from the lookup table; for example, in the picture above we need department_name from the department lookup table, so mention department_name as the return port; the same follows for location_id.

Why do we need an unconnected lookup when it does the same job as a connected lookup?

The main advantages of using an unconnected lookup are:

Reusability - you create the unconnected lookup only once, but you can call it multiple times anywhere in that mapping.

Condition-based lookup - in case a table Product_data has product details such as name, description and so on,

and the source is Dmart_data, which has product_description and product_id with around 1000 records in total, out of which 100 records have product_description as a NULL value:

if we use a connected lookup in this case, it goes on to search for all 1000 records again,

but if we use an unconnected lookup then we can call the lookup conditionally,

with logic like IIF(ISNULL(prod_desc),:lkp.u_lookup(product_id),prod_desc)

An unconnected lookup returns only one port, so how do we get more than one port?

You need to mark this option to make a lookup transformation an unconnected transformation.

Incoming fields need to be added manually using the + symbol.


You need to mention the lookup SQL override; mention the alias names exactly or else it throws errors, and do not use ';' in an Informatica query override.

Mention the lookup condition.

Select the return port and give it a high precision so the data fits, since the returned value is a concatenated one.
IN THE EXPRESSION

Use expressions to derive the data.

SUBSTR(v_expop,-4) extracts the last 4 characters of the value, which is why -4 is used.

You need to create output ports for the columns derived from the concatenation; for example, department_name carried two columns' data, department_name and location_id, with '|' as the delimiter in between.
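A sketch of such a lookup SQL override that packs two columns into the single return port, using '|' as the delimiter; the alias names are assumptions and must match the lookup field names exactly, with no trailing semicolon:

SELECT department_name || '|' || location_id AS department_name,  -- concatenated return value
       department_id                         AS department_id     -- lookup condition column
  FROM departments

The expression downstream then splits this value back apart, e.g. with SUBSTR/INSTR as shown above.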

Lookup Policy Match


For example, the source has product_id 1201,

and the lookup table Products has 2 values / multiple values for 1201; in that case

we can go with multiple behaviours according to the selection:

return only the first value among the values,

or only the last value among the values,

or report an error and fail the session,

or use any value among the values present (default),

or

return all lookup values (in this case the lookup is an active transformation).

As 1 lookup value is expected for 1 source value,

if we pass 100 rows we get 100 back, but with Return All it might give more rows than the number passed:

2 lookup values may give 101 records as output.

IN CASE THE LOOKUP HAS 2 VALUES:

1201 --> BANKING

1201 --> PURCHASING

When we select First Row, the lookup orders the values in ascending order, so the order will be:

first row --> BANKING

second row / last row --> PURCHASING

Example of the All Rows values:

WHEN THE LOOKUP DATA IS A FLAT FILE AND YOU NEED TO RETURN MORE THAN ONE COLUMN'S DATA USING AN UNCONNECTED LOOKUP,

you need to use two separate unconnected lookup transformations, each deriving one column, because you cannot concatenate column data in the case of flat-file data.

TRANSACTION CONTROL TRANSFORMATION

It is a very rarely used transformation.

Whenever we want dynamic files in the target, we can go for the Transaction Control transformation.

It is an active and connected transformation.

SQL ---> TCL ---> Commit, Rollback

Infa --> TC_COMMIT_BEFORE

TC_COMMIT_AFTER

TC_ROLLBACK_BEFORE

TC_ROLLBACK_AFTER

TC_CONTINUE_TRANSACTION

Example: if we need to achieve the scenario below, we can use transaction control.

In the target we need to create a file for every 5 records of the source.

Prd Number   Seq No
100          1
102          2
103          3
104          4
105          5    <- whenever we encounter 5 we can go for Commit After, meaning it commits after this row
106          1    <- whenever we encounter 1 we can go for Commit Before, meaning it commits all the records before this one and creates a file for them
107          2    <- whenever it encounters a number other than 5 or 1, it does TC_CONTINUE_TRANSACTION
108          3
109          4
110          5    <- Commit After again
111          1
112          2
113          3
114          4
115          5

Rollback Before means all records before the current row (the 1) are rolled back.

Rollback After means everything after the current row (the 1) is rolled back.

Example:

Product_id (on sorting):

Unsorted     Sorted
100          100
101          100
102          100
103          101
104          101
100          101
101          102
102          102
103          102
100          103
101          103
102          103
103          104
104          104

It will create one file per product_id (100, 101, 102, 103, 104); we can go for Commit Before on a change in the column value: when the value changes, e.g. from 100 to 101, the 100 records go to one file.

This separates the files according to changes in the product_id column value.

We can also go for a different file per country.

Mapping:

Transaction control Transformation Properties

You can sort the data before using it in the Transaction Control transformation.

In the Transaction Control Condition --> 'If Field Value Changes' means that as soon as the value of the country_name column changes, for example India --> China, it executes the operation mentioned in the properties.

Sorting before the Transaction Control transformation:


NORMALIZER TRANSFORMATION

In case we are getting data from mainframe VSAM files, the data will be in denormalized form.

To read this data we can use a Normalizer after the source (VSAM), with a relational database as the target.

Whenever you need to transpose data, you use the Normalizer.

The Normalizer has only 2 data types, string and number, so you need to change any other data type into string/number and convert it back to the original data type afterwards using an Expression transformation.
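For comparison, the transpose a Normalizer performs corresponds to SQL UNPIVOT (Oracle syntax); a minimal sketch assuming a hypothetical quarterly sales table:

-- SALES(product_id, q1_amt, q2_amt, q3_amt, q4_amt): each input row becomes 4 rows,
-- the same effect as a Normalizer with occurrence = 4
SELECT product_id, quarter, amount
  FROM sales
UNPIVOT (amount FOR quarter IN (q1_amt AS 'Q1', q2_amt AS 'Q2',
                                q3_amt AS 'Q3', q4_amt AS 'Q4'));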
In case you need to keep incrementing, for example you loaded 16 rows yesterday and around 15 the next day and need to start from 17, you need to use a mapping task, as it holds the value in a sequence-generation scenario.

If we don't use a mapping task, it will simply start again from 1.

Mapping:

Normalizer for transposing the data.

Rank transformation for getting the top category data.

Expression to decode the data values.

Normalizer transformation:

Normalized Fields is where you define which fields to transpose; GC_ID --> generated column ID,

GK --> generated key.

The main part is that the number of occurrences of the column must be mentioned.

In the field mapping, map the fields accordingly.

DECODE means 'if condition then' kinds of statements only.

At Target

Target table data :


SEQUENCE GENERATOR

It is a connected and passive transformation.

Wherever we want a unique column value, like a surrogate key, we can use the Sequence Generator.

It can be used anywhere in the pipeline.

NEXTVAL --> used to generate the next value; you can pass this port to any table or transformation.

CURRVAL ---> stores the current value of the number.

CURRVAL > NEXTVAL

NEXTVAL --> needs to be passed to the output.

Resetting the value of a sequence is easy to handle with the help of the Sequence Generator transformation.

For example, you have generated 1 to 5 and you want the value reset to 1 again after 5; for this kind of scenario we can use the Sequence Generator.
Use Shared Sequences --> in case you have two mappings whose target is the same table, we can go for it. Both mappings share the sequence numbers: m1 & m2 will have the sequence s1.

Increment By means increase by which number.

End Value is a 19-digit number.

Initial Value --> start value of the sequence.

Cycle --> it shouldn't be enabled in case we are keeping the sequence generator column as a primary key.

Cycle means that once it reaches the end value it starts again from the cycle start value.

Cycle Start Value ---> if Cycle is enabled, the value the cycle should restart from.

Reset ---> it resets the values with each new session run of the mapping.

Number of Cached Values --> the default 0 means it won't store any sequence number values in Informatica memory.

But if we give a specific number such as 100, it keeps that many in Informatica memory and, once the input data is received, assigns those values to the data.

In case we have a large amount of data, say 1 million rows, assigning a sequence number to each row one at a time is time-consuming, so instead it generates a block of sequence numbers and then assigns them to the values.

In case we want to make the Sequence Generator transformation standalone, we can

select Disable Incoming Fields.
A mapping task needs to be created when using a sequence generator, as it holds the sequence values; otherwise the values just reset to the initial value on each run.

Without MCT (mapping task):

With MCT (mapping task):

Mapping:
TO FETCH ONLY 4TH RECORD

SHARED SEQUENCE

N number of mappings can use shared sequences.

In this concept, if mapping m1 and mapping m2 have the same target table t, both can share the sequence generator.

That means m1 will have 1 to 1000 reserved

and m2 will have 1001 onwards.

So even in case m1 has used only 1 to 150,

m2 will still start from 1001.

If in the first run you reach 1250,

it will have used 1 to 1250, having reserved another block of 1000 from 1001 to 2000.

The next mapping, m2, will then start from 2001.

JAVA TRANSFORMATION

How to use Java code in Informatica is the main aim of this transformation.

The Java transformation provides a simple, native programming interface to define

transformation functionality with the Java programming language.

For example, looping, or encryption and decryption, can be achieved easily with the Java transformation.

Active or Passive:

if active, the transformation can generate more than one output row for each input row;

if passive, the transformation generates one output row for each input row.

The default is Active.

SQL Transformation

In Informatica PowerCenter we had a separate Stored Procedure transformation,

but in IICS we don't have a separate stored procedure transformation.

Anywhere in the pipeline where we need to use or call SQL statements, we can go for the SQL transformation.

SQL queries can be used in 4 places:

1. Source transformation

2. Target: update statement, pre-SQL and post-SQL

3. Lookup: lookup override

4. SQL transformation

A sequence can be generated by using the Sequence Generator transformation, or by using the SQL transformation to call a database object.
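For example, the database object called from the SQL transformation could be an ordinary Oracle sequence; a sketch with an assumed sequence name:

-- Created once in the database
CREATE SEQUENCE seq_surrogate_key START WITH 1 INCREMENT BY 1;

-- Query placed in the SQL transformation to fetch the next value
SELECT seq_surrogate_key.NEXTVAL AS surrogate_key FROM dual;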

Mapping:

SQL transformation:

here you get different options such as SQL Query, Stored Procedure, Stored Function.

You get the option to enter the query manually or to load a saved query from a local .sql file.
In the image above we can find the query:

SELECT DEPARTMENT_NAME, LOCATION_ID
FROM DEPARTMENTS
WHERE DEPARTMENT_ID = ?DEPARTMENT_ID?

The ?...? means that at run time it takes the value and proceeds further.

?DEPARTMENT_ID? is called parameter binding.

In the output fields you need to mention the columns of the table you are fetching data from.

For example, in the above SELECT query we are fetching LOCATION_ID and DEPARTMENT_NAME, so mention those columns in the output fields; the SQL error field shows the error that occurred in the DB, if needed.
Pass-through fields determine whether all incoming fields are passed downstream or a few are excluded.

At the target you can check all the column names during field mapping: the output fields + pass-through fields of the SQL transformation.

Output: Oracle fetching department_name and location_id from the DEPARTMENTS table for the data available in the EMPLOYEES table; a similar thing to a lookup.
Dynamic SQL File Loading

PARTIAL DYNAMIC SQL

SOURCE:

Create a file table_names.txt with the header TABLE_NAME.

Mapping:

SQL TRANSFORMATION
Create the output columns for the column names that are present in the SELECT query.

You can create a table beforehand and load the data into it, or create the table at run time for storing the data.

Output at the database:
A static SQL query runs the same query statement for each input row of the SQL transformation, but the data of the query can be changed for each input row using parameter binding in the SQL editor.

The string or column of a static SQL query is enclosed in question marks (?).

-----------------------------------------------------------------------------------------------------------------------

A dynamic SQL query can execute a different query statement for each input row. The SQL query executed for each input row is changed using string variables in the query, which link to the input fields passed to the SQL transformation.

To configure a string variable in the query, identify an input field by name in the query and enclose the name in tilde characters (~). The query changes based on the value of the data in the field.
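A sketch of both binding styles, assuming input fields named DEPARTMENT_ID and TABLE_NAME coming from the source:

-- Parameter binding (static SQL): same statement, only the bound value changes per row
SELECT department_name, location_id
  FROM departments
 WHERE department_id = ?DEPARTMENT_ID?

-- String substitution (dynamic SQL): the table name itself is swapped in per input row
SELECT COUNT(*) AS row_count
  FROM ~TABLE_NAME~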

FULL DYNAMIC QUERY

The entire SQL query can be substituted with query statements from the source.

PARTIAL DYNAMIC QUERY

A portion of the SQL query (like table names) can be substituted with input fields from the source.

SQL Transformation: Query Mode in Informatica Cloud (IICS) - ThinkETL

For reference above link

FULL DYNAMIC SQL QUERY:

Source:

Mapping:

Source transformation with semicolon (;) as the delimiter.

SQL transformation: use the query field enclosed in tilde characters.

We have the SELECT query in the input file, so we need to give that column as the output fields.

Configure the target as usual, with runtime table creation or loading into an already available table.

Output at the database level:


TARGET TRANSFORMATION

SCD TYPE 1 IMPLEMENTATION

We don't have the Update Strategy transformation in IICS; we have the option in the Target transformation instead:

Insert, Update, Delete, Upsert (insert-and-update logic), Data Driven.

In Data Driven ---> you get options such as dd_update, dd_insert, dd_delete, dd_reject, which you

mention under the data-driven condition.
In the picture above, if you go for Insert in the target you will see the rows as inserted rows.

When you select Update at the target, it asks on the basis of which column it should update the table.

When you open the session log it shows updated rows; otherwise it's just a normal insert into the table.
IF UPDATE -->

IT WILL UPDATE THE EXISTING RECORD ON ANY VALUE CHANGE IN THE FIELDS.

IF DELETE -->

RECORDS MATCHING IN BOTH SOURCE AND TARGET WILL BE DELETED, AND

UNMATCHED RECORDS WILL BE KEPT IN THE TARGET.

IF UPSERT -- INSERT ELSE UPDATE:

IF 100 IS PRESENT IN THE TARGET IT WILL UPDATE 100;

IF 101 IS NOT PRESENT IN THE TARGET IT WILL INSERT 101.

UPDATE AND UPSERT OPERATIONS CANNOT BE DONE WITH A FLAT FILE AS THE TARGET.

Mapping:

First you take the source data and then look up the target data on the primary key or a particular field.

For example --> the employee table.

Lookup condition --> employee.emp_id = src_employee.emp_id

(a lookup between target and source).

We will generate flags for insert and update.

The ins_flg condition is --> IIF(ISNULL(EMPLOYEE_ID),1,0), meaning that if employee_id is null in the target

then we will insert the data.

The upd_flg condition is -->

IIF(EMPLOYEE_ID=src_EMPLOYEE_ID AND (FIRST_NAME!=src_FIRST_NAME OR
LAST_NAME!=src_LAST_NAME OR EMAIL!=src_EMAIL OR
PHONE_NUMBER!=src_PHONE_NUMBER OR HIRE_DATE!=src_HIRE_DATE OR
JOB_ID!=src_JOB_ID OR SALARY!=src_SALARY OR
COMMISSION_PCT!=src_COMMISSION_PCT OR MANAGER_ID!=src_MANAGER_ID OR
DEPARTMENT_ID!=src_DEPARTMENT_ID),1,0)

(note the parentheses around the OR conditions so that the AND applies to the whole group).

Meaning it checks whether the employee_id is already present; if present, it checks whether all other columns are the same as the source, using the lookup column values.

Then in the Router we decide to make different groups for insert and update:

1. INSERT --> INS_FLG

2. UPDATE --> UPD_FLG

In the advanced filter condition you need to mention the above conditions.

If we mention just the flag column name, it is considered true, and only then does the row pass to that group.

In the update target we need to select Data Driven and dd_update.

The other target is a normal insert, with the truncate-target option disabled.
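For reference, the insert/update logic this SCD Type 1 mapping implements corresponds roughly to a database-side MERGE; a minimal sketch in Oracle syntax, assuming the EMPLOYEES target and a hypothetical SRC_EMPLOYEES staging table with only a few columns shown:

MERGE INTO employees t
USING src_employees s
   ON (t.employee_id = s.employee_id)
WHEN MATCHED THEN
  UPDATE SET t.first_name    = s.first_name,
             t.salary        = s.salary,
             t.department_id = s.department_id   -- SCD Type 1: overwrite, no history
WHEN NOT MATCHED THEN
  INSERT (employee_id, first_name, salary, department_id)
  VALUES (s.employee_id, s.first_name, s.salary, s.department_id);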

UNION TRANSFORMATION

It is active because it changes the row IDs of the rows.

In case we have multiple files with the same structure, we can go for indirect file loading using the file-list concept,

but for tables we cannot use a file list; instead we have the Union concept for tables.

UNION --> multiple sources to a single output.

The multiple sources must have the same structure.

We can also go for heterogeneous sources, like one file and one table, but both with the same structure.

The data types of the matching fields should also match.

If we do a union we get UNION ALL behaviour at the output.

To change UNION ALL to UNION, use a Sorter after the Union transformation.

UNION WILL NOT ALLOW DUPLICATES.

UNION ALL WILL ALLOW DUPLICATES.

Mostly we take some logic from the source, use a Router to convert the single input into separate group outputs, apply various transformation logic to those outputs, and at the end combine all these separated group outputs into one using a Union transformation.

Mapping:
In case you want to add another new input group, click on the + symbol.

For each input group you need to do the field mapping.

If you map the Union output directly to the target you will get duplicate records (that is UNION ALL).

In case you want to remove duplicate records, add a Sorter transformation, select any column for sorting, and in Advanced select the Distinct option.

Map the fields to the target and validate.

WE CAN USE THE QUERIES BELOW TO SEE THE CONSTRAINTS OF A TABLE:

select * from all_constraints where owner='CORE' and lower(table_name)='t_employees';

select * from all_cons_columns where owner='CORE' and lower(table_name)='t_employees';

In the above statements, owner is the schema name, and we apply lower() to table_name to avoid case-sensitivity issues.

SCD TYPES IMPLEMENTATION

SLOWLY CHANGING DIMENSIONS

TYPE 1 - NO HISTORY WILL BE MAINTAINED (SIMPLE UPDATE)

TYPE 2 - HISTORY WILL BE MAINTAINED (ROW-LEVEL HISTORY):

a. METHOD 1 - FLAG

b. METHOD 2 - VERSION

c. METHOD 3 - DATE

TYPE 3 - RECENT HISTORY WILL BE MAINTAINED (COLUMN-LEVEL HISTORY)

Scd TYPE 2
FLAG METHOD

Mapping

1 target for normal insertion of data.

Another target, update_insert, is for inserting the updated records.

Another target, update_update, is for setting the flag to 0 on the old record.

The source is the s_customer table and the target is t_customer_flag.

We need to look up the target table on the basis of customer_id,

and we need to look up only the target data that has flag 1, meaning we look up only active records.
Create an Expression transformation for updating and inserting the data into the table.

Create the insert and update flag logic, plus insert_active_flag and update_active_flag for the flag column of the target.

Create router groups for insert and update:

Insert goes to the normal insertion of new data into target_insert.

Insertion of the updated record goes into the target update_insert.

Updating the old record to flag 0 goes into the target update_update.

The sequence is passed to target_insert and target_update_insert so that a new sequence value is created

each time data is inserted into the target.

The most important thing:

keep the operation of target_update_update as Data Driven --> DD_UPDATE ---> update on the

basis of the customer_key column.
TASKS

1. Mapping Task - the session of PC.

Whatever mapping we create, we create a mapping task for it.

2.Replication Task (It is similar to materialized views)

Use the replication task to replicate data to a target. You might replicate data to back it up or to perform offline reporting.

We can schedule a replication task.

You can replicate data in Salesforce objects or database tables to a database or to flat files. You can configure a task to replicate all rows of a source object each time the task runs, or to replicate only the rows that changed since the last time the task was run. You can use a replication task to reset the target tables and to create target tables.

Why is the load type disabled when Oracle / SQL Server / MySQL is the source in DRS?

DRS always does a full load when a database is the source; the user cannot change the load type, as designed in the task. To do an incremental load, columns like created date and modified date are necessary, and these columns may not be available on all tables in the database. Hence, by default, the task does only a full load.

What is the workaround to do an incremental load if DRS does not support it?

The workaround is to use a data synchronization task and use $LastRunTime or any other appropriate data filter that can mimic the incremental load.

For example, you have data on a US server and you are reporting from an EU server, so you need to replicate the data from the US server to the EU server; that is offline reporting.

If we try to fetch data from another server location we may face latency issues; that's why we opt for the replication task, to create a replica of the data.

Another example: you have a target table you need to look up, but the lookup is taking too long, so we create a local replica of that table and perform the lookup on it.

If the target is not available, it creates it at runtime and loads the data; if it is available, it loads the data incrementally.
Under load type you get the option of deleting records, which is enabled only if you have audit columns in your table (aud_cols: create_date, update_date, update_user).
Mostly used for backup tables in the replication case.

Offline reporting means it is similar to materialized views.

3. Synchronization Task

The synchronization task is used to synchronize the data between a source and a target.

We must have a target for a synchronization task.

It supports flat-file and RDBMS connections.

Use the synchronization task to synchronize data between a source and a target.
For example, you can read sales leads from your sales database and write them into Salesforce. You can also use expressions to transform the data according to your business logic, or use data filters to filter the data.

You can use the following source and target types in synchronization tasks:

Database

Flat file

Salesforce

In a synchronization task you can load one table at a time; you can combine multiple tables into one using a SQL join query.

The synchronization task has insert, update, upsert and delete options.

1st load: initial load, all data is inserted.

2nd load: incremental load, but here, unlike SCD, it just updates all rows with the actual updated row from the source.
You can add a mapplet and edit data types in a synchronization task.
You have options to create an expression or use functions on any column when you click on the specific column name.
4. PowerCenter Task - we can run PowerCenter mappings, sessions and workflows in IICS, but the workflow cannot have more than 1 session, 1 mapping and 1 workflow; if it does, it will fail.

In case we have changes in connections, we can make them in this PowerCenter task.

You need to mention in the IICS PowerCenter task all the parameters defined for PowerCenter.

We cannot make any major changes, such as to transformation logic.

5. Data Transfer Task

This is for loading data from source to target without creating a mapping, by mentioning details such as the source (pre-SQL, SQL override, post-SQL), a 2nd source (which acts as a join), target details, field mapping, and runtime options (scheduling).
Fields get auto-mapped.
6. Dynamic Mapping Task

Run multiple data pipeline jobs from one mapping.

The mapping is parameterized; if we have multiple sources, they should belong to only one database.

Create a mapping that has only parameter values, with no direct values for the source and target tables.

Values are added to that mapping using the dynamic mapping task, so it acts as one mapping but multiple data pipelines.

Parameterized mapping: for this, field mapping won't be enabled, so you need to make sure that both target and source have matching columns for proper data loading.
Using the New Parameter option we can create parameters.

For this task you first need to create one dummy mapping that has all values parameterized.
In the Jobs section we can mention as many sources and targets as we want to process.

We can assign them groups.

Jobs run according to their group number.

In case two jobs have the same group, they run at the same time.

Group 2 jobs run after completion of the group 1 jobs.


Reusable Transformation or Mapplet:

In IICS we don't have a separate reusable transformation like in PowerCenter;

we do have mapplets.
Using a mapplet in a mapping:
TASKFLOWS(WORKFLOW IN PC)

Different Taskflows

1.Linear Taskflow --> start -->task1-->task2-->task3 linear way of execution

In this cannot add task steps in it.If we want to have different tasks in sequential flow of task
execute

It has predefined email no customize email body notification

You can change the order by changing the numbering of respective tasks
2.Taskflow

In this we can all the componets such as taskflows,tasksteps for integration and work on
it.

3.Parallel Tasks

Running the different task at same time parallely and main thing u shouldnt use same target
table while using parallel task

In this adding plus symbol will create another linking path for another parallel task

You need to enter the details of an existing mapping task in the Data Task step.


In error handling you get Stop on Error and Stop on Warning options.

4.Parallel Tasks with decision

A normal parallel task with the addition of a decision: if the mentioned task fails, succeeds, or
meets any other condition, the flow proceeds with the action defined in the Decision step.
For example, a decision dependency is placed on the mapping task: if its status equals 1 (success),
the success notification email task is run,

and on the other path the failure notification email task is run.

Under the Notification task step,


you have options such as To, and so on; the HTML option formats the mail properly.

You can also customize which details of the current run you need in the mail.

On successful completion you can see the run under My Jobs.


5.Sequential Tasks

In this we can add any task steps to the taskflow. It is used if we want different tasks to execute in a
sequential flow.

6.Sequential Tasks with decision

7.Single Task
TASK STEP

1.Assignment

Assigns a value; based on that value, further actions can be performed.

For assigning a value you need a variable in which to store it.

Use a variable of text (string) type in the Assignment step.

Assign an expression to that string variable.

This means that if the status of both tasks is success, i.e. 1, the variable will have the value Pass, or else Fail.
In the Decision step, mention the branch values based on that string variable.

2.Data Task

3.Notification - Email

4.File Watch – Event Task

You need to create the file listener component first and reference that component under the taskflow's
File Watch step.
A file listener listens to files on a defined location.

A file event occurs when new files arrive in the monitored folder, or when files in the monitored folder
are updated or deleted.

In the Schedule you define at which interval and between which times it should check for the file.
In this scenario, as soon as we start running the taskflow it searches for the file with the
mentioned directory and file name; if it finds the file, it deletes it.

Realtime usage in Project

In case source teams upload the files, they are asked to upload an indicator file once all files
are uploaded.

The file listener listens for that file, deletes it, and the further process starts running.

For example, we have one workflow that should start every day at 7.30 pm, but it has a dependency on files and data in
specific tables; only then should that workflow start running.

So we can use a file listener / file watch step for the above scenario before proceeding further.

5.Sub task - Task inside task

Decision

Wait

Parallel

Jump

Throw --> control task in pc

Command --> batch command

Using this we can run operating system commands via the Command Task step.


In the taskflow, the command task gets executed first and then the data task that has the mapping task
under it.

By the time the data task gets executed, the command task will have successfully executed the command and
created the master file, i.e. config.txt.

There is a Script option where you need to enter the details of the command file.

The most important setting is the runtime environment value.

For example, we have used a Windows batch command here, but in a real-time project these will be Unix-based
commands.

For Windows --> this file should have the extension .bat (batch file).

In this command, it lists the files under the mentioned folder and its subfolders, and creates the
mentioned folder name at the mentioned path.

Real-time project scenario


You can ask your source team to place all files by a specific time, say 7 am, then run this job
after 7 am and do a successful load of the data to downstream.

Automatic touch file generation

One such command creates a file that contains "ECHO is on" and therefore has a small size in KB.

The other command creates a 0 KB file with no content inside that file.

The touch file or indicator file concept: once taskflow 1 completes successfully, it generates one
touch file, which acts as an indicator file for starting the next taskflow or the next process run.

While the source team is transferring the data or files over SFTP, we should not interrupt that process.

The source team can place one touch (indicator) file once the whole transfer is completed; the
next team reads that indicator file, deletes or modifies it using the File Watch step, and
proceeds further.

Scenario: if the task runs for more than 5 minutes, it should send a mail and then fail,

so that the support team gets the mail and can take appropriate action on the taskflow.

Performance Tuning in IICS

1.Download session log and analyse the log

2.In the log you will get specific things, such as the statistics for the three thread types:


Reader | Transformation (IICS) | Writer

The reader thread gives info about the source side.

The transformation thread gives info about the various transformations in the mapping.

The writer thread gives info about the target side.

Each of these threads has a load summary; the load summary is generated only for larger volumes (for
example 100k+ rows), and it contains the busy % and idle time.

Such as:

Busy %: if the busy percentage is at or near 100, that thread is utilizing all its resources for that respective
operation (and may even cause timeouts).

Idle Time: 0%

If busy is 100% then idle is 0%.

If you see a thread busy more than 50%, that thread is the bottleneck.

Database side performance tuning:

In case the source or the writer is an RDBMS, we need to run an explain plan on the query and improve
performance by checking whether it is going for a full table scan or another access path.

If the writer (target) has indexes, we should drop the indexes, load the data, and then in the post SQL
enable/create the indexes again (a SQL sketch follows this list).

If we have transactional data, we can go for partitioning.

SQL hints

Collect stats
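
A minimal SQL sketch of these database-side steps, assuming an Oracle database and hypothetical object names (src_sales, tgt_sales, idx_sales_cust); the exact statements depend on your database:

-- Check the access path of the source query (full table scan vs index)
EXPLAIN PLAN FOR
SELECT sale_id, customer_id, amount
FROM src_sales
WHERE sale_date >= DATE '2024-01-01';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Target pre SQL: drop the index before the bulk load
DROP INDEX idx_sales_cust;

-- (Informatica loads tgt_sales here)

-- Target post SQL: re-create the index and collect stats
CREATE INDEX idx_sales_cust ON tgt_sales (customer_id);
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TGT_SALES');
END;
/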

Informatica level performance tuning:

If you have a Normalizer, you cannot do PDO or partitioning.

1.Pushdown Optimization (PDO) --> converting the transformations into SQL queries and pushing them down
to the source/target/both; we cannot do PDO when we are dealing with flat files.
Note: for running PDO you need to create a mapping task.

a.To Source: traverses from the source side and tries to convert the transformations into SQL queries,
pushing them down to the source side as SELECT statements.

It traverses as far as possible; whatever cannot be pushed down is processed at the Informatica level.

b.To Target: traverses from the target side and tries to convert the transformation logic into SQL
queries, pushing them down to the target as INSERT or UPDATE statements.

c.Full: it can push down from both the target and the source side.

Create a mapping and mapping task for pdo

In the mapping, you can see three dots in the top right corner > Pushdown Optimization > Preview
Pushdown.

You will get a Definition option where you need to select the respective mapping.

Next, under Pushdown Options, you can select which type of pushdown optimization you want to do.

With the above method you get a preview of the query generated by the PDO method, but in case it isn't
working you can go for the method below:

--> while creating the mapping task, after entering definition details such as the mapping name and
runtime environment,

--> the next step is Schedule; when you scroll down there you get the Pushdown Optimization option, where
you can select the type of PDO.

For example, you have a mapping without PDO applied: running it reads around 105 records
and the target is loaded with 78 records, because a Filter transformation reduces the data.

But when using the mapping with source PDO, it reads only around 78 records and loads those
78 records into the target. This is how PDO can help performance. In the session log you will be able to
see the SQL query the transformations were converted into.
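
As a hedged illustration only (the table, columns, and filter are made up, and the SQL that IICS actually generates will look different), source pushdown of a simple Filter transformation behaves roughly like this:

-- Without PDO: Informatica reads all rows and filters them in memory
SELECT customer_id, customer_name, country FROM customers;          -- 105 rows read, 78 loaded

-- With source PDO: the Filter transformation becomes a WHERE clause in the database
SELECT customer_id, customer_name, country
FROM customers
WHERE country = 'US';                                               -- 78 rows read and loaded

-- With target or full PDO, the whole flow can collapse into a single statement such as
INSERT INTO tgt_customers (customer_id, customer_name, country)
SELECT customer_id, customer_name, country FROM customers WHERE country = 'US';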

In IICS you use and pay for the PDO service.

If you want to use PDO on-premises, you need a PDO license.


2.Partitioning

Partitioning is available only at the source level, not at any transformation level.

a.Key range partitioning – Relational Source

Create mapping

On the source side you will get an option called Partitions.

There you need to select the partition type Key Range.

Mention the key ranges below it using the + (add) symbol.

Partition (on employee_id)    Start Range    End Range


#1                            100            140
#2                            140            180
#3                            180            500
This works only if the incoming records fall within these ranges.

In case the range of the data changes, i.e. increases beyond the defined ranges, it won't work.

If an employee_id does not fall in the above-mentioned ranges, that record won't be considered.

So when we apply partitioning, the partitions are processed in parallel.
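
Conceptually, each key-range partition reads its own slice of the source in parallel, roughly like the queries below (employees is a hypothetical source table; in Informatica key-range partitioning the start of a range is typically inclusive and the end exclusive, so the ranges above do not overlap):

-- Partition #1
SELECT * FROM employees WHERE employee_id >= 100 AND employee_id < 140;
-- Partition #2
SELECT * FROM employees WHERE employee_id >= 140 AND employee_id < 180;
-- Partition #3
SELECT * FROM employees WHERE employee_id >= 180 AND employee_id < 500;
-- An employee_id of 600 falls outside every range, so that record is never read.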

b.Fixed Partitioning – FF Source

Max: 64 partitions.

On the mapping's source side you will get the Partitions option.

You need to enter how many partitions (how much parallel processing) you want.

It is more useful when we have a large number of records.


METADATA:

The Operational Insights service gives information about runs through IICS.

In case you want table-level metadata, you need to get it through a REST API call.

To get dependencies, go to the three dots on the mapping; you get the option Show
Dependencies.

Lookup Cache

Whenever an Informatica mapping is running, it will check for

1.Parameters & variables

2.Lookup caches

3.Reading data

Cache means that whichever table we are using for the lookup, that table's data is taken into local memory
(RAM) for speedy retrieval; data is read from the cache instead of going to the lookup table for every row.

By default it will create a static lookup cache.

The cache vanishes once the session or task completes, whether it succeeds or fails.

1.STATIC LOOKUP CACHE

Once the cache is created, it does not get changed during the run time.

Even if the lookup table's values change during runtime, the static lookup cache will not be
changed.

The static cache does not change while the Integration Service processes the lookup, and it vanishes once the
session fails or succeeds.

2.DYNAMIC LOOKUP CACHE

We cannot enable a dynamic cache on an unconnected lookup.

SCD Type 2 using dates with an MD5 checksum

Generate MD5 checksum values along with eff_date and end_date values.


Dynamic lookup with a condition on customer_id.

Remove all other fields except customer_key, customer_id, and md5_checksum so that the lookup
cache does not get built to a large size.

customer_key can be ignored for comparison; that's why Ignore in Comparison is selected for it.

The lookup advanced properties must be checked, and the lookup source filter needs to be kept as
end_date = 12/31/9999 to get only the active records into the cache for comparison.
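
In effect the lookup source filter restricts the dynamic cache to current (open) records only; assuming a hypothetical dimension table dim_customer with an end_date column, the cache build behaves roughly like:

SELECT customer_key, customer_id, md5_checksum
FROM dim_customer
WHERE end_date = TO_DATE('12/31/9999', 'MM/DD/YYYY');   -- only the active versions enter the cache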
Routing data based on the NewLookupRow port values:

Normal insertion of data, that is, new records.


Update-as-insert, which means the updated version of an existing record is inserted into the table as a new row.

Update keyed on the surrogate key value (customer_key),


updating based on customer_key and changing the end_date for the existing record.

Use a dynamic lookup cache to keep the lookup cache synchronized with the target.

When you enable lookup caching, a mapping task builds the lookup cache when it processes the first
lookup request. The cache can be static or dynamic; if the cache is dynamic, the task updates the
cache based on the actions in the task, so if the task uses the lookup multiple times, downstream
transformations can use the updated data.

You can use a dynamic cache with most sources, but you cannot use a dynamic cache with flat
file or Salesforce lookups.

In case we are creating a dynamic lookup, we will have one extra port called
NewLookupRow in the Lookup transformation.
The NewLookupRow port will have values such as:

0 — no change: the source and the lookup cache have the same data

1 — new record: insert operation

2 — existing record with changes: update operation

Based on NewLookupRow we decide the operation on the target.

If it is a dynamic lookup cache, it first inserts/updates the data in the lookup cache itself.

The source has emp_id --> 1001, 1002, 1003, 1004, 1002, 1005, 1003

So NewLookupRow for each emp_id follows as:

1001-1

1002-1

1003-1

1004-1

1002-0

1005-1

1003-0

Based on the NewLookupRow port value it will decide whether to update or insert the data.

NewLookupRow can be used to implement the SCD types.
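
As a rough analogy only, and not what Informatica actually executes, the insert-or-update behaviour that the dynamic lookup plus NewLookupRow routing achieves is comparable to a SQL MERGE keyed on emp_id (table names hypothetical); the extra value the dynamic cache adds is the NewLookupRow = 0 case, where unchanged rows are skipped instead of being re-updated:

MERGE INTO tgt_emp t
USING src_emp s
ON (t.emp_id = s.emp_id)
WHEN MATCHED THEN
  UPDATE SET t.emp_name = s.emp_name,            -- roughly NewLookupRow = 2 (existing row changed)
             t.salary   = s.salary
WHEN NOT MATCHED THEN
  INSERT (emp_id, emp_name, salary)
  VALUES (s.emp_id, s.emp_name, s.salary);       -- roughly NewLookupRow = 1 (new row)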

3.PERSISTENT CACHE

If you want to store the cache permanently (across runs), you can use a persistent cache.

4.SHARED / NAMED CACHE

If you want the cache to be used by many mappings, you can use this cache.
Normally the cache vanishes once the mapping task executes successfully or fails.

Suppose you are using persistent cache enabled with a named shared cache:

if you have multiple mappings under one taskflow, out of which a few mappings use the same cache,

then it might take 5 minutes per mapping task just to build the lookup cache, so for 4 mappings
around 20 minutes.

Instead, we can create a named cache in mapping 1 and have it reused by the other mappings to avoid
that time consumption.

We can enable re-cache, which means that in case the data changes in between, it will rebuild the lookup
cache to avoid loading stale data again.

The very first time you run, enable the re-cache, and also whenever the lookup data is being updated.

We can reduce lookup time in the following ways (a SQL sketch follows this list):

1.When the source has only 1,000 records and the lookup table has 100k, we can look up the table
directly (uncached) instead of creating a cache; and when the lookup is cached, we can remove the
ORDER BY clause so that less time is spent building the lookup cache.

2.Pass a sorted input to the lookup cache so that it doesn't sort on its own, which takes a lot of time.
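
One commonly used way to implement point 1 is a lookup SQL override that sorts only on the condition column and comments out the ORDER BY that Informatica appends. This is the classic PowerCenter-style trick, so verify it against your IICS version; lkp_customer and its columns are hypothetical:

-- Informatica normally appends ORDER BY <all lookup ports> to the lookup query.
-- Ending the override with our own ORDER BY on the condition column followed by "--"
-- turns the appended clause into a comment, so the database sorts on one column only.
SELECT customer_key, customer_id, md5_checksum
FROM lkp_customer
ORDER BY customer_id --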

Parameterization and Variables

Passing dynamic values to a mapping or mapping task (MCT).


Parameter --> Value will not change during the run time

Variable --> Value may change during the run time

# Comments (lines starting with # in a parameter file are comments)

$$a=10

$$ --> user-defined parameter

$ --> system-defined (for example, built-in $PM variables)

Connections, values, and the source query --> can all be parameterized

IICS:

IN --> value passed into the mapping

IN-OUT --> value passed from one mapping/run to another

For example, parameterizing the filter condition at the source:

a normal 1-to-1 load;
you need to create the parameter first,

then mention the parameter in the source filter.


You need to create a mapping task for that respective mapping.

Under the mapping task > parameter file location:

in case you don't mention the path of the file,

it will by default search at the location /apps/Data_Integration_server/data/user_parameters.

You need to mention the file that has the parameterized values.
In the session log you can view how the parameter value is substituted and fired as a SQL query to the database.
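
For example (all names hypothetical), if the parameter file contains the line $$FILTER_COND=department_id = 10 and the source filter is set to $$FILTER_COND, the session log would show a resolved query along these lines:

-- Query fired to the database after parameter substitution
SELECT employee_id, first_name, last_name, department_id
FROM employees
WHERE department_id = 10;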
You can create parameterization for connections also.

While creating the parameter, you need to select the type as Connection.

Variables

The value can change with every run; between today and tomorrow there might be a difference in values.
Project Related

Methodology-Waterfall,Agile

1.Requirement gathering - system requirements or user stories as Jira items

Effort estimation = ETA (how many resources are needed and how long it will take)

2.Design - high-level design document (architecture-level source-to-target data flow, schedulers),
low-level design document (table-level flow and precisions)

3.Development (STTM document, development summary, unit test results, run
scripts (create table, alter table), rollback scripts)

4.QA - Testing --> quality assurance – testing sign-off

5.UAT - User Acceptance Testing

6.Cert - Certification

7.CAB - Change Advisory Board - approval

8.Go live

9.Production support - on-call

L1, L2, L3 support

After go-live, the developer needs to provide KB (knowledge base) documentation or a knowledge
transfer to the support team.

Waterfall vs Agile

Waterfall – All steps will be followed step by step (sequential way)

Audit columns --> create_date, update_date, create_username, update_username

Filesystem – SFTP-Winscp

We won't always be hitting the source for data; instead, we load the data once into an
ODS (Operational Data Store), a central database that provides a snapshot of the latest data
from multiple transactional systems for operational reporting.

EDW - Enterprise Data Warehouse


EDCP - Enterprise Data Cloud Platform

What data is needed in the target and how the transformations need to be applied to the data:

usually the developer, or else the business analyst, develops the source-to-target mapping
sheet (STTM file).

It contains the source column names and precisions, the transformation logic, and the


target column names and precisions.

Some columns may have a direct mapping and some may have business rules and data cleansing
(LTRIM and RTRIM).

Migration in realtime

New development in IICS

PowerCenter task, but it has the limitation that we cannot modify the mappings; it should be a direct
mapping and the workflow should have only one session

Migration Factory

Internal Utilities

The main data source is SFTP (the client or vendor has access to one of the environment SFTP locations and
places the files there).

Data masking is done for PII (personally identifiable information), with encryption and decryption when
accessing that data.

In the QA stage, data validation happens: all valid data is forwarded further, and rejected
records/data are passed back to the source teams with a report.


The semantic layer is the last layer; it has views or materialized views, and the materialized views get refreshed.

Batch_date --> will pull the data for that date.


Different Types of loads

Initial load --> extracting the data from the source for the first time

Historical load -->

Delta extract --> you have extracted the data the first time, but the data changes daily, so only the
changed (delta) data is extracted on each run (see the sketch after this list)

Stage load --> truncate and load

Incremental load --> the data gets inserted and updated; this will never have truncate and load (T&L)
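
A hedged sketch of how an initial load and a delta extract can differ, assuming a hypothetical source table src_orders with a last_update_ts column and an IN-OUT parameter $$LAST_EXTRACT_DATE that holds the previous run's high-water mark:

-- Initial / historical load: pull everything once
SELECT * FROM src_orders;

-- Daily delta extract: only rows changed since the last successful run
SELECT *
FROM src_orders
WHERE last_update_ts > TO_DATE('$$LAST_EXTRACT_DATE', 'YYYY-MM-DD HH24:MI:SS');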

Agile:

Business Analyst(BA),System Owner,Product Owner or Business Owner: requirements

EPIC:

Jira-User Story-US230000128

Sprint Basis

1 sprint --> 10 business days (max up to 30 days)

For example: Wed Thu Fri + Mon Tue Wed Thu Fri + Mon Tue = 10 business days

1 day = 1 story point = 8 hrs --> 6 hours (development) + 2 hrs (meetings)

1 sprint – a total of 10 story points for each member of the team

For example, if you started on Wednesday, 10th May, then around 20th May you need to do a demo or review
of what was done in that sprint timeline.

In an Agile scrum team, max 11 members:

1 product owner

1 scrum master

1 business analyst
Max 8 or 9 developers

CSM - customer success manager – between the scrum master and the stakeholders

Sprint standup meeting --> daily at 11 am / 3 pm / 9 pm -- discussion on the project progress

Jira --> all requirements and progress

Sprint planning (capacity + requirements)

Meeting:

Sprint planning

Daily standup meeting

Sprint demo

Sprint retrospective

If you are not able to complete any task, it is carried forward to the next sprint.

While working on the current sprint's work, we may have meetings on alternate Mondays (or any other day)
about the next sprint's planning.

If you are working on production support and you are getting the same issue multiple times, you will
create a problem ticket.

PuTTY is used for connecting to the Unix server.

WinSCP is used for file transfer from one server to another server -> in the prod environment we won't have access
to move the files.

In case you have to load a large volume of data daily, you can follow intra-day loading instead of one
load of the whole day at a time.
Different Schedulers

IICS Scheduler-Native

UC4-Automic

Control-m

Apache Airflow

Autosys

UC4 Automic Scheduler

There are various UC4 objects available to fulfil your scheduling requirements.

1.JOBS - Jobs are the basic building blocks in UC4. For each program that needs to run (for example
an FTP transfer or a database load), a job must be created. A job contains all the information required to execute
the program or script on the server and handle the output. When a job is created, it specifies the
program location and the input and output parameters. Jobs are run both individually and as components
of UC4 process flows.

Furthermore, a job can be a component of any number of process flows. If the job definition is changed,
the change is applied to every process flow that includes it.
2.Job Plan ---> Jobs are combined to create process flows. Process flows are equivalent to job
streams and run any number of jobs. Process flows include scheduling and exception-handling
information. When jobs and process flows are added to a process flow, these objects are referred
to as process flow components.

3.Events --> Jobs/job plans can be triggered based on time or on the existence of a file. There are two
types of event objects:

3.a File event --> this is a file-watcher object which senses for a file and, if true, triggers the actions.

3.b Time event --> this is used to trigger a job plan multiple times a day.

4.Schedule --> the schedule is the parent of all objects. The objects which are to be scheduled need to
be placed in the schedule object, where the frequency of the job/job plan can be added. The schedule loads
every midnight with the objects to be triggered. The schedule runs 24 hrs a day and gets
auto-reloaded at 00:00 midnight for the next day's execution.

A job plan contains mapping tasks, file transfer tasks (remote server to local server), archival jobs, and file
watcher jobs.

In a JOB, under Variables & Prompts > Variables you will have the job name, job type, and parameter file location.

You do the creation in the development environment; after that we need to deploy all the dependencies
(mappings, mapping tasks, and so on), and after that we need to deploy the UC4 job components.

Advantages of UC4

Improves current data processes, such as automating processes that are currently manual

Improves job handling across disparate systems, especially those that have specific output/input
dependencies, sequence, etc.

Provides the ability to check and validate jobs and notify on failure

Extends services by providing the ability to securely move data from one system to another, something
that cannot easily be accomplished today

Provides reporting capability

Provides monitoring of jobs


Provides the ability to schedule multiple jobs with flexibility and complexity

Allows for more informed and calculated decisions on maintenance schedules and impact
analysis for scheduled jobs

Mapping task (used for capturing values: cache, variables)

Schema change Handling

In the mapping task you have the option of schema change handling.

By default, if you make changes to the schema, Data Integration does not pick up the changes
automatically. If you want Data Integration to refresh the data object schema every time the mapping
task runs, you can enable dynamic schema handling.

A schema change includes one or more of the following changes to the data object:

Fields are added


Fields are deleted

Fields are renamed

Field data types, precision, or scale are updated

Commit Point

A commit interval is the interval at which the Integration Service commits data to the target
during a session.

The commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks.

The commit interval is the number of rows you want to use as a basis for the commit point.

The commit interval type is the type of rows that you want to use as the basis for the commit point.

1.Target based

The Integration Service commits data based on the number of target rows and the key constraints on the
target table.

The commit point also depends on the buffer block size, the commit interval, and the Integration
Service configuration for writer timeout.

2.Source based commit

The integration service commits the data based on the number of source rows.

The commit point is the commit interval you configure in the session properties

3.User defined commit

The Integration Service commits data based on transactions defined in the mapping
properties. You can also configure some commit and rollback options in the session properties.

Source-based and user-defined commit sessions have partitioning restrictions. If you configure a
session with multiple partitions to use source-based or user-defined commit, you can choose
pass-through partitioning at certain partition points in a pipeline.

Commit interval --> by default 10,000; it commits after every 10,000 records. If you have more
data, you can make the commit interval larger as well.

Commit on end of file --> if you have 5 files, it commits on the target side after each file; if you don't
enable this and you get an error at the 5th file, none of the 5 files will be loaded.
Recovery strategy --> (for production support) -> resume from last checkpoint: it will restart from the
point where it failed last time.

DTM buffer size --> performance related (Data Transformation Manager) --> Auto

Incremental aggregation --> today you aggregate some values and get the final
aggregated value;

so tomorrow, when the run starts again, it takes yesterday's aggregate value from the cache and
aggregates it with today's values.

Enable high precision

For Oracle, Informatica takes table names only up to 30 characters --> to go for more
characters at the Informatica level, you can use the long-name support for Oracle.

Target load order --> in the mapping you have Flow Run Order --> here we need to manually assign the
order of data loading.

Constraint-based load ordering — it automatically selects the order, i.e. which target to load first,
to avoid constraint violation issues.

Deadlock
In case 2 jobs, that is 2 different mappings, load the same table at the same scheduled time, we get a write-
intent lock saying the table is being used by another process.

If you use deadlock retry in the session/mapping task --> when m1 is loading and m2 is also loading,
it will retry mapping m2; with deadlock retry enabled it will not fail.

m2 will wait for the m1 process to complete.

Stop on error --> we mostly keep this at 0, which means the session does not fail on row errors; it rejects
the error records and continues. If it is set to 1, the session stops as soon as one error occurs.

Tracing level

None--> integration service uses the tracing level set in mapping

Terse --> the Integration Service logs initialization information, error messages, and notifications of


rejected data.

Normal (by default) --> the Integration Service logs initialization and status information, errors


encountered, and rows skipped due to transformation row errors. It summarizes session results,
but not at the level of individual rows.

Verbose initialization --> in addition to normal tracing, the Integration Service logs additional
initialization details, the names of the index and data files used, and detailed transformation statistics.

Verbose data --> in addition to verbose initialization, the Integration Service logs each row that passes into
the mapping. It also notes where the Integration Service truncates string data to fit the precision of
a column, and provides detailed transformation statistics.

When you configure the tracing level to verbose data, the Integration Service writes row data for
all rows in a block when it processes a transformation.

In production we cannot debug or preview data; we can only analyse the session log and backtrack
from it.
Transformation    Active or Passive
Source            Active
Expression        Passive
Aggregator        Active (performs calculations on the data such as sums, averages, counts; grouping can
                  change the number of output rows)
Joiner            Active (the number of rows in the Joiner output may not be equal to the number of rows in
                  the Joiner input)
Rank              Active (the Rank transformation has an output port by which it assigns a rank to the rows;
                  e.g. the requirement is to load the top 3 salaried employees for each department)
Router            Active
Union             Active

When you issue the STOP command on the session task, the Integration Service stops reading data from the source,
although it continues processing the data already read and writing it to the targets. If the Integration Service
cannot finish processing and committing data, we can issue the ABORT command. The ABORT command has a
timeout period of 60 seconds.
Questions asked: 1. In the Sequence Generator there are 2 ports, CURRVAL and NEXTVAL.
Which one is larger / has the higher value? - CURRVAL is the larger value, because CURRVAL is NEXTVAL plus the
increment-by value. You can optimize performance by connecting only the NEXTVAL port in a mapping.

2. Does the Union transformation perform a UNION or a UNION ALL? - It performs a UNION ALL. It is an
active transformation: even though duplicate records pass through unchanged, the rows from multiple
pipelines are combined and their row numbers change, which is why it is active and not passive. To get rid of
duplicates, you can select distinct through a Sorter transformation.

3. You have an unconnected lookup; in which scenario would you use only an unconnected lookup, where a
connected one is not possible? - When you want to call a lookup based on a condition, for example:
if dept = 10 then call this lookup, otherwise call a different lookup.

4. What is the reason that sorted input improves the performance of the Aggregator? - Without sorted input,
while Informatica is reading data into the cache and a new group is encountered, it does not know whether
more records for an earlier group (dept 1, 2, or 3) are still coming, so every group must be held in the cache;
the cache size increases and performance decreases. With the data sorted on the group key (deptno, for
example), a group can be finalized as soon as the key changes.

5. If you have 10 records in a table, what is the output of: select * from table where rownum < 5, select * from table
where rownum = 5, and select * from table where rownum > 5? - rownum < 5 gives you 4 records; rownum = 5
and rownum > 5 return no rows.
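
The behaviour in Oracle, plus the usual inline-view workaround for fetching a specific row (emp is a hypothetical table):

SELECT * FROM emp WHERE ROWNUM < 5;   -- returns 4 rows
SELECT * FROM emp WHERE ROWNUM = 5;   -- returns no rows
SELECT * FROM emp WHERE ROWNUM > 5;   -- returns no rows
-- ROWNUM is assigned as rows are returned, so the first candidate row is always ROWNUM 1;
-- a predicate that can never be true for ROWNUM 1 can never return anything.

-- Workaround: materialize ROWNUM in an inline view, then filter on it
SELECT *
FROM (SELECT e.*, ROWNUM AS rn FROM emp e)
WHERE rn = 5;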

6. Is NULL treated as the highest or the lowest value? - NULLs have the highest value (in an ascending sort they
come last by default). You have the option to use NULLS FIRST/LAST in the ORDER BY clause.
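
For example, in Oracle an ascending ORDER BY puts NULLs last by default because they sort as the highest value, and NULLS FIRST/LAST overrides that (emp and commission_pct are illustrative):

-- Default ascending sort: rows with NULL commission_pct appear last
SELECT emp_id, commission_pct FROM emp ORDER BY commission_pct;

-- Force NULLs to the top
SELECT emp_id, commission_pct FROM emp ORDER BY commission_pct NULLS FIRST;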

7. What are the limitations of pushdown optimization? - Variable ports cannot be used.
Normalizer, parameter files, and XML transformations will not work with it. It doesn't work when
loading from a flat file to a database. For full pushdown, the source and target must be in the same database.

8. What are the restrictions of a mapplet? - Normalizer and XML transformations cannot be
used. A Sequence Generator must be reusable. You also cannot have a target in a mapplet, so an
Update Strategy cannot be used either.
