
Interview Questions And Answers

Informatica PowerCenter
Siddhartha Mathur | 10 September, 2018

This document contains a list of 50 very important Informatica PowerCenter
questions that are a must-read before attending an interview.

Informatica PowerCenter:
Informatica PowerCenter is an ETL (Extract, Transform and Load) tool that
can extract information from various source systems such as databases,
flat files, XML, etc., transform the extracted information according to
the requirements, and load it into a target, which can again be a
database, flat files, XML, etc.

Here are some of the most frequently asked interview questions on
Informatica PowerCenter:

1. How can dynamic files be created as a target in Informatica?
To generate multiple flat files, we can use a Transaction Control
transformation (TCT). Informatica provides a special port, "FileName", in
the flat-file target definition. Go to the Target Designer, edit the file
definition, and click the Add FileName Port button (in the upper-right
corner) to add the special port. This port can then be connected to an
Expression transformation that generates the appropriate file name, as
sketched below.
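A minimal hedged sketch (port and column names are hypothetical): an
output port in the Expression transformation builds the file name from a
data value and feeds the target's FileName port, so the file name changes
whenever the data value does.

-- Expression transformation: output port wired to the target's FileName port
o_FileName = DEPT_NAME || '_' || TO_CHAR(SESSSTARTTIME, 'YYYYMMDD') || '.csv'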

2. How can joins be performed in Informatica without using the Joiner
transformation?
This can be done in the Source Qualifier transformation by using the
"User Defined Join" property, provided both sources come from the same
database connection. See the sketch below.

3. What is meant by 'Output is deterministic' and 'Output is repeatable'
in Informatica?
'Output is deterministic' is specified only when the source output does
not change between session runs. 'Output is repeatable' is specified only
when the order of the source output is the same between session runs.
Generally, the Integration Service stages the source data for recovery;
when these options are set, the source data is not staged for recovery.

4. What is meant by Session Recovery Strategy in Informatica?
Session Recovery Strategy is used when a session fails and needs to
restart from where it failed. Informatica uses an inbuilt table,
OPB_SRVR_RECOVERY, to determine the last committed record and starts
again from the next record. The session also needs to be configured for
this. To do this:
Go to Session --> Properties --> For the attribute 'Recovery Strategy',
set the value to 'Resume from last checkpoint'.
The commit strategy also matters here. If the commit interval is 500
records and the transaction fails at the 1100th record, then the new run
will start from the 1001st record.

5. What is the difference between a variable and a parameter in
Informatica?
A parameter in Informatica is one whose value is specified in a parameter
file and cannot be changed during the run of that session. A variable, on
the contrary, is one whose value can change during the session run.
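A hedged illustration (all names are hypothetical): a mapping parameter
$$RUN_DATE is fixed by the parameter file for the whole run, while a
mapping variable $$MAX_ID can be advanced row by row from an expression.

Parameter file entry (value is constant for the run):
[MyFolder.WF:wf_load.ST:s_m_load]
$$RUN_DATE=2018-09-10

Expression transformation (variable advances as rows pass through):
SETMAXVARIABLE($$MAX_ID, CUST_ID)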

6. What is the difference between a static cache and a dynamic cache?
 Static lookup cache:
o When the Dynamic Lookup Cache property is not selected, the cache is
static and the data in the cache stays the same for the entire session.
PowerCenter does not update the cache while it processes the
transformation.
 Dynamic lookup cache:
o When the Lookup Caching Enabled and Dynamic Lookup Cache properties are
both selected, the lookup cache is dynamic.
o As the session runs, whenever the Integration Service inserts or
updates a row in the target table, it also updates the cache.

7. Under what situation do we use a dynamic lookup transformation?
A dynamic lookup transformation is used when we have to consider data
that changes in the looked-up table during the session run itself, for
example when the session updates the same table it is looking up. A
typical example is an SCD Type II implementation.

8. When Dynamic Cache is selected, a new default port is created. What
does the new port do?
Because the cache gets updated whenever the lookup table changes during
the session run, a new default port, 'NewLookupRow', is created. It
generates a value (0 = no change, 1 = new record, 2 = changed record) for
each row, and based on these values a downstream Update Strategy
transformation can act further, as sketched below.

 0 = The Integration Service does not update or insert the row in the
cache.
 1 = The Integration Service inserts the row into the cache.
 2 = The Integration Service updates the row in the cache.
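A minimal hedged sketch of a downstream Update Strategy expression acting
on NewLookupRow (this flagging scheme is one possible design, not the
only one):

-- Update Strategy expression driven by the dynamic lookup's NewLookupRow
DECODE(NewLookupRow,
       1, DD_INSERT,   -- new row: insert into the target
       2, DD_UPDATE,   -- changed row: update the target
       DD_REJECT)      -- 0 = unchanged: reject/skip the row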

9. What are the different types of ports in the Expression
transformation?
There are 3 types of ports:

 Input
 Output
 Variable

10. In an Expression transformation, there are 2 variable ports at the
beginning, 1 output port, and 1 input port. What is the sequence of
evaluation of these ports?
The sequence will be: Input --> Variable1 --> Variable2 --> Output
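A hedged illustration (port names hypothetical) of why this order
matters: variable ports are evaluated top-down after the input ports, so
a variable still holds its previous-row value until its own expression
runs, which lets you carry state between rows.

v_PREV_ID = v_CURR_ID                         -- still holds the previous row's ID
v_CURR_ID = CUST_ID                           -- now stores the current row's ID
o_IS_NEW  = IIF(v_PREV_ID != v_CURR_ID, 1, 0) -- output ports evaluate last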

11. What is the MD5 function?
MD5 is a hash function in Informatica that is used to evaluate data
integrity. The MD5 function uses Message-Digest Algorithm 5 (MD5) and
calculates the checksum of the input value. MD5 is a one-way
cryptographic hash function with a 128-bit hash value.
MD5 returns a 32-character string of hexadecimal digits (0-9 and a-f),
and returns NULL if the input is a null value.
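A common hedged use (column names hypothetical) is change detection:
compare one checksum of the incoming row against a stored checksum
instead of comparing every column individually.

-- Expression transformation: single checksum over the concatenated columns
o_ROW_CHECKSUM = MD5(TO_CHAR(CUST_ID) || '|' || CUST_NAME || '|' || CUST_ADDR)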
12. A row has 5 columns. How do you take these 5 ports, i.e. a single
row, and create 5 rows from it?
This can be done using a Normalizer transformation (GK and GCID ports
will be created automatically; they are inbuilt).

13. What is meant by the Target Load Plan in Informatica?
When we have multiple targets to load in Informatica, we can decide which
target to load first and which to load next. This is helpful when two or
more targets have a primary key/foreign key relationship between them.
To do this: go to Mappings --> Target Load Plan --> select a target load
order.

14. While comparing the source with the target, when there is a need to
avoid duplicate rows coming from the source, which lookup should be used?
A dynamic lookup should be used, as it creates a dynamic cache of the
target that changes as the rows are processed. In this case, rows can be
determined to be inserts or updates. A normal lookup creates a static
cache of the target.

15. We are reading data from a huge (100 million rows) flat file; the
target is an Oracle table that receives either inserts or updates. Oracle
has the correct indexes. What do you look at on the Informatica side to
improve performance?
Drop the indexes before the session run and re-create them once the
session completes, as sketched below.
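One hedged way to automate this (index and table names are hypothetical)
is through the session's Pre SQL and Post SQL properties on the target:

-- Pre SQL: runs before the session writes to the target
DROP INDEX idx_cust_name;

-- Post SQL: runs after the load completes
CREATE INDEX idx_cust_name ON customers (cust_name);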

16. What are the different partitioning methods in Informatica?
Database partitioning: This can be used with Oracle or IBM DB2 source
instances on a multi-node tablespace or with DB2 targets. It reads
partitioned data from the corresponding nodes in the database. The
PowerCenter Integration Service queries the IBM DB2 or Oracle system for
table partition information.
Hash partitioning: This can be used when you want the PowerCenter
Integration Service to distribute rows to the partitions by group. For
example, you need to sort items by item ID, but you do not know how many
items have a particular ID number.
You can use the following types of hash partitioning:

 Hash auto-keys: A compound partition key is created by the PowerCenter
Integration Service using all grouped or sorted ports. You may need to
use hash auto-keys partitioning at Rank, Sorter, and unsorted Aggregator
transformations.
 Hash user keys: A hash function is used by the PowerCenter
Integration Service to group rows of data among partitions. You
define the number of ports to generate the partition key.
 Key range: The PowerCenter Integration Service passes data to
each partition depending on the ranges you specify for each port.
One or more ports can be used to form a compound partition key.
Use key range partitioning where the sources or targets in the
pipeline are partitioned by key range.
 Pass-through: Choose pass-through partitioning where you want to
create an additional pipeline stage to improve performance, but do
not want to change the distribution of data across partitions. The
PowerCenter Integration Service passes all rows at one partition
point to the next partition point without redistributing them.
 Round-robin: This can be used so that each partition processes rows
based on the number and size of the blocks. The PowerCenter
Integration Service distributes blocks of data to one or more
partitions.

17. How does the Update Strategy work in Informatica?

The Update Strategy transformation is used to insert, update, and delete
records in the target table. It can also reject records so that they
never reach the target table. It flags rows in the mapping with the
following constants:
DD_INSERT: Numeric value 0. Flags the row for insert.
DD_UPDATE: Numeric value 1. Flags the row for update.
DD_DELETE: Numeric value 2. Flags the row for delete.
DD_REJECT: Numeric value 3. Flags the row for reject.
At the session level, 'Treat source rows as' must be set to 'Data
driven'.
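A minimal hedged sketch (the lookup port name is hypothetical) of an
Update Strategy expression that flags each row as insert or update
depending on whether it was found in the target:

-- Update Strategy expression
IIF(ISNULL(lkp_CUST_ID), DD_INSERT, DD_UPDATE)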
18. While doing inserts and updates on a target table, the Update
Strategy performs poorly. Without using an Update Strategy, how do you
perform inserts and updates? How do you design the mapping?
We can create two mappings (or pipelines): one for inserting the new
records and another for updating the existing records. In the update
pipeline, connect the key column and the columns that have to be updated
to the target table. In the update session, check only 'Update' for the
target rows (set 'Treat source rows as' to 'Update').

19. The Update Strategy provides constants for insert, update, delete,
and reject. What does the expression of an Update Strategy lead to?
The expression evaluates to one of these constants for each row, and the
corresponding operation is performed on that row.

20. If the Update Strategy evaluates a record to Reject, what will
Informatica do?
The record will not reach the target. By default (with 'Forward Rejected
Rows' enabled), the Integration Service forwards the row to the reject
(bad) file; otherwise it drops the row.

21. How do you improve the performance of a Joiner transformation?
Use sorted input, and choose the table with fewer records as the master
table.

22. What is the Transaction Control transformation?

The Transaction Control transformation is used to control the commit and
rollback of transactions based on a row-level condition. The following
built-in variables are available in its expression:

 TC_CONTINUE_TRANSACTION: The Integration Service does not perform any
transaction change for this row. This is the default value of the
expression.
 TC_COMMIT_BEFORE: The Integration Service commits the
transaction, begins a new transaction, and writes the current row to
the target. The current row is in the new transaction.
 TC_COMMIT_AFTER: The Integration Service writes the current row
to the target, commits the transaction, and begins a new transaction.
The current row is in the committed transaction.
 TC_ROLLBACK_BEFORE: The Integration Service rolls back the
current transaction, begins a new transaction, and writes the current
row to the target. The current row is in the new transaction.
 TC_ROLLBACK_AFTER: The Integration Service writes the current
row to the target, rolls back the transaction, and begins a new
transaction. The current row is in the rolled-back transaction.
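A hedged sketch of a Transaction Control condition, assuming an upstream
Expression transformation sets a hypothetical NEW_DEPT_FLAG port to 1
when the department changes; each department's rows then land in their
own transaction, or their own file when combined with the FileName port
from question 1:

-- Transaction Control condition
IIF(NEW_DEPT_FLAG = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)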

23. What is the difference between a reusable transformation and a
mapplet?
A reusable transformation is a single transformation that can be reused
in many mappings.
A mapplet is a group of transformations forming a particular piece of
logic, which can be used in many mappings.

24. Can you have a mapplet that reads data from a source, passes it
through an Expression transformation, and writes data to a target?
We can add source definitions that act as a source of data for our
mapplet, and we can add as many sources as we want. Another way to feed
data into a mapplet is with an Input transformation. A mapplet can
contain an Expression transformation. However, a mapplet's output cannot
be written to a target inside the mapplet itself.
You cannot include the following objects in a mapplet:

 Normalizer transformations.
 Cobol sources.
 XML Source Qualifier transformations.
 XML sources.
 Target definitions.
 Pre- and post-session stored procedures.
 Other mapplets.

25. You design a mapplet. When I drag that mapplet into a mapping, which
ports of the mapplet are visible? How do we pass data to the mapplet from
the mapping, and what are the output ports?
Only the input and output ports are visible: data is passed into the
mapplet through the ports of its Input transformation and comes out
through the ports of its Output transformation.

26. What are the different types of tasks?

The different types of tasks include:

1. Assignment - Used to assign a value to a workflow variable.
2. Command - Used to run a shell command during the workflow.
3. Control - Used to stop or abort the workflow.
4. Decision - Evaluates a condition and directs the workflow accordingly.
5. Email - Used to send an email during the workflow.
6. Event-Raise - Notifies the Event-Wait task that an event has occurred.
7. Event-Wait - Waits for an event to occur before starting the next
task.
8. Session - Used to run a mapping created in the Designer by linking it
to the session.
9. Timer - Waits until a specified time (absolute or relative) before the
workflow continues.

27. A Session task essentially runs a mapping. Suppose I design a mapping
and somebody uses it in a workflow. Can you name a few mapping properties
that we can override at the session level?
Some of the properties of mapping that we can override are:

 Table names.
 Properties to treat the target rows (insert, update).
 Joining condition.
 Filter condition.
 Source qualifier overrides.

28. What is a Control task in a workflow?

A Control task can be used to stop or abort the workflow.

29. If you run a workflow and it fails, how would you investigate the
failure?
We can do this by looking at the session and workflow logs.

30. How do you access the session or workflow log?

We can access session and workflow logs from the Workflow Monitor
(right-click the session or workflow run).
31. What sections are present in a workflow parameter file, and what type
of information do they contain?
Sample structure:
[Global]
[Folder_Name.WF:wf_name]
$param1=
It might contain log file names, parameter values such as dates to be
passed, and connection strings.
It contains information that remains constant for the current session
run.

32. How do you execute a workflow from a Unix shell script?

Using the pmcmd command:
pmcmd startworkflow -sv ${INTEGRATION_SERVICE} -d ${DOMAIN_NAME} \
    -u ${INFA_USR} -p ${INFA_PWD} -f ${FOLDER_NAME} \
    -wait ${WORKFLOW_NAME}

33. How can a mapping variable be used in a workflow?

A mapping variable can be assigned to a workflow variable in the Workflow
Manager.
To do this: go to Session --> Edit --> Components --> Pre/Post-session
variable assignment.

34. How can the session name be used in a Command task in a workflow?

This can be done with the help of the built-in variable $PMSessionName.
Example: echo "Session: $PMSessionName" >> Dummy.txt

35. How can the session name and workflow name be used in an Email task
without hardcoding?
This can be done by using %s for the session name and %w for the workflow
name in the email subject or body.
36. We have a scenario where we want our session to stop processing
further when the 2nd error is detected while running. How do we achieve
this?
There is a session-level property, 'Stop on errors'.
Set this value to 2.

37. What type of sessions allow variable assignment from a mapping to a
workflow?
Only non-reusable sessions. For reusable sessions, we cannot assign
mapping variables to workflow variables.

38. What are the different types of tracing levels in Informatica 10.1.1?

 Note: The tracing level can be overridden at the session level;
otherwise the Integration Service uses the tracing levels configured on
the transformations in the mapping.
 Terse: Logs initialization information, error messages, and
notification of rejected data in the session log file.
 Normal: The Integration Service logs initialization and status
information, errors encountered, and rows skipped due to transformation
row errors. It summarizes session results, but not at the level of
individual rows.
 Verbose Initialization: In addition to normal tracing, the Integration
Service logs additional initialization details: names of index and data
files used, and detailed transformation statistics.
 Verbose Data: In addition to verbose initialization tracing, the
Integration Service logs each row that passes into the mapping. It also
logs where the Integration Service truncates string data to fit the
precision of a column and provides detailed transformation statistics.
With this tracing level, the Integration Service writes row data for all
rows in a block when it processes a transformation.

39. How can the 'not exists' operator be implemented in Informatica?

Implementing the NOT EXISTS operator is straightforward in Informatica.
For example, if we want only the records that are present in table A and
not in table B, we use a Joiner transformation with A as master and B as
detail. We specify the join condition and, for the join type, select
detail outer join. This returns all records from table A and only the
matching records from table B. Connect the Joiner to a Filter
transformation and specify the filter condition so that it keeps rows
where B_port is NULL (see the sketch below). This gives the records that
are in A and not in B. Then connect the Filter to the target definition.
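A hedged sketch of the filter condition (the port name B_KEY is
hypothetical and stands for any detail-side port carried through the
detail outer join):

-- Filter condition: keep only rows from A that found no match in B
ISNULL(B_KEY)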

40. If a parameter file path is defined at both the session level and the
workflow level, which path is used while running the workflow?
The workflow-level parameter file is picked up, irrespective of whether a
parameter file is defined at the session level or not.

41. When do we select a connected/unconnected lookup transformation in
Informatica?
An unconnected lookup should be selected when the same lookup has to be
performed at multiple places. It is a kind of reusable lookup, called
like a function from any transformation using the :LKP expression (see
the sketch below). It uses only a static cache.
A connected lookup should be used when the same lookup need not be reused
at multiple places. It is part of the data flow and can use either a
static or a dynamic cache.
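A hedged example (names hypothetical) of calling an unconnected lookup
like a function from an Expression transformation port:

-- Output port expression calling the unconnected lookup lkp_DEPT_DETAILS
o_DEPT_NAME = :LKP.lkp_DEPT_DETAILS(DEPT_ID)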

42. When we right-click a running session in the Workflow Monitor, Stop
and Abort options are available. What is the difference between them?
The Stop option makes the Integration Service stop reading input from the
source, but it continues processing the records already read and writes
them to the target.
The Abort option makes the Integration Service stop not only reading
records from the input but also processing the in-flight records.

43. What scenario compels the Informatica server to reject rows?

This happens when the Update Strategy transformation flags a row with
DD_REJECT. Rows are also rejected when they violate a database constraint
on the target.
44. How can we keep the last 20 session logs in Informatica?
Go to Session --> Properties --> Config Object --> Log Options.
Set these 2 properties:
Save Session Log by: Session runs.
Save Session Log for These Runs: 20.

45. How can we delete duplicate rows from flat files?

We can use a Sorter transformation and select the Distinct option to
remove the duplicate rows.

46. Under what condition can selecting Sorted Input in the Aggregator
fail the session?
Even if the input data is properly sorted, the session may fail if the
sort-order ports and the group-by ports of the Aggregator are not in the
same order.

47. How does the Sorter handle NULL values?

We can configure the way the Sorter transformation treats null values.
Enable the property 'Null Treated Low' if we want null values to be
treated as lower than any other value when the sort is performed. Disable
this option if we want the Integration Service to treat null values as
higher than any other value.

48. How does the Rank transformation handle string values?

The Rank transformation can return strings at the top or the bottom of
the session sort order. When the Integration Service runs in Unicode
mode, it sorts character data in the session using the sort order
associated with the code page of the Integration Service, which may be
French, German, etc. When the Integration Service runs in ASCII mode, it
ignores this setting and uses a binary sort order to sort character data.

49. What is a Dimensional Model?

 Data modeling is the process of designing the database to fulfill the
business requirement specifications.
 A data modeler (or database architect) designs the warehouse database
using a GUI-based data modeling tool such as ERwin.
 ERwin is a data modeling tool from Computer Associates (CA).
 A dimensional model consists of the following types of schemas designed
for the data warehouse:
o Star schema.
o Snowflake schema.
o Galaxy schema.

 A schema is a data model that consists of one or more tables.

50. What are the different ways of debugging code in Informatica?

 Use the Debugger at the mapping level; the Debugger gives you
row-by-row data.
 Use the workflow/session logs.
 Create and add a target after any transformation whose output you want
to check.
 Change the tracing level at the session/transformation level to Verbose
Data.

Important tips to remember while preparing for an interview:

 The interview questions for any ETL tool, such as Informatica
PowerCenter or IBM DataStage, mainly consist of 3 types of questions:
o Theoretical questions related to the tool.
o Scenario-based questions related to the tool.
o How to implement a particular functionality through the tool and with
SQL/Unix (whichever is applicable). Basic knowledge of Unix and SQL is
essential for clearing any ETL interview.
 A lot of questions are generally asked about automation that you may
have implemented in your project using Informatica PowerCenter.
o For example, export and import automation, which can be done using
Unix and PowerCenter.
 Thoroughly go through all the properties at the session level, workflow
level, and mapping level; a lot of questions are asked about them.
 Be prepared for general questions like 'What is the most complex
requirement you have implemented in Informatica PowerCenter?' or 'What is
the most complex bug you have faced and how did you fix it?'.

All the Best!!
