Informatica Interview Questions
1. How to check if the target definition is already used by any other mappings?
Ans: Right-click on the target definition → Dependencies → Unselect all → select Mapping → OK, and
then we will be able to see all the mappings in which the target is used.
1. Load data from flat file to flat file, CSV to flat file, CSV to CSV
1) How do you manage comma-separated data if the data itself contains a comma (for example,
an address column) while importing flat files in the Source Analyzer?
Ans: Enclose such values in double quotes, e.g. 101,"12 Main St, Bangalore",KA, and set the
text qualifier to double quotes while importing the flat file.
2. Filter Transformation
• It is a connected and active transformation.
• It is used to filter rows/records based on a condition, e.g. SALARY > 50000.
• The rows/records which do not satisfy the condition are dropped.
• If the filter condition is TRUE, it passes all records.
• We can use multiple filters.
3. Router Transformation
• It is a connected and active transformation.
• It is similar to the Filter transformation.
• With a single transformation we can connect to multiple targets, each with its own condition (group).
• The rows/records which do not satisfy any condition go to the default group.
4. Sorter Transformation
• It is a connected and active transformation.
• It is active because we can select the Distinct option in the sorter properties, which can change
the number of rows.
• It is used to sort data in ascending or descending order.
Transformation | Type | Description
Aggregator | Active, Connected | Performs aggregate calculations.
Expression | Passive, Connected | Calculates a value.
Java | Active or Passive, Connected | Executes user logic coded in Java. The bytecode for the user logic is stored in the repository.
Joiner | Active, Connected | Joins data from different databases or flat file systems.
Lookup | Active or Passive, Connected or Unconnected | Looks up and returns data from a flat file, relational table, view, or synonym.
Normalizer | Active, Connected | Used in the pipeline to normalize data from relational or flat file sources.
Rank | Active, Connected | Limits records to a top or bottom range.
Router | Active, Connected | Routes data into multiple transformations based on group conditions.
SQL | Active or Passive, Connected | Executes SQL queries against a database.
Union | Active, Connected | Merges data from different databases or flat file systems.
XML Generator | Active, Connected | Reads data from one or more input ports and outputs XML through a single output port.
XML Parser | Active, Connected | Reads XML from one input port and outputs data to one or more output ports.
XML Source Qualifier | Active, Connected | Represents the rows that the Integration Service reads from an XML source when it runs a session.
5. Types of Joins in Joiner Transformation
The Joiner supports four join types: Normal, Master Outer, Detail Outer, and Full Outer (see the SQL
sketch after this list).
• We can use a Joiner if we want to join the data sources: use a Joiner and use the matching
column to join the tables.
• We can also use a Union transformation if the tables have some common columns and we
need to join the data vertically. Create one Union transformation, add the matching ports
from the two sources to two different input groups, and send the output group to the
target.
The basic idea here is to use either a Joiner or a Union transformation to move the data from
two sources to a single target. Based on the requirement, we decide which one should
be used.
• Union transformation does not remove duplicates. To remove the duplicate rows, use sorter
transformation with "select distinct" option after the union transformation.
• The union transformation does not generate transactions.
• You cannot connect a sequence generator transformation to the union transformation.
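As a rough SQL analogy for the points above (table and column names are hypothetical, and
"master"/"detail" follow Informatica's terminology): the Joiner's join types map onto SQL joins, and
the Union transformation behaves like UNION ALL, so a Sorter with the distinct option after it
corresponds to a DISTINCT over the merged set.

    -- Joiner join types, in SQL terms:
    SELECT * FROM detail d INNER JOIN master m ON d.dept_id = m.dept_id;       -- Normal join
    SELECT * FROM detail d LEFT JOIN master m ON d.dept_id = m.dept_id;        -- Master outer (all detail rows)
    SELECT * FROM master m LEFT JOIN detail d ON d.dept_id = m.dept_id;        -- Detail outer (all master rows)
    SELECT * FROM detail d FULL OUTER JOIN master m ON d.dept_id = m.dept_id;  -- Full outer (all rows from both)

    -- Union transformation ~ UNION ALL (duplicates kept); Sorter with distinct ~ DISTINCT:
    SELECT DISTINCT emp_id, emp_name
    FROM (SELECT emp_id, emp_name FROM emp_region1
          UNION ALL
          SELECT emp_id, emp_name FROM emp_region2);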
12. Why is the Union transformation an active transformation? (Important)
Ans: The Union transformation was built using the Custom transformation, which is why it is active.
It merges multiple input groups into a single output group, so output rows cannot be traced
one-to-one back to a single input row.
Row Indicator
The first column in the reject file is the row indicator, which defines how the row was marked
(0 = insert, 1 = update, 2 = delete, 3 = reject).
Column Indicator
After the row indicator is a column indicator, followed by the first column of data and
another column indicator.
A column indicator appears after every column of data and defines the type of the data preceding it
(e.g. D = valid data, N = null, O = overflow, T = truncated).
By default, rejected rows are written into the reject file and the session log.
If "Treat source rows as" is not set to Data Driven, preference is given to the session-level setting only.
Difference between mapping-level and session-level update strategy
Mapping-level Update Strategy: we can insert, update, delete, or reject specific records based
on conditions, e.g. IIF(condition, DD_UPDATE, DD_INSERT).
Session-level update strategy: here, whatever is marked applies to all records (insert, update,
or delete), which means we cannot insert, update, delete, or reject based on per-record conditions.
14. Lookup Transformation
• It is similar to the Joiner transformation.
• It is used to look up data from a source, target, source qualifier, or relational database.
• There are two tables involved in a Lookup transformation: one is called the input table and the
other is called the lookup table.
Types of Lookups:
a) Based on rows:
1) Passive (default): the number of output rows always equals the number of input rows.
2) Active: the number of output rows need not equal the number of input rows (an active lookup
can return multiple matching rows).
• Uncached lookup: here, the lookup transformation does not create a cache. For each
record, it goes to the lookup source, performs the lookup, and returns the value. So for 10K
rows, it will go to the lookup source 10K times to get the related values.
• Cached lookup: in order to reduce the to-and-fro communication between the lookup source
and the Informatica server, we can configure the lookup transformation to create a cache. In
this way, the entire data from the lookup source is cached, and all lookups are performed
against the cache.
Based on the type of cache configured, we can have two types of caches: static and dynamic.
The Integration Service performs differently depending on which is used: with an uncached lookup
it queries the lookup source for every input row; with a static cache it builds the cache once and
does not change it during the session; with a dynamic cache it can insert or update rows in the
cache as the session runs.
Persistent Cache
By default, lookup caches are deleted after the successful completion of the respective session,
but we can configure the session to preserve the cache and reuse it the next time.
Shared Cache
We can share the lookup cache between multiple transformations. We can share an unnamed cache
between transformations in the same mapping. We can share a named cache between
transformations in the same or different mappings.
Cached and uncached lookups work the same way and produce the same result; only the
performance improves with caching.
To improve performance, Informatica first reads all the data from the lookup table and loads it
into the cache, then reads the input source row by row and compares each row directly against the
cache. The result is loaded to the target, and in this way performance is improved.
Data from the condition columns of the lookup table is stored in the index cache, and data from the
remaining columns of the lookup is stored in the data cache.
Lookup Properties
Joiner and Lookup both join, and both can do a left outer join. Which one is
better?
Ans: The Joiner is always cached: the data from the master source is stored in the cache. In a
Lookup, the data from the lookup source is stored in the cache.
A data cache and an index cache exist in both, but in a Lookup we do not select a join type; there
are no left outer or right outer options, and by default it behaves like a left outer join.
Persistence Cache
The persistent cache is created on disk at the end of the session: the content of the static/dynamic
cache is written to disk, and this file is called the persistent cache (permanent cache).
While using a persistent cache, in the first run the static cache is built by the lookup, and in
subsequent runs the static cache is rebuilt from the persistent cache file.
Do not go for a persistent cache if the lookup table values change often, as that hurts
performance; enable it only if the lookup table values do not change often.
The persistent cache is used to improve performance, as it is created on the system disk itself.
The disadvantage of the persistent cache is that it does not get updated when the lookup table
data is updated.
A static cache is created in RAM; at run time, the data present in the cache does not change.
A dynamic cache is created in RAM; at run time, the data present in the cache can change. It is
faster for Type 1 mappings: we can use a dynamic cache for SCD Type 1 but not for Type 2.
A static cache is created in RAM, while a persistent cache is created on disk.
Difference between static and dynamic cache?
Ans: A static cache is built once and does not change while the session runs; a dynamic cache is
updated (rows inserted or updated) as rows pass through, which is why it is typically used for
SCD Type 1 loads.
Lookup Query Override: we can override the default lookup SQL. Use the Generate SQL option in
the lookup query editor and modify the generated query as needed.
Unconnected Lookup
It is not connected to the pipeline; it is called from another transformation, typically from an
Expression, using the :LKP syntax, e.g. :LKP.lkp_get_dept_name(dept_id) (the lookup name here
is hypothetical).
An unconnected lookup has only one return port and returns one column from each row.
Its major advantage is reusability: we can call an unconnected lookup multiple times in the
mapping, unlike a connected lookup.
We can use the unconnected lookup transformation when we need to return the output from
a single port.
Since an unconnected lookup does not participate in the data flow, the Informatica server creates
a separate cache for it and processing takes place in parallel, so performance increases.
Pipeline Lookup
The ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish
processing data within the timeout period, it kills the DTM process and terminates the session.
1. How do you load a comma-delimited source file to a pipe (|) delimited target file?
Ans: We can do this while importing the target definition in the Target Designer by changing
the delimiter option to |, or else in the workflow → Mapping → Set File Properties →
Advanced → Column Delimiter.
2. There are two tables from two different sources, EMP and DEPT, with a PK-FK relationship.
In the DEPT table I have a column called Employee_Count, which is not coming from the source
table. I have to load this Employee_Count from the EMP table, but as per my client's request,
I'm not supposed to join or look up the EMP table, yet I am still supposed to load Employee_Count
in the DEPT table.
Can you develop this as one single mapping?
3. SETVARIABLE and SETMAXVARIABLE?
4. Have you worked with SLAs, where a fine is paid if the SLA is not met?
5. How can we optimize the Joiner transformation?
• When joining two data sources, treat the data source containing fewer records as the
master. This is because the cache size of the Joiner transformation depends on the
master data (unless sorted input from the same source is used).
• Ensure that both the master and detail input sources are sorted and that both the
"Sorted Input" and "Master Sort Order" properties are checked and set.
Ensuring the input data is sorted is an absolute must in order to achieve better performance.
7. Their favorite question is the difference between unconnected and connected lookup.
Source Qualifier filter:
1. It filters rows while reading the data from a source.
2. Can filter rows only from relational sources.
3. It limits the row set extracted from a source.
4. It enhances performance by minimizing the number of rows used in the mapping.
5. The filter condition uses standard SQL and executes in the database.

Filter transformation:
1. It filters rows from within the mapped data.
2. Can filter rows from any type of source system.
3. It limits the row set sent to a target.
4. It is added close to the source to filter out the unwanted data early and maximize performance.
5. It defines a condition using any statement or transformation function that returns either TRUE or FALSE.
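To make point 5 concrete, when a filter condition is set in the Source Qualifier, the Integration
Service pushes it into the WHERE clause of the query it generates, roughly like this (table,
columns, and condition are hypothetical):

    SELECT EMP.EMPNO, EMP.ENAME, EMP.SAL
    FROM EMP
    WHERE EMP.SAL > 50000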
Filter:
3. Single input and single output.
4. We cannot get the non-satisfied data/result.

Router:
3. Single input and multiple outputs.
4. We can get the non-satisfied data/result from the default group.
17. I have to run a workflow on a daily basis at 6:30 PM in the evening. What can we do?
Ans: Schedule the workflow in the Workflow Manager: edit the workflow, configure a scheduler,
and set it to run daily at 6:30 PM.
18. In SQL they asked about indexes and partition.
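For the index and partition part, a minimal Oracle sketch of what such questions usually expect
(table and column names are hypothetical):

    -- B-tree index to speed up lookups on a frequently filtered column
    CREATE INDEX emp_dept_idx ON emp (dept_id);

    -- Range-partitioned table: each partition holds one year of data
    CREATE TABLE sales (
        sale_id   NUMBER,
        sale_date DATE,
        amount    NUMBER
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION p2023 VALUES LESS THAN (DATE '2024-01-01'),
        PARTITION p2024 VALUES LESS THAN (DATE '2025-01-01')
    );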
19. They also asked me about mapplets and mappings.
20. Difference between partition and group by
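One way to show the difference in SQL (hypothetical emp table): GROUP BY collapses rows to one
per group, while an analytic PARTITION BY keeps every row and attaches the aggregate to each of them.

    -- GROUP BY: one output row per department
    SELECT dept_id, SUM(salary) AS dept_total
    FROM emp
    GROUP BY dept_id;

    -- PARTITION BY: every employee row kept, department total repeated on each
    SELECT emp_id, dept_id, salary,
           SUM(salary) OVER (PARTITION BY dept_id) AS dept_total
    FROM emp;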
21. Types of errors in Informatica
a) Non-Fatal Errors
An error that does not force the session to stop, such as a reader, writer, or transformation
error; rows in error can be skipped, and the session fails only when the error count exceeds
the configured error threshold.
b) Fatal Errors
A fatal error occurs when the Integration Service cannot access the source,
target, or repository. This can include loss of connection or target database
errors, such as lack of database space to load data. If the session uses a
Normalizer or Sequence Generator transformation, the Integration Service
cannot update the sequence values in the repository, and a fatal error
occurs.
If the session does not use a Normalizer or Sequence Generator
transformation, and the Integration Service loses connection to the
repository, the Integration Service does not stop the session. The session
completes, but the Integration Service cannot log session statistics into the
repository.
You can stop a session from the Workflow Manager or through pmcmd.
You can abort a session from the Workflow Manager. You can also use the
ABORT function in the mapping logic to abort a session when the Integration
Service encounters a designated transformation error.
Star Schema vs Snowflake Schema
1. A star schema contains the fact tables and the dimension tables, while a snowflake schema
contains the fact tables, dimension tables, as well as sub-dimension tables.
4. A star schema has fewer foreign keys, while a snowflake schema has more foreign keys.
HCL- Sourab
Wipro
Birlasoft
FIS Global
Deloitte
MicroLand
Capgemini 1
Ans: I am working on a project called GTSC, where I play the role of an Informatica developer. My
responsibility is to load the data from the staging area to the respective dimension and fact tables,
implementing the logic mentioned in the mapping sheet provided by our data modeller.
We have a daily scrum call where our scrum master assigns the tickets to us, and only after
receiving a ticket do we get to know which particular table we will be working on. For the
respective documentation we look into SharePoint, where the documents are uploaded by the
modellers; from these we get to know our source table, target table, the columns that we need to
map, and the logic that we have to implement. As I told you, I am responsible for the data being
loaded into the dimensions and facts, and if I have any issues, such as unavailability of the target
table or anything else technical, I get in touch with the respective teams, such as the modellers or
the business analysts, to understand more about the subject and the requirement, and I work
towards fulfilling it.
We work based on report-specific requirements: our client says that for a particular report they
need a certain set of source columns from a specific source table, based on which the spec is
created, and on top of that we work on creating or editing the existing mappings to make sure the
required columns are available in the respective dimensions and facts.
Besides these regular primary roles and responsibilities, we also work on other requests as and
when required. These requests are also created by our scrum master and assigned to us; they might
be something like writing a simple SQL query, designing a mapping, or doing production support.
13. Do you think that if you implement SCD Type 1 for 50 GB of data, it will work as expected?
14. Help me understand your design approach to implement this pipeline; explain the flow of the
mapping so that I can evaluate your design approach.
15. What are the different types of transformations that you have used?
16. What is the prerequisite for using the Update Strategy transformation?
17. Tell me how the aggregator transformation works?
18. Difference between filter and router?
19. How to find the second highest salary from an employee table using joins?
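A common answer using a self-join, assuming a hypothetical emp table with a salary column and
distinct salaries:

    -- e1 rows whose salary is smaller than at least one other salary;
    -- the maximum of those is the second highest
    SELECT MAX(e1.salary) AS second_highest
    FROM emp e1
    JOIN emp e2 ON e1.salary < e2.salary;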
Capgemini 2 Interview
1. I have a very simple mapping: source qualifier, expression, and target. I am trying to
read a table of 50 rows and insert it into a database table. After the workflow
completes, I see only one row in the target table; the remaining 49 rows are dropped.
When I run the debugger against the job, I see that all the rows coming up to the
source qualifier are fine and are also passed to the expression fine. From the expression
transformation, only one row comes out; the remaining 49 rows get dropped out of the
expression transformation and do not even pass to the target. What do you think is the
problem here? What would be happening?
2. I have a file with a header of 10 columns and 5 data rows with values for all 10
columns, separated by commas. After the five rows, there are five more data rows with
values only for columns 1 to 5, again separated by commas, but after the fifth
column there is no comma and no nulls defined for the remaining five columns.
Q: How do you define the source in the Source Analyzer in a way that it can read all 10 rows
at the same time without dropping anything, using a single source definition, and then write
them to a target database table with 10 columns?
FIS Global
1. Can you tell me the roles and responsibilities of your current project?
2. Print the name "Jags" as J, a, g, s, that is, one letter below another.
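In Oracle this is usually done with SUBSTR and a CONNECT BY row generator; a minimal sketch:

    SELECT SUBSTR('Jags', LEVEL, 1) AS letter
    FROM dual
    CONNECT BY LEVEL <= LENGTH('Jags');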
3. How many occurrences of the letter "a" are there in the string "Jagadish"?
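A common trick is to compare the string length before and after removing the letter (done
case-insensitively here):

    SELECT LENGTH('Jagadish')
           - LENGTH(REPLACE(UPPER('Jagadish'), 'A')) AS a_count
    FROM dual;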
4. What is the result of Left Join, Right Join, and Full Join for the given tables?
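The expected behavior, sketched with two hypothetical tables emp and dept joined on dept_id:

    -- LEFT JOIN: all emp rows; dept columns are NULL where there is no match
    SELECT e.emp_id, d.dept_name FROM emp e LEFT JOIN dept d ON e.dept_id = d.dept_id;

    -- RIGHT JOIN: all dept rows; emp columns are NULL where there is no match
    SELECT e.emp_id, d.dept_name FROM emp e RIGHT JOIN dept d ON e.dept_id = d.dept_id;

    -- FULL JOIN: all rows from both sides, with NULLs wherever a side has no match
    SELECT e.emp_id, d.dept_name FROM emp e FULL OUTER JOIN dept d ON e.dept_id = d.dept_id;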
5. How to get the cumulative sum of salary in a given table?
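Typically answered with the analytic SUM ... OVER (hypothetical emp table):

    SELECT emp_id, salary,
           SUM(salary) OVER (ORDER BY emp_id) AS cumulative_salary
    FROM emp;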
6. The source is a comma-delimited flat file and the target is also a flat file. Now you have to
load the comma-delimited data to your target, which is pipe-delimited. How do you load that?
7. I have two tables, an employee table and a department table, and both tables share a foreign
key and primary key relationship. I have to develop one single mapping which loads both tables:
the employee table is loaded from the employee source and the department table is loaded
from the department source.
In the department table, I have a column called count which is not coming from my source
table, so I have to load this employee count from my employee table. But as per my client's
request, I am not supposed to do a lookup or use a joiner between the employee table and the
department table. So, without using a joiner or lookup, I am still supposed to load the
employee count in the target table. Is there a chance that I can do this?
8. I have two tables, an employee table and a department table, and I have two mappings: one
mapping loads the employee table and the other mapping loads the department table. How can
I get the employee count with two different mappings?
9. How many functions have you used while working on the Expression transformation?
10. What are SETVARIABLE and SETMAXVARIABLE?
Ans: SETVARIABLE assigns a value to a mapping variable during the session, while SETMAXVARIABLE
sets the variable to the higher of its current value and the specified value.
11. Did your project have any SLAs?
12. What are the different ways to remove duplicates in an Oracle database?
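Two common Oracle approaches, sketched against a hypothetical emp table where emp_id identifies
a duplicate:

    -- 1) Delete duplicates, keeping one row per key, using ROWID
    DELETE FROM emp
    WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM emp GROUP BY emp_id);

    -- 2) For queries only: suppress duplicates with DISTINCT
    SELECT DISTINCT emp_id, emp_name FROM emp;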