Informatica IICS

Tell me about yourself and current work experience?

 I have around 3.6 years of experience in DWH development using the Informatica tool. I have primarily worked in the pharma and manufacturing domains. (Banking or sales domains as per your project.)
 In my current project I work in an onsite-offshore model, so we receive our tasks from the onsite team.
 As a developer, I first need to understand the physical data model, i.e., the dimensions and facts and their relationships, and also the functional specification prepared by the Business Analyst that describes the business requirement.
 I am involved in preparing the source-to-target mapping sheet (tech spec), which tells us what the source and target are, which source column maps to which target column, and what the business logic is. This document gives a clear picture for development.
 I create Informatica mappings, sessions, and workflows using different transformations to implement the business logic.
 Preparing unit test cases as per the business requirement is also one of my responsibilities.
 I also perform unit testing for the mappings I develop.
 I do source code reviews for the mappings and workflows developed by my team members.
 I am also involved in preparing the deployment plan, which lists the mappings and workflows to be migrated; based on this, the deployment team can migrate the code from one environment to another.
 Once the code is rolled out to production, we work with the production support team for two weeks and give KT in parallel, so we also prepare a KT document for the production team.

Coming to My Current Project:


Currently I am working on the XXX project for the YYY client. YYY does not have a manufacturing unit; before each quarter ends, the business calls for quotations from its primary supply channels. This process is called an RFQ (Request for Quotation). Once the business creates an RFQ, a notification automatically goes to the supply channels, which send back their quoted values; we call this the response from the supply channel. The business then negotiates with the supply channels for the best deal and approves the RFQs.
All these activities (creating RFQs, supplier responses, approving RFQs, etc.) are performed in Oracle Apps, which is the source (front-end) application. This data gets stored in the OLTP system, so the OLTP contains all the RFQ, supplier response, and approval status data.
We have Oracle jobs running between the OLTP and the ODS which replicate the OLTP data to the ODS. It is designed in such a way that any transaction entering the OLTP is immediately reflected in the ODS.
We have a staging area where we load the entire ODS data into staging tables. For this we created ETL Informatica mappings that truncate and reload the staging tables on each session run. Before loading the staging tables we drop the indexes, and after loading the bulk data we recreate the indexes using stored procedures.
Then we extract all this data from the staging area and load it into the dimensions and facts. On top of the dimensions and facts we have created materialized views as per the report requirements. Finally, the reports pull data directly from the materialized views. The performance of these reports/dashboards is always good because we are not doing any calculation at the reporting level. These dashboards/reports are used for analysis, for example: how many RFQs were created, how many RFQs were approved, and how many RFQs received responses from the supply channels.

In the present system they don't have a BI design, so they follow a manual process of exporting SQL query data to Excel sheets and preparing pie charts using macros. In the new system we are providing BI-style reports such as drill-down, drill-up, pie charts, graphs, detail reports, and dashboards.

ORACLE
How strong are you in SQL & PL/SQL?
1) I am good at SQL; I write the source qualifier queries for Informatica mappings as per the business requirement.
2) I am comfortable working with joins, correlated queries, sub-queries, analyzing tables, inline views, and materialized views.
3) As an Informatica developer I did not get much opportunity to work on the PL/SQL side, but I worked on a PL/SQL-to-Informatica migration project, so I do have exposure to procedures, functions, and triggers.
What is the difference between view and materialized view?
View vs Materialized view:
 A view has a logical existence and does not store data; a materialized view has a physical existence and stores data.
 A view stores only its defining query; a materialized view occupies storage like a table.
 DML against a simple view is passed through to the base table (and is restricted for complex views); a materialized view is normally read-only and is maintained through refreshes.
 When we select from a view, it fetches the data from the base tables at query time; when we select from a materialized view, it fetches the pre-computed data stored in the materialized view.
 A view cannot be scheduled to refresh (it is always current); a materialized view can be refreshed on a schedule.
 We can keep aggregated data in a materialized view, and a materialized view can be created based on multiple tables.
Materialized View
A materialized view is very useful for reporting. Without a materialized view, the report would fetch the data directly from the dimensions and facts, which is slow because it involves multiple joins. If we put the same report logic into a materialized view, the report can fetch the data directly from the materialized view, so we avoid the multiple joins at report run time. The materialized view does need to be refreshed regularly; after that the report simply performs a select on the materialized view.
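A minimal Oracle sketch of this idea, with illustrative dimension/fact names (they are not from the project description above):

CREATE MATERIALIZED VIEW mv_rfq_summary
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT d.supplier_name,
       COUNT(*) AS total_rfqs,
       SUM(CASE WHEN f.status = 'APPROVED' THEN 1 ELSE 0 END) AS approved_rfqs
FROM   fact_rfq f
JOIN   dim_supplier d ON d.supplier_key = f.supplier_key
GROUP  BY d.supplier_name;

-- refresh, for example from a scheduled job (SQL*Plus syntax)
EXEC DBMS_MVIEW.REFRESH('MV_RFQ_SUMMARY', 'C');

The report then selects directly from mv_rfq_summary instead of joining the fact and dimension at run time.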
Difference between Trigger and Procedure?
Triggers vs Stored Procedures:
 A trigger does not need to be executed manually; it fires automatically (implicitly) when an INSERT, UPDATE, or DELETE statement is issued against the associated table.
 A stored procedure must be executed manually (invoked explicitly).

Differences between sub-query and co-related sub-query?


Sub-query: executed once for the parent query.
Example: SELECT * FROM emp WHERE deptno IN (SELECT deptno FROM dept);

Co-related sub-query: executed once for each row of the parent query.
Example: SELECT e.* FROM emp e WHERE e.sal >= (SELECT AVG(a.sal) FROM emp a WHERE a.deptno = e.deptno);

Differences between where clause and having clause?
Where clause vs Having clause:
 Both WHERE and HAVING can be used to filter data.
 WHERE does not require a GROUP BY; HAVING is used together with GROUP BY.
 WHERE applies to individual rows; HAVING tests a condition on the group rather than on individual rows.
 WHERE is used to restrict rows; HAVING is used to restrict groups.
 In WHERE, every record is filtered row by row; in HAVING, the filter is applied to aggregated records (group functions).
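A small example on the standard EMP table illustrating the difference:

-- WHERE filters individual rows before grouping
SELECT deptno, AVG(sal) AS avg_sal
FROM   emp
WHERE  job <> 'CLERK'
GROUP  BY deptno;

-- HAVING filters whole groups after aggregation
SELECT deptno, AVG(sal) AS avg_sal
FROM   emp
GROUP  BY deptno
HAVING AVG(sal) > 2000;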
Differences between stored procedure and functions
Stored Procedure vs Function:
 A stored procedure may or may not return values (through OUT parameters); a function must return exactly one value through RETURN, and can return additional values through OUT arguments.
 A stored procedure is typically used to implement business logic and process tasks; a function is typically used to compute and return values.
 A procedure can accept any number of IN/OUT/IN OUT arguments; a function also accepts arguments, but is normally written with IN parameters only so that it can be called from SQL.
 A procedure cannot be invoked from SQL statements (e.g., SELECT); a function can be invoked from SQL statements (e.g., SELECT), provided it does not violate the purity rules.
 A procedure can affect the state of the database (perform DML and commit); a function called from SQL must not modify the database state.
 Both are stored in the database in compiled form.
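A hedged PL/SQL sketch of the two object types; the names and logic are illustrative, using the EMP table referenced elsewhere in these notes:

-- a procedure that performs an action
CREATE OR REPLACE PROCEDURE raise_salary(p_empno IN NUMBER, p_pct IN NUMBER) AS
BEGIN
  UPDATE emp SET sal = sal * (1 + p_pct / 100) WHERE empno = p_empno;
END;
/

-- a function that computes and returns a value, callable from SQL
CREATE OR REPLACE FUNCTION annual_sal(p_empno IN NUMBER) RETURN NUMBER AS
  v_sal emp.sal%TYPE;
BEGIN
  SELECT sal INTO v_sal FROM emp WHERE empno = p_empno;
  RETURN v_sal * 12;
END;
/

SELECT ename, annual_sal(empno) FROM emp;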

Differences between rowid and rownum


Rowid vs Rownum:
 ROWID is an Oracle internal ID allocated every time a new record is inserted into a table; it is unique and cannot be changed by the user. ROWNUM is a row number returned by a SELECT statement.
 ROWID is permanent; ROWNUM is temporary.
 ROWID is a globally unique identifier for a row in the database, created when the row is inserted into the table and destroyed when the row is removed. The ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects rows from a table or a set of joined rows.

How to find duplicate records in a table?
SELECT empno, COUNT(*) FROM emp GROUP BY empno HAVING COUNT(*) > 1;

How to delete duplicate records in a table?
DELETE FROM emp WHERE rowid NOT IN (SELECT MAX(rowid) FROM emp GROUP BY empno);
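An alternative sketch (not from the original notes) that keeps one row per empno using an analytic function; useful when the duplicate definition spans several columns:

DELETE FROM emp
WHERE  rowid IN (
         SELECT rid
         FROM  (SELECT rowid AS rid,
                       ROW_NUMBER() OVER (PARTITION BY empno ORDER BY rowid) AS rn
                FROM   emp)
         WHERE  rn > 1);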
What is your tuning approach if a SQL query is taking a long time? How do you tune a SQL query?
If a query is taking a long time, first I run it through EXPLAIN PLAN; the explain plan process stores data in PLAN_TABLE. It gives us the execution plan of the query, for example whether the query is using the relevant indexes on the joining columns or whether indexes to support the query are missing.
If the joining columns don't have indexes, the optimizer will do a full table scan; in that case the cost will be higher, so we create indexes on the joining columns and run the query again, which should give better performance. We also need to analyze the tables if the statistics were gathered long ago. The ANALYZE statement can be used to gather statistics for a specific table, index, or cluster, for example: ANALYZE TABLE employees COMPUTE STATISTICS;

If there is still a performance issue, then we can use HINTS; a hint is nothing but a clue to the optimizer. Commonly used hints:
ALL_ROWS invokes the cost-based optimizer; it is usually used for batch processing or data warehousing systems. (/*+ ALL_ROWS */)
FIRST_ROWS invokes the cost-based optimizer; it is usually used for OLTP systems. (/*+ FIRST_ROWS */)
CHOOSE lets the optimizer choose between cost-based and rule-based optimization, depending on whether statistics have been gathered.
USE_HASH hashes one table (full scan) and creates a hash index for that table, then hashes the other table and uses the hash index to find corresponding records; therefore it is not suitable for < or > join conditions. (/*+ USE_HASH */)
Hints are useful to optimize query performance.
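A short sketch of the workflow described above; the table, column, and index names are illustrative:

-- 1. Generate and inspect the execution plan
EXPLAIN PLAN FOR
SELECT o.order_id, c.cust_name
FROM   orders o
JOIN   customers c ON c.cust_id = o.cust_id;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- 2. Add a supporting index if the plan shows a full table scan on the join column
CREATE INDEX idx_orders_cust_id ON orders (cust_id);

-- 3. As a last resort, try an optimizer hint
SELECT /*+ ALL_ROWS */ o.order_id, c.cust_name
FROM   orders o
JOIN   customers c ON c.cust_id = o.cust_id;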

DWH Concepts

Difference between OLTP and DWH/DSS/OLAP?

OLTP vs DWH/DSS/OLAP:
 OLTP maintains only current information; OLAP contains the full history.
 OLTP has a normalized structure; a DWH has a de-normalized structure.
 OLTP is a volatile system; a DWH is a non-volatile system.
 OLTP is not suited for reporting; a DWH is a pure reporting system.
 Because OLTP is normalized, it requires multiple joins to fetch data; a DWH does not require as many joins to fetch data.
 OLTP is not time-variant; a DWH is time-variant.
 OLTP is a pure relational model; a DWH is a dimensional model.

What is a staging area and why do we need it in DWH?

If the target and source databases are different and the target table volume is high (millions of records), then without a staging table we would need to design the Informatica mapping with a lookup to find out whether each record exists in the target table. Since the target has huge volumes, building that cache is costly and hurts performance.
If we create staging tables in the target database, we can simply do an outer join in the source qualifier to determine insert/update; this approach gives good performance.
It avoids a full table scan to determine inserts/updates on the target.
We can also create indexes on the staging tables; since these tables are designed for a specific application, this will not impact any other schemas/users.
While processing flat files into the data warehouse, we can also perform cleansing in the staging area.
Data cleansing, also known as data scrubbing, is the process of ensuring that a set of data is correct and accurate. During data cleansing, records are checked for accuracy and consistency.
Since it is a one-to-one mapping from ODS to staging, we truncate and reload. We can create indexes at the staging stage so that our source qualifier performs at its best.
If we have the staging area, we do not need to rely on Informatica transformations (lookups) to know whether a record already exists or not.
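A sketch of the source qualifier override idea described above, assuming a staging table stg_emp and a target table dim_emp keyed on empno (the names are illustrative):

SELECT s.empno,
       s.ename,
       s.sal,
       CASE WHEN t.empno IS NULL THEN 'I' ELSE 'U' END AS load_flag
FROM   stg_emp s
LEFT   OUTER JOIN dim_emp t
       ON t.empno = s.empno;

The load_flag column then drives the router/update strategy in the mapping, so no lookup cache is needed.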
ODS:
My understanding of an ODS is that it is a replica of the OLTP system; the need for it is to reduce the burden on the production system (OLTP) while fetching data for loading targets. Hence it is a mandatory requirement for every warehouse.
So, do we transfer data to the ODS from OLTP every day to keep it up to date?
OLTP is a sensitive database and should not be hit with many select statements, as that may impact performance; also, if something goes wrong while fetching data from OLTP to the data warehouse, it will directly impact the business.
The ODS is a replication of the OLTP and is usually refreshed through Oracle jobs.

What is the difference between a primary key and a surrogate key?


A primary key is a special constraint on a column or set of columns. A primary key constraint ensures that the column(s) so designated have no NULL values, and that every value is unique. Physically, a primary key is implemented by the database system using a unique index, and all the columns in the primary key must have been declared NOT NULL. A table may have only one primary key, but it may be composite (consist of more than one column).
A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key. Sometimes there can be several natural keys that could be declared as the primary key, and these are all called candidate keys. So, a surrogate is a candidate key. A table could actually have more than one surrogate key, although this would be unusual. The most common type of surrogate key is an incrementing integer, such as an auto-increment column in MySQL, a sequence in Oracle, or an identity column in SQL Server.
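A minimal Oracle sketch of a surrogate key populated from a sequence; the object names are illustrative:

CREATE SEQUENCE dim_customer_seq START WITH 1 INCREMENT BY 1;

CREATE TABLE dim_customer (
  customer_key   NUMBER        PRIMARY KEY,   -- surrogate key
  customer_id    VARCHAR2(20)  NOT NULL,      -- natural/business key
  customer_name  VARCHAR2(100)
);

INSERT INTO dim_customer (customer_key, customer_id, customer_name)
VALUES (dim_customer_seq.NEXTVAL, 'C1001', 'Acme Corp');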
Have you done any performance tuning in Informatica?
1) Yes. One of my mappings was taking 3-4 hours to process 40 million rows into a staging table; there was no transformation inside the mapping, it was a 1-to-1 mapping.
2) There was nothing to optimize in the mapping itself, so I created session partitions using key range on the effective date column. It improved performance a lot: rather than 4 hours, it ran in about 30 minutes for the entire 40 million rows. With partitions, the DTM creates multiple reader and writer threads.
3) There was one more scenario where I got very good performance at the mapping level: rather than using a lookup transformation, we can do an outer join in the source qualifier query override. This gives good performance when both the lookup table and the source are in the same database; if the lookup table has huge volumes, creating the cache is costly.
4) Also, optimizing the mapping to use fewer transformations always gives good performance.
5) If any mapping takes a long time to execute, first we need to look at the source and target statistics in the Workflow Monitor for the throughput, and find out where exactly the bottleneck is by looking at the busy percentage in the session log, which tells us which transformation is taking more time. If the source query is the bottleneck, it will show at the end of the session log as "query issued to database"; that means there is a performance issue in the source query and we need to tune it.

How strong are you in UNIX?

1) I have the Unix shell scripting knowledge that Informatica work requires, for example running workflows from Unix using pmcmd. Below is a script to run a workflow from Unix:

cd /pmar/informatica/pc/pmserver/
/pmar/informatica/pc/pmserver/pmcmd startworkflow -u $INFA_USER -p $INFA_PASSWD -s $INFA_SERVER:$INFA_PORT -f $INFA_FOLDER -wait $1 >> $LOG_PATH/$LOG_FILE

2) If we are supposed to process flat files with Informatica but those files exist on a remote server, then we have to write a script to FTP them to the Informatica server before starting to process those files.
3) There is also file watching: if an indicator file is available in the specified location, we start our Informatica jobs; otherwise we send an email notification using the mailx command saying that the previous jobs did not complete successfully.
4) Using a shell script, we update the parameter file with the session start time and end time.
This is the kind of scripting knowledge I have. If any new UNIX requirement comes, I can Google the solution and implement it.
What is the use of Shortcuts in Informatica?
If we copy source definitions, target definitions, or mapplets from a Shared folder to any other folder, they become shortcuts.
Let's assume we have imported some source and target definitions into a shared folder, and we are using those source and target definitions as shortcuts in mappings in other folders.
If any modification occurs in the backend (database) structure, such as adding new columns or dropping existing columns in either the source or the target, and we re-import into the shared folder, those changes are automatically reflected in all folders/mappings wherever we used those source or target definitions.

How to concatenate row data through Informatica?

Source:
Ename    EmpNo
stev     100
methew   100
john     101
tom      101

Target:
Ename         EmpNo
Stev methew   100
John tom      101

Ans:
Using a Dynamic Lookup on the target table:
If the record doesn't exist, insert it into the target. If it already exists, get the corresponding Ename value from the lookup, concatenate it with the current Ename value in an expression, and then update the target Ename column using an update strategy.
Using the variable port approach:
Sort the data in the source qualifier on the EmpNo column, then use an expression to store the previous record's information in variable ports. After that, use a router: if the record appears for the first time, insert it; if it was already inserted, update Ename with the concatenation of the previous name and the current name value, and update it in the target.
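When the source is a relational table, the same result can also be sketched directly in a source qualifier override with an aggregate string function (Oracle 11g+), instead of the lookup/variable-port logic above:

SELECT empno,
       LISTAGG(ename, ' ') WITHIN GROUP (ORDER BY ename) AS ename_concat
FROM   emp
GROUP  BY empno;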
How to send unique (distinct) records to one target and duplicates to another target?

Source:
Ename    EmpNo
stev     100
Stev     100
john     101
Mathew   102

Output: Target_1
Ename    EmpNo
Stev     100
John     101
Mathew   102

Target_2
Ename    EmpNo
Stev     100

Ans:
Using a Dynamic Lookup on the target table:
If the record doesn't exist, insert it into Target_1. If it already exists, send it to Target_2 using a router.
Using the variable port approach:
Sort the data in the source qualifier on the EmpNo column, then use an expression to store the previous record's information in variable ports. After that, use a router to route the data to the targets: if the record appears for the first time, send it to the first target; if it was already inserted, send it to Target_2.
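If the source is relational, a hedged alternative is to tag duplicates in the source qualifier override with an analytic function and route on that flag instead:

SELECT empno,
       ename,
       ROW_NUMBER() OVER (PARTITION BY empno ORDER BY ename) AS rn
FROM   emp;
-- rows with rn = 1 go to Target_1 (distinct); rows with rn > 1 go to Target_2 (duplicates)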
How to do dynamic file generation in Informatica?
I want to generate a separate file for every employee (per name, it should generate a file); it has to generate 5 flat files, and the name of each flat file is the corresponding employee name. That is the requirement.
Below is my mapping:
Source (Table) -> SQ -> Target (FF)
Source:
Dept  Ename  EmpNo
A     S      22
A     R      27
B     P      29
B     X      30
B     U      34
This functionality was added from Informatica 8.5 onwards; it was not there in earlier versions.
We can achieve it with the use of a Transaction Control transformation and the special "FileName" port in the target file.
In order to generate the target file names from the mapping, we should make use of the special "FileName" port in the target file. You can't create this special port with the usual new port button; there is a special button labeled "F" at the right-most corner of the target flat file when viewed in the Target Designer.
When you have different sets of input data with different target files to be created, use the same target instance, but with a Transaction Control transformation that defines the boundary between the source sets.
In the target flat file there is an option on the Columns tab, i.e., filename as a column; when you click it, a non-editable column gets created in the metadata of the target.
In the Transaction Control transformation, give the condition as IIF(NOT ISNULL(emp_no), TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION).
Map the Ename column to the target's FileName column. The mapping will be like this:
source -> SQ -> transaction control -> target
Run it, and separate files will be created, named by Ename.

How do you populate the 1st record to the 1st target, the 2nd record to the 2nd target, the 3rd record to the 3rd target, and the 4th record back to the 1st target through Informatica?
We can do it using a Sequence Generator by setting End Value = 3 and enabling the Cycle option. Then in the router create 3 groups:
In the 1st group, specify the condition seq_next_value = 1 and pass those records to the 1st target.
In the 2nd group, specify the condition seq_next_value = 2 and pass those records to the 2nd target.
In the 3rd group, specify the condition seq_next_value = 3 and pass those records to the 3rd target.
Since we have enabled the Cycle option, after reaching the end value the Sequence Generator starts again from 1; for the 4th record the next value is 1, so it goes to the 1st target.
How do you perform incremental logic (Delta or CDC)?
1) Incremental means: suppose today we processed 100 records; for tomorrow's run we need to extract only the records newly inserted or updated after the previous run, based on the last-updated timestamp (yesterday's run). This process is called incremental, delta, or CDC.

Approach_1: Using SETMAXVARIABLE()
2) First create a mapping variable ($$Pre_sess_max_upd) and assign an initial value of an old date (01/01/1940). Then override the source qualifier query to fetch only LAST_UPD_DATE >= $$Pre_sess_max_upd (mapping variable).
3) In an expression, assign the maximum last_upd_date value to $$Pre_sess_max_upd (mapping variable) using SETMAXVARIABLE.
4) Because it is a variable, it stores the maximum last_upd_date value in the repository, so in the next run our source qualifier query fetches only the records updated or inserted after the previous run.

Approach_2: Using parameter file


1) First create a mapping parameter ($$Pre_sess_start_tmst) and assign an initial value of an old date (01/01/1940) in the parameter file.
2) Then override the source qualifier query to fetch only LAST_UPD_DATE >= $$Pre_sess_start_tmst (mapping parameter).
3) Update the mapping parameter ($$Pre_sess_start_tmst) value in the parameter file using a shell script or another mapping after the first session completes successfully.
4) Because it is a mapping parameter, every time we need to update the value in the parameter file after completion of the main session.

Approach_3: Using Oracle control tables

First we need to create two control tables, cont_tbl_1 and cont_tbl_2, with the structure (session_st_time, wf_name).
Then insert one record into each table with session_st_time = 1/1/1940 and the workflow name.
Create two stored procedures. The first updates cont_tbl_1 with the session start time; set its stored procedure type property to Source Pre-load.
For the second stored procedure, set the stored procedure type property to Target Post-load; this procedure updates the session_st_time in cont_tbl_2 from cont_tbl_1.
Then override the source qualifier query to fetch only LAST_UPD_DATE >= (SELECT session_st_time FROM cont_tbl_2 WHERE wf_name = 'actual workflow name').
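A hedged SQL sketch of the control-table approach; the table and column names follow the description above, and the workflow and source table names are illustrative:

-- one-time setup
CREATE TABLE cont_tbl_1 (wf_name VARCHAR2(100), session_st_time DATE);
CREATE TABLE cont_tbl_2 (wf_name VARCHAR2(100), session_st_time DATE);
INSERT INTO cont_tbl_1 VALUES ('wf_load_orders', DATE '1940-01-01');
INSERT INTO cont_tbl_2 VALUES ('wf_load_orders', DATE '1940-01-01');

-- source pre-load procedure logic: stamp the current run's start time
UPDATE cont_tbl_1 SET session_st_time = SYSDATE WHERE wf_name = 'wf_load_orders';

-- source qualifier override: pull only rows changed since the last successful run
SELECT *
FROM   orders o
WHERE  o.last_upd_date >= (SELECT session_st_time FROM cont_tbl_2 WHERE wf_name = 'wf_load_orders');

-- target post-load procedure logic: promote the start time only after the load succeeds
UPDATE cont_tbl_2 c2
SET    c2.session_st_time = (SELECT c1.session_st_time FROM cont_tbl_1 c1 WHERE c1.wf_name = c2.wf_name)
WHERE  c2.wf_name = 'wf_load_orders';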
Difference between a dynamic lookup cache and a static lookup cache?
1) With a dynamic lookup, the cache gets refreshed as soon as a record is inserted, updated, or deleted in the lookup table during the session. With a static lookup, the cache is not refreshed even if records are inserted or updated in the lookup table; it refreshes only in the next session run.
2) The best example of where we need a dynamic cache: suppose the first record and the last record in the source are for the same entity, but there is a change in the address. The mapping has to insert the first record and update the last record in the target table.
3) If we use a static lookup, the first record goes to the lookup, checks the cache based on the condition, finds no match, and returns a null value, so the router sends that record to the insert flow. But this record is still not available in the cache memory, so when the last record comes to the lookup, it checks the cache, finds no match, returns null again, and goes to the insert flow through the router, although it was supposed to go to the update flow, because the cache did not refresh when the first record was inserted into the target table.
4) So if we use a dynamic lookup we can achieve our requirement, because as soon as the first record is inserted, the cache is immediately refreshed with the target data.
5) When we process the last record, it finds the match in the cache and returns the value, so the router routes that record to the update flow.

What is the difference between a snowflake schema and a star schema?

Star Schema vs Snowflake Schema:
 The star schema is the simplest data warehouse schema; the snowflake schema is a more complex data warehouse model than a star schema.
 In a star schema each dimension is represented by a single table, with no hierarchies between dimension tables; in a snowflake schema at least one hierarchy exists between dimension tables.
 Both contain a fact table surrounded by dimension tables. If the dimensions are de-normalized, we say it is a star schema design; if a dimension is normalized, we say it is a snowflaked design.
 In a star schema only one join establishes the relationship between the fact table and any one dimension table; in a snowflake schema, since there are relationships between the dimension tables, more joins are needed to fetch the data.
 A star schema optimizes performance by keeping queries simple and providing fast response time (all the information about each level is stored in one row); snowflake schemas normalize dimensions to eliminate redundancy, which results in more complex queries and reduced query performance.
 It is called a star schema because the diagram resembles a star; it is called a snowflake schema because the diagram resembles a snowflake.
Difference between a data mart and a data warehouse?

Data Mart vs Data Warehouse:
 A data mart is usually sponsored at the department level and developed with a specific issue or subject in mind; it is a data warehouse with a focused objective. A data warehouse is a "subject-oriented, integrated, time-variant, non-volatile collection of data in support of decision making".
 A data mart is used at a business division/department level; a data warehouse is used at an enterprise level.
 A data mart is a subset of data from a data warehouse, built for specific user groups; a data warehouse is an integrated consolidation of data from a variety of sources, specially designed to support strategic and tactical decision making.
 By providing decision makers with only a subset of data from the data warehouse, a data mart achieves privacy, performance, and clarity objectives; the main objective of a data warehouse is to provide an integrated environment and a coherent picture of the business at a point in time.

Differences between a connected lookup and an unconnected lookup?

Connected Lookup vs Unconnected Lookup:
 A connected lookup is connected to the pipeline and receives its input values from the pipeline; an unconnected lookup is not connected to the pipeline and receives input values from the result of a :LKP expression in another transformation, via arguments.
 A connected lookup instance is used at a single point in the pipeline; an unconnected lookup can be called more than once within the mapping.
 A connected lookup can return multiple columns from the same row; an unconnected lookup designates one return port (R) and returns one column from each row.
 A connected lookup can be configured to use a dynamic cache; an unconnected lookup cannot.
 A connected lookup passes multiple output values to another transformation (link lookup/output ports to another transformation); an unconnected lookup passes one output value to the transformation calling the :LKP expression, through the lookup/output/return port.
 A connected lookup uses a dynamic or static cache; an unconnected lookup uses a static cache.
 A connected lookup supports user-defined default values; an unconnected lookup does not.
 In a connected lookup, the cache includes the lookup source columns in the lookup condition and the lookup source columns that are output ports; in an unconnected lookup, the cache includes all lookup/output ports in the lookup condition and the lookup/return port.

What is the difference between a joiner and a lookup?

Joiner vs Lookup:
 In a joiner, on multiple matches it returns all matching records; in a lookup it returns the first record, the last record, any value, or an error value.
 In a joiner we cannot configure persistent cache, shared cache, uncached, or dynamic cache; in a lookup we can configure all of these.
 We cannot override the query in a joiner; in a lookup we can override the query to fetch the data from multiple tables.
 We can perform an outer join in a joiner transformation; we cannot perform an outer join in a lookup transformation.
 A joiner supports only the equality operator in the join condition; in a lookup we can use relational operators (<, >, <=, and so on).

What is the difference between a source qualifier and a lookup?

Source Qualifier vs Lookup:
 A source qualifier returns all the matching records; a lookup can be restricted to return the first value, the last value, or any value.
 A source qualifier has no concept of cache; a lookup is built around the cache concept.
 When both the source and the lookup table are in the same database, we can join them in the source qualifier; when they exist in different databases, we need to use a lookup.

Differences between a dynamic lookup and a static lookup

Dynamic Lookup Cache vs Static Lookup Cache:
 In a dynamic lookup, the cache gets refreshed as soon as a record is inserted, updated, or deleted in the lookup table; in a static lookup, the cache is not refreshed even if records are inserted or updated in the lookup table, and it refreshes only in the next session run.
 When we configure a lookup transformation to use a dynamic lookup cache, we can only use the equality operator in the lookup condition, and the NewLookupRow port is enabled automatically; the static cache is the default cache.
 The best example of where we need a dynamic cache: the first record and the last record in the source are the same entity but there is a change in the address, so the mapping has to insert the first record and update the last record in the target table. With a static lookup, the first record goes to the lookup, checks the cache based on the condition, finds no match, and returns a null value, so the router sends it to the insert flow; but this record is still not available in the cache, so when the last record comes to the lookup it also finds no match, returns null again, and goes to the insert flow through the router, although it is supposed to go to the update flow, because the cache did not get refreshed when the first record was inserted into the target table.

How to process multiple flat files into a single target table through Informatica if all the files have the same structure?
We can process all the flat files through one mapping and one session using a list file.
First, we need to create a list file using a Unix script that lists all the flat file names; the extension of the list file is .LST. This list file contains only the flat file names.
At the session level we set the source file directory to the list file path, the source file name to the list file name, and the source file type to Indirect.
How to populate the file name into the target while loading multiple files using the list file concept?
In Informatica 8.6, select the "Add Currently Processed Flat File Name Port" option on the Properties tab of the source definition after importing the source file definition in the Source Analyzer. It adds a new column called "Currently Processed File Name"; we can map this column to the target to populate the filename.
SCD Type-II Effective-Date Approach
We have one dimension in the current project called the resource dimension. Here we maintain history to keep track of SCD changes.
To maintain history in this slowly changing dimension (the resource dimension), we followed the SCD Type-II effective-date approach.
The resource dimension structure is eff-start-date, eff-end-date, surrogate key (s.k.), and the source columns. Whenever I insert into the dimension, I populate eff-start-date with the sysdate, eff-end-date with a future date, and the s.k. with a sequence number.
If the record is already present in the dimension but there is a change in the source data, then I update the previous record's eff-end-date with the sysdate and insert the new version as a new record with the source data.
Informatica design to implement the SCD Type-II effective-date approach?
Once we fetch the record from the source qualifier, we send it to a lookup to find out whether the record is present in the target or not, based on the source primary key column.
In the lookup transformation we override the lookup query to fetch only the active records from the dimension while building the cache.
Once we find the match in the lookup, we take the SCD columns and the s.k. column from the lookup into an expression transformation.
In the expression transformation I compare the source with the lookup return data:
If the source and target data are the same, I set a flag of 'S'.
If the source and target data are different, I set a flag of 'U'.
If the source data does not exist in the target, meaning the lookup returns a null value, I set a flag of 'I'.
Based on the flag values, in the router I route the data into the insert and update flows:
If flag = 'I' or 'U', I pass the record to the insert flow.
If flag = 'U', I also pass the record to the eff-end-date update flow.
When we do the insert, we pass the sequence value to the s.k.
Whenever we do the update, we update the eff-end-date column based on the s.k. value returned by the lookup.
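A hedged SQL sketch of the two database-side pieces of this design; the dimension and column names are illustrative, not from the actual project:

-- lookup override: cache only the active (current) dimension rows
SELECT resource_sk, resource_id, resource_name, eff_start_date, eff_end_date
FROM   dim_resource
WHERE  eff_end_date = TO_DATE('12/31/9999', 'MM/DD/YYYY');

-- update issued by the eff-end-date update flow (flag = 'U'): expire the previous version by surrogate key
UPDATE dim_resource
SET    eff_end_date = SYSDATE
WHERE  resource_sk  = :lkp_resource_sk;   -- s.k. returned by the lookup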
Complex Mapping
We have an order file requirement: every day the source system places a file, with a timestamp in its name, on the Informatica server.
We have to process the current day's file through Informatica.
The source file directory contains files older than 30 days, each with a timestamp.
For this requirement, if I hardcode the timestamp in the source file name, it will process the same file every day.
1) So what I did here is create $InputFilename as the session parameter for the source file name.
2) Then I use a parameter file to supply the value to the session variable ($InputFilename).
3) To update this parameter file, I created one more mapping.
4) This mapping updates the parameter file with the timestamp appended to the file name.
5) I make sure to run this parameter-file-update mapping before my actual mapping.
How to handle errors in Informatica?
1) We have a source with numerator and denominator values, and we need to calculate num/deno when populating the target.
2) If deno = 0 we should not load that record into the target table; we send those records to a flat file. After completion of the first session run, a shell script checks the file size.
3) If the file size is greater than zero, the script sends an email notification to the source system POC (point of contact) with the deno-zero record file and an appropriate email subject and body.
If the file size is zero, there are no error records in the flat file, and in this case the shell script does not send any email notification.
Or:
We are expecting a not-null value for one of the source columns; if it is null, it is an error record.
We can use the same approach for that error handling.
Worklet
A worklet is a set of reusable sessions. We cannot run a worklet without a workflow.
Suppose we want to run 2 workflows one after another:
If both exist in the same folder, we can create 2 worklets rather than 2 workflows and finally call these 2 worklets in one workflow; there we can set the dependency.
If the two workflows exist in different folders or repositories, then we cannot create a worklet. We can set the dependency between the two workflows using a shell script as one approach; the other approach is event wait and event raise.
In the shell script approach:
 As soon as the first workflow completes, we create a zero-byte file (indicator file).
 If the indicator file is available in the location, we run the second workflow.
 If the indicator file is not available, we wait for 5 minutes and check for the indicator again; we continue this loop, checking every 5 minutes, for up to 30 minutes.
 After 30 minutes, if the file still does not exist, we send an email notification.
Event wait and event raise approach:
With event wait, it waits indefinitely until the indicator file is available.
Why do we need a source qualifier?
Simply put, it performs the select statement; the select statement fetches the data in the form of rows. The source qualifier selects the data from the source table and identifies the records to read from the source.

The parameter file supplies values to session-level variables and mapping-level variables.
Variables are of two types: session-level variables and mapping-level variables.

Session-level variables are of four types:

$DBConnection_Source
$DBConnection_Target
$InputFile
$OutputFile

Mapping-level variables are of two types: variables and parameters.

What is the difference between mapping-level and session-level variables?
Mapping-level variables always start with $$; session-level variables start with $.
Flat File
A flat file is a collection of data in a file in a specific format. Informatica supports two types of flat files: delimited and fixed width.
For a delimited file we need to specify the separator (e.g., comma, pipe, etc.).
For a fixed-width file we need to know the format first, i.e., how many characters to read for each particular column.
For a delimited file it is also necessary to know the structure, because of the headers: if the file contains a header, then in the definition we need to skip the first row.
List file:
If we want to process multiple files with the same structure, we don't need multiple mappings and multiple sessions; we can use one mapping and one session with the list file option.
First we create the list file containing all the file names, then we use this list file in the main mapping's session.
Aggregator Transformation:
Transformation type: Active and Connected
The Aggregator transformation performs aggregate calculations, such as averages and sums. The Aggregator
transformation is unlike the Expression transformation, in that you use the Aggregator transformation to perform
calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only.
Components of the Aggregator Transformation:
The Aggregator is an active transformation, changing the number of rows in the pipeline. The Aggregator transformation
has the following components and options
Aggregate cache: The Integration Service stores data in the aggregate cache until it completes aggregate calculations. It
stores group values in an index cache and row data in the data cache.
Aggregate expression: Enter an expression in an output port. The expression can include non-aggregate expressions and conditional clauses.
Group by port: Indicate how to create groups. The port can be any input, input/output, output, or variable port. When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified.
Sorted input: Select this option to improve session performance. To use sorted input, you must pass data to the
Aggregator transformation sorted by group by port, in ascending or descending order.
Aggregate Expressions:
The Designer allows aggregate expressions only in the Aggregator transformation. An aggregate expression can include
conditional clauses and non-aggregate functions. It can also include one aggregate function nested within another
aggregate function, such as:
MAX(COUNT(ITEM))
The result of an aggregate expression varies depending on the group by ports used in the transformation.
Aggregate Functions
Use the following aggregate functions within an Aggregator transformation. You can nest one aggregate function within another aggregate function. The transformation language includes the following aggregate functions:
AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
When you use any of these functions, you must use them in an expression within an Aggregator transformation.

Tips: Use sorted input to decrease the use of aggregate caches.
Sorted input reduces the amount of data cached during the session and improves session performance. Use this option
with the Sorter transformation to pass sorted data to the Aggregator transformation.
Limit connected input/output or output ports.
Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation
stores in the data cache.
Filter the data before aggregating it.
If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce
unnecessary aggregation.
Normalizer Transformation: Transformation type: Active/Connected
The Normalizer transformation receives a row that contains multiple-occurring columns and returns a row for each
instance of the multiple-occurring data.
The Normalizer transformation parses multiple-occurring columns from COBOL sources, relational tables, or other sources.
It can process multiple record types from a COBOL source that contains a REDEFINES clause.
The Normalizer transformation generates a key for each source row. The Integration Service increments the generated key
sequence number each time it processes a source row. When the source row contains a multiple-occurring column or a
multiple-occurring group of columns, the Normalizer transformation returns a row for each occurrence. Each row contains
the same generated key value.
SQL Transformation: Transformation type: Active/Passive/Connected
The SQL transformation processes SQL queries midstream in a pipeline. You can insert, delete, update, and retrieve rows
from a database. You can pass the database connection information to the SQL transformation as input data at run time.
The transformation processes external SQL scripts or SQL queries that you create in an SQL editor. The SQL transformation
processes the query and returns rows and database errors.
For example, you might need to create database tables before adding new transactions. You can create an SQL
transformation to create the tables in a workflow. The SQL transformation returns database errors in an output port. You
can configure another workflow to run if the SQL transformation returns no errors.
When you create an SQL transformation, you configure the following options:
Mode. The SQL transformation runs in one of the following modes:
Script mode. The SQL transformation runs ANSI SQL scripts that are externally located. You pass a script name to the
transformation with each input row. The SQL transformation outputs one row for each input row.
Query mode. The SQL transformation executes a query that you define in a query editor. You can pass strings or parameters
to the query to define dynamic queries or change the selection parameters. You can output multiple rows when the query
has a SELECT statement.
Database type. The type of database the SQL transformation connects to.
Connection type. Pass database connection information to the SQL transformation or use a connection object.
Script Mode
An SQL transformation running in script mode runs SQL scripts from text files. You pass each script file name from the source to the SQL transformation ScriptName port. The script file name contains the complete path to the script file.
When you configure the transformation to run in script mode, you create a passive transformation. The transformation returns one row for each input row. The output row contains the results of the query and any database error.
When the SQL transformation runs in script mode, the query statement and query data do not change. When you need to run different queries in script mode, you pass the scripts in the source data. Use script mode to run data definition queries such as creating or dropping tables.

When you configure an SQL transformation to run in script mode, the Designer adds the Script Name input port to the
transformation
An SQL transformation configured for script mode has the following default ports:
Port          Type     Description
ScriptName    Input    Receives the name of the script to execute for the current row.
ScriptResult  Output   Returns PASSED if the script execution succeeds for the row; otherwise contains FAILED.
ScriptError   Output   Returns errors that occur when a script fails for a row.
Script Mode Rules and Guidelines
 Use the following rules and guidelines for an SQL transformation that runs in script mode:
 You can use a static or dynamic database connection with script mode.
 To include multiple query statements in a script, you can separate them with a semicolon.
 You can use mapping variables or parameters in the script file name
 The script code page defaults to the locale of the operating system. You can change the locale of the script.
 You cannot use scripting languages such as Oracle PL/SQL or Microsoft/Sybase T-SQL in the script.
 You cannot use nested scripts where the SQL script calls another SQL script
 A script cannot accept run-time arguments.
 The script file must be accessible by the Integration Service. The Integration Service must have read permissions on
the directory that contains the script. If the Integration Service uses operating system profiles, the operating
system user of the operating system profile must have read permissions on the directory that contains the script.
 The Integration Service ignores the output of any SELECT statement you include in the SQL script. The SQL transformation in script mode does not output more than one row of data for each input row.
Query Mode:
When an SQL transformation runs in query mode, it executes an SQL query that you define in the transformation. You pass
strings or parameters to the query from the transformation input ports to change the query statement or the query data.
When you configure the SQL transformation to run in query mode, you create an active transformation. The
transformation can return multiple rows for each input row.
Create queries in the SQL transformation SQL Editor. To create a query, type the query statement in the SQL Editor main
window. The SQL Editor provides a list of the transformation ports that you can reference in the query.
You can create the following types of SQL queries in the SQL transformation:
Static SQL query. The query statement does not change, but you can use query parameters to change the data. The
Integration Service prepares the query once and runs the query for all input rows.
 Dynamic SQL query. You can change the query statements and the data. The Integration Service prepares a
query for each input row.
 When you create a static query, the Integration Service prepares the SQL procedure once and executes it for each row. When you create a dynamic query, the Integration Service prepares the SQL for each input row. You can optimize performance by creating static queries.
 Query Mode Rules and Guidelines
 Use the following rules and guidelines when you configure the SQL transformation to run in query mode:
 The number and the order of the output ports must match the number and order of the fields in the query SELECT
clause.

 The native datatype of an output port in the transformation must match the datatype of the corresponding
column in the database. The Integration Service generates a row error when the datatypes do not match.
 When the SQL query contains an INSERT, UPDATE, or DELETE clause, the transformation returns data to the SQL Error port, the pass-through ports, and the Num Rows Affected port when it is enabled. If you add output ports, the ports receive NULL data values.
 When the SQL query contains a SELECT statement and the transformation has a pass-through port, the transformation returns data to the pass-through port whether or not the query returns database data. The SQL transformation returns a row with NULL data in the output ports when the query returns no rows.
 You cannot add the "_output" suffix to output port names that you create.
 You cannot use the pass-through port to return data from a SELECT query.
 When the number of output ports is more than the number of columns in the SELECT clause, the extra ports receive a NULL value.
 When the number of output ports is less than the number of columns in the SELECT clause, the Integration Service generates a row error.
 You can use string substitution instead of parameter binding in a query. However, the input ports must be string datatypes.
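A hedged example of a static query with parameter binding in query mode; ?EMPNO? refers to an input port of the SQL transformation, and the port and table names are illustrative:

SELECT ename, sal
FROM   emp
WHERE  empno = ?EMPNO?

The output ports of the transformation must then match ename and sal in number, order, and datatype.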

Java Transformation Overview


Transformation type: Active/Passive/Connected
The Java transformation provides a simple native programming interface to define transformation functionality with the
Java programming language. You can use the Java transformation to quickly define simple or moderately complex
transformation functionality without advanced knowledge of the Java programming language or an external Java
development environment.
For example, you can define transformation logic to loop through input rows and generate multiple output rows based on
a specific condition. You can also use expressions, user-defined functions, unconnected transformations, and mapping
variables in the Java code.
Transaction Control Transformation: Transformation type : Active/Connected
PowerCenter lets you control commit and roll back transactions based on a set of rows that pass through a Transaction
Control transformation. A transaction is the set of rows bound by commit or roll back rows. You can define a transaction
based on a varying number of input rows. You might want to define transactions based on a group of rows ordered on a
common key, such as employee ID or order entry date.
In PowerCenter, you define transaction control at the following levels:
Within a mapping. Within a mapping, you use the Transaction Control transformation to define a transaction. You define transactions using an expression in a Transaction Control transformation. Based on the return value of the expression, you can choose to commit, roll back, or continue without any transaction changes.
Within a session. When you configure a session, you configure it for user-defined commit. You can choose to commit or
roll back a transaction if the Integration Service fails to transform or write any row to the target.
When you run the session, the Integration Service evaluates the expression for each row that enters the transformation.
When it evaluates a commit row, it commits all rows in the transaction to the target or targets. When the Integration
Service evaluates a roll back row, it rolls back all rows in the transaction from the target or targets.
If the mapping has a flat file target you can generate an output file each time the Integration Service starts a new
transaction. You can dynamically name each target flat file.

Level-1 (Mappings)
1) Difference between join and lookup and Source transformation?
2) What are the different types of lookup in CDI (IICS) ?
3) Difference between Connected and Unconnected lookup?
4) What are the active and passive transformations?
5) What are the different methods to perform remove duplicates in CDI?
6) What is indirect file load and how can we implement that in IICS?
7) How will you read Source JSON file in IICS?
8) Describe Rank, Aggregator, Normalizer transformation?
9) IIF vs Decode function in the expression?
10) Router vs filter in IICS?
11) How to reset sequence generator when we migrate from DEV to QA?
12) Union vs File list??
13) What is SQL override and Lookup Override?
14) How to execute UNIX/Power shell/python commands in IICS Mapping
15) What is the biggest mapping you handled as a developer? (SCD TYPE 2)
16) Data cache and index cache in Join transformer
17) Hierarchy Parser vs Structure Parser
18) Types of parameters in mapping (input and INOUT parameters) and its usage
19) SUBSTR, INSTR, ERROR, LKP, DATE functions
20) What is gcid in the normalizer
21) SCD Type 1,2,3,
22) Mapping level Performance tuning
23) Web service consumer transformation
24) fatal and nonfatal errors
25) Exception handling and user-defined errors
26) Types of caches
1. Data Cache
2. Index Cache
3. Static cache
4. Dynamic cache
5. Persistent Cache
6. Re cache (Refill Cache)
7. Shared Cache
27. How to call Stored Procedure in IICS?
28. How to call unconnected lookup object in IICS?
29. How to return multiple values from unconnected lookup?
30. What is incremental load, what are the different approaches to implement that
31. Difference between Upsert and data driven
32. In join transformation, which object will be master and which object will be details, based on
what metrics we decide that?
33. How to convert Rows into columns in IICS?
34. Difference between REST V2 connection and Web service consumer
35. How to create business service and how to use in IICS
36. How to pass multiple rows to input request for web service call
37. How to decrypt and encrypt a PGP-encrypted source flat file?
38. How to copy/move/remove files from one folder to another folder using file processor
connection?
39. Can we move the file to SFTP location using IICS connections?
40. Can we use command in source instead of file list

Level-2 (Mapping Tasks, Synchronization, Replication, Mass Ingestion)
1) How you implement performance tuning in the Informatica mapping Tasks
2) Error Handling mechanism in data integration
3) What is the mapping task?
4) How to schedule the Mapping
5) What is the blackout period in the schedule?
6) What is parameter file and how you use in the mapping
7) How to enable verbose mode in Informatica data integration
8) What is cross-schema pushdown optimization
9) Tell me below advanced session properties:
-> Rollback transactions on error
-> Commit on the end of file
-> Recovery Strategy
-> DTM process
-> Incremental Aggregation
-> Pushdown Optimization
-> Session Retry on deadlock
-> Stop on error
10) Difference between Linear task flow and Task flow
11) Limitations of Data synchronization task
12) Use of Replication task
13) What is incremental load, full load, and initial load
14) How to perform upsert in an Informatica mapping and the constraints required to implement it
15) How to run Pre and Post SQL commands, Pre and Post Processing commands
16) Difference between Synchronization vs Replication Task
17) Different types of mass ingestion tasks
18) What is Data Masking
19) How to configure mapplets in IICS
20) Use of control table in ETL
21) Explain the below components in task flow:
22) How to call CAI process in DI job
23) How to read parameter file values into MCT
24) How to execute python /Unix /PowerShell script using Command task (windows Secure agent)
25) How to execute multiple mapping task instances simultaneously
26) What is the use of Stop on Error property in MCT
27) What is the use of email notification option in MCT
28) How to use fixed width delimited file in source
29) How to create Business service and its use case
30) What is the use of hierarchical schema, can we do create without schema
31) How to send an email using notification step in task flow
32) Can we send output response to Task flow
33) How to send variables data to mapping columns in task flow
34) How to get values from mapping columns to task flow variables?
35) How to implement custom error handling in task flow
36) How to do audit logging in IICS with data task output response variables?
37) How to Trigger the task flow based on file event
38) How to create file listener and how to trigger task flow
39) When do we use a file event and when do we use a schedule?
40) How can we trigger IICS task flow using third party scheduler?
41) What is include dependency check in assets export
42) Best practices for IICS code migrations (export and import)
43) How to implement versioning in IICS
44) What are asset-level permissions and how to use them (ACL)
45) Different types of semi-structured data and how to read in IICS

Informatica Scenarios:
47) Convert single row from source to three rows in target?
48) Split the non-key columns to separate tables with key column in both?
49) Separating duplicate and non-duplicate rows to separate tables
50) Retrieving first and last record from a table/file
51) Remove footer from your file
52) Remove header from your file
53) Sending first half record to target
54) Sending second half record to target
55) Sending alternate record to target
56) Separate the original records in target
57) Separate rows on group basis
58) Get top 5 records to target without using rank?
59) Segregating rows on group count basis
60) Extracting every nth row
61) Sending records to target tables in cyclic order
62) Concatenation of duplicate value by comma separation
63) Target table rows, with each row as sum of all previous rows from source table.
64) Produce files as target with dynamic names
65) Validating all mapping in repository
66) Using mapping parameter and variable in mapping
67) Removing '$' symbol from salary column
68) Currency convertor
69) sending data one after another to three tables in cyclic order
70) Converting '$' symbol to 'RS.' in sal column
71) Insert and reject records using update strategy
72) Count the no. of vowels present in emp name column
73) Remove special characters from emp no
74) Convert Numeric Value to Date Format
75) Check the Hire-Date is Date or Not
76) Date Difference in Hours
77) Sending to target with days difference more than 2 days

Informatica Interview Questions:
1) What is bottleneck in Informatica and how to check which bottleneck is present?
2) How to remove bottleneck in Informatica?
3) Different methods of optimization in Informatica?
4) Difference between joiner & look up?
5) What are different types of joins in Joiner in Informatica?
6) Is look up active or passive? Why?
7) Different caches in lookup?
8) Difference between connected & unconnected lookup?
9) Difference between dynamic cache & static cache?
10) Difference between SQL override and lookup override?
11) Explain SCD type 2?
12) What is the difference in functionality of SCD type1 & type 2?
13) Difference between Filter & router? Which gives better performance?
14) Why is router called active transformation?
15) Why is Union called active transformation?
16) How to improve performance of lookup?
17) How to improve performance of Joiner?
18) How to improve performance of Aggregator?
19) What would happen if I forgot to select group by column in Aggregator? What will be the output?
20) Can I generate a repeatable sequence in Sequence Generator?
21) What will be the output if I don't connect the Nextval column from the Sequence Generator and connect only the Currval column?
22) What are different override options available in Informatica?
23) Can I join heterogeneous databases in source qualifier?
24) How can I make my records distinct in mapping? Explain different ways?
25) What are different tracing levels in Informatica?
26) What is transaction boundary?
27) What are different criteria to identify a transformation is active or passive?
28) Explain partitioning? Explain the different types of partitioning.
29) What is pushdown optimization?
30) What is Indirect file loading? Explain.
31) Explain Transactional Control transformation?
32) Can we find Dense rank using Rank transformation? If no then how can we find Dense rank in
Informatica?
33) What is the difference between STOP & ABORT?
34) How to load file name in the target table?
35) Can we create a mapping without Source Qualifier?
36) What are the different ways to set insert type (insert or update or delete) in mapping level &
session level?
37) I have used Update strategy in mapping, but my records are only getting inserted not getting
updated or rejected. What may be the reasons for this?
38) What is Incremental load & how can we achieve this? Explain.
39) How can I join different heterogeneous sources?
40) What is mapplet? What is worklet?
41) Difference between workflow and worklet?
42) Between lookup & joiner which is better?
43) What are the different ways by which we can generate sequence of numbers in Informatica?
Explain.
44) What is the most effective way to do distinct of records in Informatica?
45) Difference between mapping Variable & variable in expression transformation?
46) What is MD5 function? Explain SCD type 2 using MD5 function?
47) I want a session to run when the previous session has completed N runs. How to implement this in
Informatica?
48) In filter transformation instead of giving condition I gave a random number by mistake. What will
be the output? Will the mapping run?
SQL Interview Questions:

1) Difference between Rank & Dense Rank?
2) Difference between Union & Union All?
3) What are different constraints available in SQL?
4) What is Natural Key, Composite Key?
5) Difference between Natural Key & Primary Key?
6) What are different types of sub queries?
7) What is correlated sub query?
8) What is the difference between Aggregate function & Analytic function?
9) Difference between Decode & NVL?
10) What is execution plan?
11) Difference between Group by & Partition by?
12) What is Pivot & Unpivot function?
13) How to implement SCD Type 2 using SQL?
14) What are the different types of Joins in SQL?
15) Explain functionality of different Joins in SQL?
16) What are different Views in SQL?
17) What is materialized view?
18) What are Indexes in SQL?
19) What is the difference between Clustered & non-clustered Index?
20) What is the difference between Primary Key & Unique Key?
21) Between integer & string which gives better performance?
22) Difference between char & varchar?
23) Difference between varchar & varchar2?

Data warehouse Interview Questions:

24) What are Fact & Dimension tables?
25) What are different types of Dimension table?
26) Does a Dimension table contain Primary keys or Foreign keys?
27) Does a Fact table contain Primary keys or Foreign keys?
28) What are different types of Fact?
29) What is Fact-less Fact table?
30) What is Junk Dimension?
31) Between Fact & Dimension tables which one is loaded first?
32) Difference between data mart & data warehouse?
33) What are different schemas in data warehouse? Explain with diagram.
34) Which schema is more complex, Star or Snowflake?
35) What is Galaxy schema?
36) Difference between OLTP & OLAP?
37) What is the role of Fact table in a star or snowflake schema?
Informatica (Q&A)
1.What is the difference between Informatica PowerCenter and Informatica Cloud?
Informatica Intelligent Cloud Services is a cloud-based integration platform (iPaaS). IICS helps you
integrate and synchronize data and applications residing in your on-premises and cloud environments.
It provides similar functionality to PowerCenter in a better way and can be accessed via the internet.
Hence in IICS, there is no need to install any client applications on a personal computer or server; all
the supported applications can be accessed from the browser and the tasks can be developed through
the browser UI. In PowerCenter, the client applications need to be installed on your machine.
2.What is a Runtime environment?
A Runtime environment is the execution platform that runs data integration or application integration
tasks. You must have at least one runtime environment set up to run tasks in your organization.
Basically, it is the server upon which your data gets staged while processing. You can choose either to
process via the Informatica servers or your local servers which stay behind your firewall. There are two
types of runtime environments: Informatica Cloud Hosted Agent and Informatica Cloud Secure Agent.
3.What is a Synchronization task?
A Synchronization task helps you synchronize data between a source and target. A Synchronization task
can be built easily from the IICS UI by selecting the source and target without the use of any
transformations as in mappings. You can also use expressions to transform the data according to your
business logic, use data filters to filter data before writing it to targets, and look up data from
other objects to fetch a value. Anyone without PowerCenter mapping and transformation knowledge
can easily build Synchronization tasks, as the UI guides you step by step.
4.What is a Replication task?
A Replication task allows you to replicate data from a database table or an on-premises application to a
desired target. You can choose to replicate all the source rows or only the rows that changed since the
last run of the task using the built-in incremental processing mechanism of the Replication task.
You can choose from three different types of operations when you replicate data to a target:
→ Incremental load after initial full load
→ Incremental load after initial partial load
→ Full load each run
5.What is the difference between a Synchronization task and Replication task?
In a Synchronization task you must have an existing target to integrate data. However, a Replication task
can create a target for you. A Replication task can replicate an entire schema and all the tables in it at a
time, which is not possible in a Synchronization task. A Replication task comes with a built-in incremental
processing mechanism, whereas in a Synchronization task the user needs to handle incremental data processing.
6.Where does the metadata gets stored in Informatica Cloud (IICS)?
All the metadata gets stored in the Cloud server/repository. Unlike PowerCenter, all the information in
Informatica Cloud is stored on a server maintained by Informatica and the user does not have
access to the repository database. Hence, it is not possible to run SQL queries on the metadata tables to
retrieve information as in Informatica PowerCenter.
7.What metadata information gets stored in the Informatica Cloud (IICS) repository?
Source and Target Metadata: Metadata information of each source and target including the field
names, datatype, precision, scale, and other properties.
Connection Information: The connection information to connect specific source and target systems in
an encrypted format.
Mappings: All the Data Integration tasks built, their dependencies and rules are stored.
Schedules: The schedules created to run the tasks built in IICS are stored.
Logging and Monitoring information: The results of all the jobs are stored.
8.What is a Mapping Configuration task?
A Mapping Configuration Task or Mapping Task is analogous to a session in Informatica PowerCenter.
You can define parameters associated with the mapping, define pre- and post-processing
commands, add advanced session properties to boost the performance, and configure the task to run on
a schedule.
Page 22 of 21
9.What is a task flow in Informatica Cloud?
A Task flow is analogous to a workflow in Informatica PowerCenter. A task flow controls the execution
sequence of a mapping configuration task, or a synchronization task based on the output of the
previous task.
10.What is the difference between a Task flow and Linear Task flow?
A Linear task flow runs the tasks one by one serially in the order defined in the task. If a task defined in a
Linear task flow fails, you need to restart the entire task flow. A task flow allows you to run
tasks in parallel and provides advanced decision-making capabilities.
11.Can we run Powercenter jobs in Informatica cloud?
Yes. There is a Powercenter task available in Informatica Cloud wherein the user must upload the XML file
exported from Powercenter in Data Integration and run the job as a Powercenter task. You can update
an existing PowerCenter task to use a different PowerCenter XML file but cannot make changes to an
imported XML. When you upload a new PowerCenter XML file to an existing PowerCenter task, the
PowerCenter task deletes the old XML file and updates the PowerCenter task definition based on new
XML file content.
12.How does an update strategy transformation work in Informatica Cloud?
There is no Update Strategy transformation available in Informatica Cloud. In the target transformation
in a mapping, Informatica Cloud Data Integration provides the option for the action to be performed on
the target – Insert, Update, Upsert, Delete and Data Driven.
13.What is the difference between a Union transformation in Informatica Cloud vs Informatica
Powercenter?
In earlier versions of Informatica Cloud, the Union transformation allowed only two groups to be defined
in it. Hence if three different source groups needed to be mapped to the target, the user had to use two
Union transformations: the output of the first two groups to Union1, and the output of Union1 and
group3 to Union2.
In the latest version, Informatica Cloud is supporting multiple groups. So, all the input groups can be
handled in a single Union transformation.
14.What is Dynamic Linking?
Informatica Cloud Data Integration allows you to create new target files/tables at runtime. This
feature can only be used in mappings. In the target, choose the Create New at Runtime option.
The user can choose a static file name, in which case the file is replaced every time the mapping runs
with the same name. The user can also choose to create a dynamic file name so that every time the
mapping runs, a file is created with a new name.
15.In what format can you export a task present in Informatica Cloud?
Informatica Cloud Data Integration supports exporting the tasks as a zip file where the metadata gets
stored in the JSON format inside the zip file. However, you can also download an XML version of the
tasks, which can be imported as workflows in Powercenter. Bulk export of tasks in XML format is not
supported at a time, whereas you can export multiple tasks in the form of JSON in a single export
zip file.
16.How do you read JSON Source file in IICS?
JSON files are read using the Hierarchy Parser transformation present in IICS. The user needs to define
a Hierarchical Schema that defines the expected hierarchy of output data in order to read a JSON file
through the Hierarchy Parser. The Hierarchy Parser transformation can also be used to read XML files in
Informatica Cloud Data Integration.
17.What is a Hierarchical Schema in IICS?
A Hierarchical Schema is a component where the user can upload an XML or JSON sample file that defines
the hierarchy of output data. The Hierarchy Parser transformation converts input based on the
Hierarchical Schema that is associated with the transformation.
18.What is Indirect File loading and how to perform Indirect loading in IICS?
The processing of multiple source files having same structure and properties in a sequential manner in
a mapping is called Indirect File Loading. Indirect loading in IICS can be performed by selecting the File
List under Source Type property of a source transformation.
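For illustration, a file list is a plain text file that simply lists the full paths of the source files to be
processed, one per line (the paths and file names below are hypothetical):
C:\SourceFiles\sales_jan.csv
C:\SourceFiles\sales_feb.csv
C:\SourceFiles\sales_mar.csv
The source transformation then points to this list file, and the mapping reads each listed file one after another.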
19.What are the parameter types available in the Informatica Cloud?
IICS supports two types of parameters.
Input Parameter: Like a parameter in Powercenter. The parameter value remains constant as the value
defined in MCT or a Parameter file.
In-Out Parameter: Like a variable in Powercenter. The In-out parameter can be a constant or change
values within a single task run.
20.How many Status states are available in IICS monitor?
The various status states available in IICS are
Starting: Indicates that the task is starting.
Queued: There is a predefined number set which controls how many tasks can run together in your IICS
org. If the value is set to two and if two jobs are already running, the third task you trigger enters into
Queued state.
Running: The job enters the Running status from Queued status once the task is triggered completely.
Success: The task completed successfully without any issues.
Warning: The task completed with some rejects.
Failed: The task failed due to some issue.
Stopped: The parent job has stopped running, so the subtask cannot start. Applies to subtasks of
replication task instances.
Aborted: The job was aborted. Applies to file ingestion task instances.
Suspended: The job is paused. Applies to taskflow instances.
21.When Source is parameterized in a Cloud mapping, the source transformation fields would be
empty. Then how do the fields get propagated from the source to the downstream transformations in
source parameterized mappings?
To propagate the fields to downstream transformations when source is parameterized, initially create
the mapping with actual source table. In the downstream transformation after source, select the Field
Selection Criteria as Named Fields and include all the source fields in the Incoming Fields section of the
transformation. Then change the source object to a parameter. This way the source fields are still
retained in the downstream transformation even when the fields are not available in source
transformation after the source is parameterized.
22.To include all incoming fields from an upstream transformation except those with dates, what
should you do?
Configure two field rules in a transformation. First, use the All-Fields rule to include all fields. Then,
create a Fields by Datatypes rule to exclude fields by data type and select Date/Time as the data type
to exclude.
23.What are Preprocessing and postprocessing commands in IICS?
The Preprocessing and postprocessing commands are available in the Schedule tab of tasks to perform
additional jobs using SQL commands or Operating system commands. The task runs preprocessing
commands before it reads the source. It runs postprocessing commands after it writes to the target.
The task fails if any command in the preprocessing or postprocessing scripts fails.
24.What are Field Name conflicts in IICS and how can they be resolved?
When there are fields with same name coming from different transformations into a downstream
transformation, the cloud mapping designer generates a Field Name Conflict error. You can either
resolve the conflict by renaming the fields in the upstream transformation itself or you can create a
field rule in downstream transformation to Bulk Rename fields by adding a prefix or a suffix to all
incoming fields.
25.What system variables are available in IICS to perform Incremental Loading?
IICS provides access to the following system variables, which can be used as data filter variables to filter
newly inserted or updated records.
$LastRunTime returns the last time when the task ran successfully.
$LastRunDate returns only the last date on which the task ran successfully. The values of $LastRunDate
and $LastRunTime get stored in the Informatica Cloud repository/server and it is not possible to override
the values of these parameters.
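As a rough illustration, such a variable is typically used in a data filter condition like the one below (the
column name LAST_MODIFIED_DATE is only an assumed example):
LAST_MODIFIED_DATE > $LastRunTime
With this filter, each run picks up only the rows inserted or updated after the previous successful run.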
26.What is the difference between the connected and unconnected sequence generator
transformation in Informatica Cloud Data Integration?
Sequence generator can be used in two different ways in Informatica cloud. One with Incoming fields
disabled and the other with incoming fields not disabled.
The difference between the sequence generator with incoming fields enabled and disabled is, when
NEXTVAL field is mapped to multiple transformations,
→ Sequence generator with incoming fields not disabled will generate same sequence of numbers for
each downstream transformation.
→ Sequence generator with incoming fields disabled will generate Unique sequence of numbers for
each downstream transformation.
27.Explain Partitioning in Informatica Cloud Data Integration.
Partitioning is nothing but enabling the parallel processing of the data through separate pipelines. With
the Partitioning enabled, you can select the number of partitions for the mapping. The DTM process
then creates a reader thread, transformation thread and writer thread for each partition allowing the
data to be processed concurrently, thereby reducing the execution time of the task. Partitions can be
enabled by configuring the Source transformation in mapping designer.
There are two major partitioning methods supported in Informatica Cloud Data Integration.
1. Key Range Partitioning distributes the data into multiple partitions based on the partitioning key
selected and the range of values defined for it. You must select a field as the partitioning key and define
the start and end ranges of the values.
2. Fixed Partitioning can be enabled for sources which are not relational or do not support key range
partitioning. You must select the number of partitions by passing a value.
28.How to pass data from one mapping to other in Informatica Cloud Data Integration?
The data can be passed from one Mapping task to another in Informatica Cloud Data Integration
through a Task flow using parameters. The Mapping Task which passes the data should have an In-Out
Parameter defined using SetVariable functions. The Mapping Task which receives the data should
either have an Input parameter or an In-Out Parameter defined in the mapping to read the data passed
from upstream task.
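A minimal sketch of this approach, assuming an In-Out parameter named $$LastProcessedDate and
illustrative field names:
SETVARIABLE($$LastProcessedDate, MAX_UPDATED_DATE)   -- output field expression in the first mapping
UPDATED_DATE > $$LastProcessedDate                   -- filter in the downstream mapping that reads the value
The first mapping task updates the parameter value at the end of its run, and the task flow makes it
available to the next mapping task.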
*** SQL Interview Questions ***
1.What is SQL?
SQL stands for Structured Query Language; it is also called SEQUEL.
2.List out the sublanguages of SQL?
There are 5 sublanguages: DDL, DML, DRL/DQL, TCL, and DCL.
3.What is the difference between char and varchar2?
CHAR is fixed-length, whereas VARCHAR2 is variable-length.
4.What is projection?
Selecting specific columns is projection.
5.How can we filter the rows in a table using the WHERE, GROUP BY, HAVING and ORDER BY clauses?
Select deptno, sum(sal) from emp where ename <> ‘KING’ Group By deptno Having sum(sal) > 9000
Order By sum(sal) DESC;
6.What is a column alias?
Providing an alternate name to a column; it is not permanent.
7.Can we perform an arithmetic operation by using dual?
Yes: Select 10 + 20 Result from dual;
8.What is the dual table?
Dual is a dummy table with one column and one row, containing the value 'X'. It is used to evaluate
expressions and functions.
9.Write a query to display the current date along with HH:MI:SS?
Select To_Char(sysdate,'DD-MON-YYYY HH:MI:SS') from dual;
10.Write a query to see the current date?
Select sysdate from dual;
11.Which operator is used to accept values from the user?
The substitution variable operator (&) is used to accept values from the user at run time.
12.How can you see all the tables which are in the database?
Select * from TAB;
13.Which command is used to remove a table from the database?
The DROP command is used to remove a table.
14.What is the difference between the delete and truncate commands?
DELETE removes rows one by one and can be rolled back; TRUNCATE removes all rows at once and
cannot be rolled back.
15.Which operator is used to retrieve the rows based on null values?
IS NULL
16.In how many ways can we remove all the rows from a table?
There are two ways: DELETE and TRUNCATE.
17.How can we create copy of a table?
Create table emp1 AS select * from emp;
Create table emp2 AS select * from emp where deptno = 30;
18.Write a query to display the no. of rows in a table?
By using count(*): Select count(*) from emp;
19.What is the difference between count(*) and count(expr)?
count(*) counts all the rows of the table; count(expr) counts only the rows where expr is not null.
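A quick example on the emp table:
Select count(*), count(comm) from emp;
Here count(*) returns the total number of rows, while count(comm) returns only the number of rows
where comm is not null.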
20.What is the difference between group functions and scalar functions?
Group functions act on a group of rows and return one result per group; scalar functions act on a single
row and return one result per row.
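For example, using the emp table:
Select deptno, max(sal), sum(sal) from emp group by deptno;   -- group functions, one row per group
Select ename, upper(ename), length(ename) from emp;           -- scalar functions, one result per row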
21.What is the use of the Group By clause?
The Group By clause divides the rows into groups.
22.How can we filter the groups formed by the Group By clause?
The Having clause is used to filter the grouped data.
23.Which clause is used to arrange the rows in the table?
The Order By clause is used to arrange the rows.
24.Which clause should be the last clause of the query?
The Order By clause.
*** Any operation performed on null will result in a null value.
25.What is a TOAD?
Tool for Oracle Application Development.
26.What is the need for integrity constraints?
Constraints are rules which are applied on tables to maintain data integrity.
27.List out the types of constraints?
There are 5 types: NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK.
28.At how many levels can constraints be created?
There are two levels, i.e. column level and table level.
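A small sketch showing both levels (the table name emp_demo is illustrative, and the foreign key assumes
the usual dept table exists):
Create table emp_demo (
  empno  number constraint pk_emp_demo primary key,   -- column level
  ename  varchar2(20) not null,                       -- column level
  deptno number,
  constraint fk_emp_demo_dept foreign key (deptno) references dept(deptno)   -- table level
);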
29.Which constraint can be created only at column level?
The NOT NULL constraint.
30.Does the NOT NULL constraint accept duplicate values?
Yes.
31.Which constraint is used to uniquely identify every row in the table?
Primary key.
32.What is composite primary key?
When primary key is applied on multiple columns it is called composite primary key. Composite
primary key can be applied only at table level.
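For example (the table and column names are illustrative):
Create table order_items (
  order_id number,
  item_no  number,
  qty      number,
  constraint pk_order_items primary key (order_id, item_no)
);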
33.Can a table have two primary keys?
No, it is not possible.
34.What is a foreign key constraint? Explain.
A foreign key establishes a parent table and child table relationship.
35.Can we establish a parent & child relationship without having a constraint in the parent table?
No.
36.Can you explain the changes related to foreign key with ON DELETE CASCADE and ON DELETE SET NULL
constraints?
The foreign key column in the child table will only accept values which are present in the primary key or
unique column of the parent table.
With ON DELETE CASCADE, when we delete rows from the parent table, the corresponding child table rows
are deleted automatically.
With ON DELETE SET NULL, when we delete a row from the parent table, the corresponding foreign key
values in the child table are changed to null.
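A minimal sketch of both options (table names are illustrative):
Create table dept_demo (deptno number primary key, dname varchar2(20));
Create table emp_cascade (empno number primary key, deptno number
    references dept_demo(deptno) on delete cascade);
Create table emp_setnull (empno number primary key, deptno number
    references dept_demo(deptno) on delete set null);
Deleting a row from dept_demo removes the matching emp_cascade rows automatically, while in
emp_setnull the deptno of the matching rows is set to null.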
37.Does every constraint have a constraint name?
Yes; if a name is not provided, the database generates one.
38.How can you know the constraint name and constraint type applied on a table?
By using the USER_CONSTRAINTS data dictionary view.
39.Is there any difference when a constraint is created at column level or table level?
No difference.
40.Can you provide a user-defined constraint name?
Yes.
41.What are data dictionary tables?
Predefined tables maintained by the database that store metadata about objects, such as USER_CONSTRAINTS.
42.What is the need for a join?
To retrieve data from multiple tables.
43.What is EQUI join?
When tables are joined based on a common column it is called an EQUI join.
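For example, joining emp and dept on the common deptno column:
Select e.ename, d.dname from emp e, dept d where e.deptno = d.deptno;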
44.How many conditions are required to join 'n' tables?
We need n-1 join conditions.
45.How can we display matching as well as non-matching rows?
By using outer joins.
46.What is outer join operator?
(+)
47.What is a Cartesian product?
All possible combinations of rows between the joined tables (every row of one table matched with every
row of the other).
48.What is the difference between union and union all?
The UNION set operator displays only distinct values, whereas UNION ALL displays all values, including
duplicates.
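For example:
Select deptno from emp union Select deptno from dept;       -- distinct values only
Select deptno from emp union all Select deptno from dept;   -- all values, including duplicates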
49.What are pseudo columns?
Pseudo columns are columns that are not physically stored in the table; for example, ROWNUM is a
pseudo column which starts with 1 and increments by 1.
50.Write a query to display the first n rows from the table?
Select rownum, empno, ename, sal, deptno from emp where rownum <= &n;
51.What are the differences between rownum and rowid?
Rownum: values start with 1 and increment by one; they are temporary and are generated when the
query is executed.
Rowid: values are hexadecimal; they are permanent and are generated when the row is created or
inserted.
52.Write a query to delete duplicate rows from the table?
Delete from student where Rowid NOT IN (select min(Rowid) from student group by sno);
53.Write a query to display the fifth highest salary?
Select * from (select * from emp order by sal desc) where rownum <= 5
minus
Select * from (select * from emp order by sal desc) where rownum <= 4;
54.Explain about correlated subquery?
When a subquery is executed once for each row of the parent query (referencing a column of the parent
query), it is called a correlated subquery.
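For example, employees earning more than the average salary of their own department:
Select e.ename, e.sal, e.deptno from emp e
where e.sal > (select avg(sal) from emp where deptno = e.deptno);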
55.What are multiple row operators?
IN, ANY, ALL
56.Explain scalar subquery?
When a subquery is used in the SELECT clause, it is called a scalar subquery.
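For example, fetching the department name through a scalar subquery in the SELECT clause:
Select e.ename, e.sal, (select d.dname from dept d where d.deptno = e.deptno) dname from emp e;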
57.Explain inline view?
When a subquery is used in the FROM clause, it is called an inline view.
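For example, the top 3 salaries using an inline view:
Select * from (select ename, sal from emp order by sal desc) where rownum <= 3;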
ALL THE BEST