
CALL/WHATSAPP - 8970853557 / 9448005273

CRACK THE NEXT INTERVIEW !!

This Informatica Interview Questions blog covers all the core concepts from basic to advanced
level. So use these questions to improve your chances of being hired in your next interview.

Self introduction

Please introduce yourself!

Hi!

This is Prashanth, and I am from Bangalore.

I am an M.Tech graduate.

I have a total of 3.8 years of experience in Informatica and IICS Cloud; for the last year I have
had the opportunity to work on an IICS project.

I am currently working as an IICS developer at Infosys.

Here, a migration is in progress from Informatica PowerCenter to IICS Cloud.

Our client is Takeda Pharmaceutical Company (visit www.takeda.com for more info),

one of the top 20 pharmaceutical companies.

Here I am involved in various activities related to ETL (extract, transform, load) processes. Here are
some of my key roles and responsibilities:

Communication with business analysts and data modelers:

Regularly engage with business analysts and data modelers to comprehend project
requirements.

Ensure a clear understanding of the data sources provided in the form of tables and flat files.

Mapping logic development:

Interpret and analyze the documentation for developing mappings.

Develop logical processes for loading data incrementally using mapping variables.
Design and implement etl mappings for both historical and incremental loading based on
project requirements.

Complex etl mapping design:

Design and develop complex etl mappings for handling type-1 and type-2 dimensions.

Create mappings for complex fact tables, adhering to business logic and client requirements.

Test case preparation and unit testing:

Prepare comprehensive test cases for the developed mappings.

Conduct unit testing to ensure the accuracy and efficiency of the mappings.

Team collaboration and peer review:

Assist team members in the design and development of their mappings.

Participate in peer reviews of team members' mappings to ensure quality and adherence to best
practices.

Performance tuning:

Analyze etl code and identify areas for performance improvement.

Tune mappings or redesign code to enhance load performance.

Code migration and defect resolution:

Migrate code from individual user folders to the project folder for better organization.

Analyze and address defects identified by the project team to maintain the integrity of the etl
processes.

Framework job implementation:

Implement a consolidated workflow framework job that includes sessions for staging,
dimension, and fact tables.

Parameterize connections and session logs for flexibility.

Schedule and automate the framework job to run daily, ensuring seamless data flow from
source to stage, stage to dimension, and finally to the fact table.

1. What is the meaning of Enterprise Data Warehousing?

Enterprise Data Warehousing is the organization's data being created or developed at a
single point of access. The data is accessed and viewed globally through a single source, since
the server is linked to that single source. It also includes periodic analysis of the source.
2. What is the meaning of Lookup transformation?

The Lookup transformation is used to look up a source, a source qualifier, a target, or other
sources to get the relevant data or information. Many types of lookup sources can be searched,
for example flat files, relational tables, synonyms, or views. The Lookup transformation can be
configured as active or passive, and as either connected or unconnected. Multiple Lookup
transformations can be used in a mapping; the lookup values are compared with the lookup
input port values.

The following are the different types of ports with which the lookup transformation is created:

1. Input port
2. Output port
3. Lookup ports
4. Return port

3. What are the points of difference between connected lookup and unconnected lookup?

A connected lookup takes its input directly from other transformations and participates in the
data flow. An unconnected lookup is just the opposite: instead of taking input from other
transformations, it receives values from the result of a :LKP expression in another
transformation.

A connected lookup cache can be both dynamic and static, but an unconnected lookup cache
can't be dynamic in nature. The former can return multiple output ports, but the latter returns
only one port (the return port). User-defined default values are supported in the connected
lookup but are not supported in the unconnected lookup.

4. How many input parameters can be present in an unconnected lookup?

Any number of input parameters can be included in an unconnected lookup. However, no
matter how many parameters are put in, the return value is only one. For example, parameters
like column 1, column 2, column 3, and column 4 can be passed to an unconnected lookup, but
there is only one return value.
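
For illustration, here is a minimal sketch of calling an unconnected lookup from an Expression
transformation with several inputs (the lookup and port names here are hypothetical):

-- three input parameters go in; exactly one value comes back through the return port
:LKP.lkp_get_exchange_rate(currency_code, effective_date, region_id)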


7. How many lookup caches are available?

Informatica lookup caches can be of different natures, like static or dynamic. They can also be
persistent or non-persistent. Here are the names of the caches:

1. Static Cache
2. Dynamic Cache
3. Persistent Cache
4. Shared Cache
5. Recache

8. What is the difference between a data warehouse, a data mart, and a database?

A data warehouse consists of many kinds of data drawn from across the organization. A
database also consists of data, but the data in a database is smaller in size than in a data
warehouse. A data mart also includes different sorts of data, those needed for a particular domain.
Examples - different data marts for different sections of an organization, like sales, marketing,
finance, etc.
9. What is a domain?

The domain is the main organizational point that undertakes all the interlinked and
interconnected nodes and relationships. These links are governed by one single point of the
organization.

10. Cite the differences between a powerhouse and a repository server?

The powerhouse server is the main governing server that helps in the integration process of
various different processes among the different factors of the server's database repository. On
the other hand, the repository server ensures repository integrity, uniformity, and consistency.

11. In Informatica, how many repositories is it possible to make?

The total number of repositories created in Informatica mainly depends on the total number of
ports in Informatica.

12. What are the benefits of a partitioned session?

A session is partitioned in order to increase and improve the efficiency and operation of the
server. Partitioning enables parallel execution sequences within the session.

Informatica Scenario Based Interview Questions

13. Define parallel processing?

Parallel processing further improves performance on the available hardware. Parallel
processing is done by using partitioned sessions. This partitioning option of PowerCenter in
Informatica increases PowerCenter's performance through parallel data processing. It allows a
large data set to be divided into smaller subsets, which are processed in parallel to get better
session performance.

14. What are the different types of methods for the implementation of parallel processing
in Informatica?

There are different types of algorithms that can be used to implement parallel processing. These
are as follows:

 Database Partitioning - Database partitioning uses the table partitioning
information of the database. There is a particular type of service that queries the
database system for this information, named the Integration Service. Basically, it reads the
partitioned data from the corresponding nodes of the database.

Other supported methods are round-robin, hash auto-keys, hash user-keys, key range, and
pass-through partitioning, which are described under question 19 below.

15. What are the different mapping design tips for Informatica?

The different mapping design tips are as follows:

 Standards - The design should follow a good standard. Following a standard consistently
is proven to be beneficial in long-running projects. Standards include naming
conventions, descriptions, environmental settings, documentation, parameter files,
etc.
 Reusability - Using reusable transformations is the best way to react to potential
changes as quickly as possible. Informatica components such as mapplets and worklets
are best suited for this.
 Scalability - It is important to design for scale. In the development of mappings,
the expected data volume must be accounted for.
 Simplicity - It is always better to create several simple mappings instead of one
complex mapping. It is all about creating a simple and logical design process.
 Modularity - This includes using modular design techniques and supporting reprocessing.

16. What is the meaning of the word ‘session’? Give an explanation of how to combine
execution with the assistance of batches?

Converting data from a source to a target is implemented by the Integration Service, and
this is known as a session. Usually, the session manager executes the session. To
combine sessions' executions, batches are used in two ways - serially or in parallel.

17. How many numbers of sessions are grouped in one batch?

Any number of sessions can be grouped in one batch; however, for an easier migration
process, it is better if the number in one batch is smaller.

18. What is the difference between mapping parameters and mapping variables?

A mapping variable refers to a value that can change during the session's execution. On the
other hand, when the value doesn't change during the session, it is called a mapping parameter.
The mapping parameter file explains the mapping parameters and their usage. Values are best
allocated to mapping parameters before the beginning of the session.
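
As a sketch of how a mapping variable supports incremental extraction (the variable and column
names here are hypothetical, but SETMAXVARIABLE is a real transformation-language function):
define a mapping variable $$LastRunDate, filter the source on it, and push the high-water mark
forward in an Expression transformation.

Source filter (Source Qualifier): LAST_UPDATED_DATE > '$$LastRunDate' (exact date formatting
depends on the database)
Expression variable port: SETMAXVARIABLE($$LastRunDate, LAST_UPDATED_DATE)

After a successful run, the Integration Service saves the highest value back to the repository, so
the next run extracts only the newer rows.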
19. Explain partitioning and its types in performance tuning?

 Round-Robin Partitioning - With the aid of this, the Integration Service distributes
data evenly across all partitions.
 Hash Auto-Keys Partitioning - Hash auto-keys partitioning is used by the PowerCenter
server to group data rows across partitions. The Integration Service uses the grouped
ports as a compound partition key.
 Hash User-Keys Partitioning - This type of partitioning is the same as auto-keys
partitioning, but here rows of data are grouped on the basis of a user-defined
partition key. The ports that correctly define the key can be chosen individually.
 Key Range Partitioning - With its aid, more than one port can be used to form a compound
partition key for a specific source. Each partition is given a different range, and data is
passed based on the specified range by the Integration Service.
 Pass-Through Partitioning - Here, the data is passed from one partition point to
another. There is no redistribution of data.

20. What are the best mapping development practices?

Best mapping development practices are as follows -

 Source Qualifier - This includes extracting only the necessary data and keeping aside the
unnecessary data, limiting both columns and rows. Shortcuts are mainly used for
the source qualifier. The default query options (for example, User Defined Join and
Filter) are preferable to a source qualifier query override, since the override
doesn't always allow partitioning.
 Expressions - This includes the use of local variables to limit the number of large
calculations. Avoiding data-type conversions and reducing calls to external code are
also part of it. Using operators is better than using functions, as
numeric operations are faster than string operations.
 Aggregator - Filtering the data before aggregation is a necessity. It is also
important to use sorted input.
 Filter - Place the filter transformation close to the source. Sometimes multiple
filters are needed, which can later be replaced by a router.
 Joiner - Join the data in the Source Qualifier where possible, as it is important to do
so. It is also important to avoid outer joins. The source with fewer rows is more
efficient as the master source.
 Lookup - Here, joins replace large lookup tables, and the database is reviewed. Also,
database indexes are added to columns. Lookups should only return the ports that
meet a particular condition.


24. What are the features of complex mapping?

These are the three most important features of complex mapping.

1. Difficult requirements
2. Numerous transformations
3. Complex logic regarding business

25. Which option helps in finding whether the mapping is correct or not?

The debugging option helps in judging whether the mapping is correct or not without really
connecting to the session.

26. What do you mean by OLAP?


OLAP or also known as On-Line Analytical Processing is the method with the assistance of which
multi-dimensional analysis occurs.

27. Mention the different types of OLAP?

The different types of OLAP are:

1. ROLAP
2. MOLAP
3. HOLAP

28. What is the meaning of the surrogate key?

The surrogate key is a replacement for the natural primary key. It is a unique identifier
for each row, independent of the data it contains.

29. What is a session task?

When the Power Centre Server transfers data from the source to the target, it is often guided by
a set of instructions and this is known as the session task.

30. What is the meaning of the command task?

A command task allows the flow of one or more shell commands in UNIX, or DOS commands in
Windows, while the workflow is running.

31. What is the meaning of a standalone command task?

The type of command task that allows the shell commands to run anywhere during the workflow
is known as the standalone task.

32. Define workflow?

The workflow includes a set of instructions that allows the server to communicate for the
implementation of tasks.

33. How many tools are there in the workflow manager?

There are three tools:

1. Task Developer
2. Workflow Designer
3. Worklet Designer

34. Define target load order?


Target load order is dependent on the source qualifiers in a mapping. Generally, multiple source
qualifiers are linked to a target load order.

35. Define Power Centre repository of Informatica?

Informatica PowerCenter consists of the following metadata:

 Source Definition
 Session and session logs
 Workflow
 Target Definition
 Mapping
 ODBC Connection

The two types of repositories are as follows:

1. Global Repositories
2. Local Repositories

Mainly Extraction, Loading (ETL), and Transformation of the above-mentioned metadata are
performed through the Power Centre Repository.

36. Name the scenario in which the Informatica server rejects files?

When the server encounters a rejection in the update strategy transformation, it rejects files.
The database consisting of the information and data also gets disrupted. This is a rare scenario.

37. How to use Normalizer Transformation in Informatica?

 This is an active T/R which reads the data from COBOL files and VSAM sources
(Virtual Storage Access Method).
 The Normalizer T/R acts like a Source Qualifier T/R while reading the data from COBOL files.
 Use the Normalizer T/R to convert each input record into multiple output records. This
is known as data pivoting.
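
A minimal sketch of data pivoting with hypothetical columns: a single input record

ACCOUNT, Q1_SALES, Q2_SALES, Q3_SALES, Q4_SALES = (A101, 100, 200, 300, 400)

becomes four output records:

(A101, 1, 100), (A101, 2, 200), (A101, 3, 300), (A101, 4, 400)

where the quarter number comes from the Normalizer's generated column ID (GCID) port.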

38. What are the Limitations of Pushdown Optimization?

1. Rank T/R cannot be pushed.
2. Transaction Control T/R cannot be pushed.
3. Sorted aggregation cannot be pushed.

Procedure:

1. Design a mapping with filter, rank, and expression T/R.


2. Create a session --> Double click the session select properties tab.

Attribute: Pushdown Optimization --> Value: Full

3. Select the mapping tab --> set reader, writer connection with target load type normal.

4. Click apply --> click ok --> save the session.

5. Create & start the workflow.

Pushdown Optimization Viewer:-

Double click the session --> Select the mapping tab from the left window --> select pushdown
optimization.

39. What is the difference between Copy and Shortcut?

The following are the differences between copy and shortcut

Copy:
- Copies an object to another folder
- Changes to the original object don't reflect in the copy
- Duplicates the space
- Created from unshared folders

Shortcut:
- A dynamic link to an object in the folder
- Dynamically reflects changes to the original object
- Preserves the space
- Created from shared folders

40. How to use PMCMD Utility Command?

1. It is a command-line client program that communicates with the Integration Service to
perform some of the tasks that can also be performed using the Workflow Manager client.
2. Using PMCMD we can perform the following tasks:
o Starting workflows.
o Scheduling workflows.
3. PMCMD can be operated in two different modes:
o Interactive mode.
o Command-line mode.
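
For example, a typical command-line invocation to start a workflow might look like the
following (the service, domain, user, folder, and workflow names here are placeholders):

pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u developer -p password -f SALES_FOLDER wf_load_sales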
41. How do I schedule a workflow?

A schedule is automation of running the workflow at a given date and time.

There are 2 types of schedulers:

1. Reusable scheduler
2. Non Reusable scheduler

Reusable scheduler:

A reusable scheduler can be assigned to multiple workflows.

Non Reusable scheduler:

 A non-reusable scheduler is created specifically to the workflow.


 A non-reusable scheduler can be converted into a reusable scheduler.

The following are the 3rd party schedulers:

1. Cron (Unix-based scheduling process)
2. Tivoli
3. Control-M
4. Autosys
5. Tidal
6. WLM (workload manager)

- In production, scheduling is almost always used (99% of production teams schedule their jobs).

 Normally we run the workflow manually; through scheduling, the workflow runs
automatically. This is called auto-running.

42. What is Dynamic Lookup Cache?

 The cache updates or changes dynamically during the lookup on the target table.
 The dynamic lookup T/R allows the synchronization of the target lookup table image in
memory with its physical table in the database.
 The dynamic lookup T/R (dynamic lookup cache) operates only in connected mode
(connected lookup).
 A dynamic lookup cache supports only equality conditions (= conditions).
 Enabling the dynamic cache adds a NewLookupRow port, whose values mean the following:

New Lookup Row - Description
0 - The Integration Service does not update or insert the row in the cache
1 - The Integration Service inserts the row into the cache
2 - The Integration Service updates the row in the cache

43. What are the comment specifiers in the PowerCenter transformation language?

The transformation language provides two comment specifiers to let you insert comments in an
expression:

 Two dashes ( -- )
 Two slashes ( // )

The PowerCenter Integration Service ignores all text on a line preceded by either of these two
comment specifiers.
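
A small sketch of both specifiers inside an expression (the port name here is hypothetical):

-- this whole line is ignored by the Integration Service
// so is this one
IIF(SAL > 5000, 'HIGH', 'LOW') -- a trailing comment works as well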

44. What is the difference between the variable port and the mapping variable?

The following are the differences between a variable port and a mapping variable:

Variable Port:
- Local to the T/R
- Values are non-persistent
- Can't be used with SQL override

Mapping Variable:
- Local to the mapping
- Values are persistent
- Can be used with SQL override

 Mapping variables are used for incremental extraction.
 With mapping variables there is no need to change the data manually; it changes automatically.
 With a mapping parameter, you have to change the date and time yourself.

45. Which is the T/R that builds only a single cache memory?

Rank can build two types of cache memory, but Sorter always builds only one cache memory.
The cache is also called a buffer.

46. What is XML Source Qualifier Transformation in Informatica?

1. Reads the data from XML files.
2. An XML source definition is associated with an XML Source Qualifier.
3. XML is a case-sensitive markup language.
4. Files are saved with the extension .xml.
5. XML files are hierarchical (parent-child relationship) file formats.
6. Files can be normalized or denormalized.

What is Load Order?

Design mapping applications that first load the data into the dimension tables, and then load
the data into the fact table.

 Load Rule: If all dimension table loadings are a success then load the data into the fact
table.
 Load Frequency: Database gets refreshed on daily loads, weekly loads, and monthly
loads.

47. What is Snowflake Schema?

A Snowflake Schema is a schema in which a large denormalized dimension table is split into
multiple normalized dimension tables.

Advantage:

SELECT query performance increases.

Disadvantage:

Maintenance cost increases due to a larger number of tables.
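
A minimal SQL sketch of snowflaking, with hypothetical tables: the category attributes are
normalized out of the product dimension, at the cost of an extra join at query time.

CREATE TABLE category_dim (
  category_id   NUMBER PRIMARY KEY,
  category_name VARCHAR2(50)
);

CREATE TABLE product_dim (
  product_id    NUMBER PRIMARY KEY,
  product_name  VARCHAR2(50),
  category_id   NUMBER REFERENCES category_dim(category_id)
);

-- a star schema would keep category_name directly in product_dim;
-- the snowflaked design needs a join to get it back:
SELECT p.product_name, c.category_name
FROM product_dim p
JOIN category_dim c ON p.category_id = c.category_id;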

48. What is a Standalone Email task?

1. It can be used anywhere in the workflow, with link conditions defined to notify the
success or failure of prior tasks.
2. It is visible in the flow diagram.
3. Email variables can be defined with standalone email tasks.

49. What is Mapping Debugger?


 A debugger is a tool. By using it, we can identify whether records are loaded or not and
whether correct data is loaded from one T/R to another T/R.
 If a session succeeded but records are not loaded, we have to use the Debugger tool.

50. What is the functionality of F10 in Informatica?

F10 --> Next Instance

51. What T/R has No cast?

Lookup T/R

Note:- Prevent wait is not available in every task. It is available only in the Event Wait task.

 F5 --> Start Debugger.
 A debugger is used to test whether the records are loaded or not, and whether correct data
is loaded or not.
 A debugger is used only to test a valid mapping, not an invalid mapping.

52. What is a worklet and what are the types of worklets?

1. A worklet is defined as a group of related tasks.
2. There are 2 types of worklets:
o Reusable worklet
o Non-reusable worklet
3. A worklet expands and executes the tasks inside the workflow.
4. A workflow that contains a worklet is known as the parent workflow.

(a) Reusable Worklet:

 Created using the worklet designer tool.


 Can be assigned to Multiple workflows.

(b) Non-Reusable Worklet:

 Created using workflow designer tool.


 Created Specific to the workflow.

53. What is Relative Mode?

In real time we use this mode.

Relative time: the timer task can start counting from the start time of the timer task, the start
time of the workflow or worklet, or the start time of the parent workflow.

 A timer task is mainly used for scheduling within a workflow.
 Workflow starts at 11 AM --> Timer set to 11:05 AM --> Absolute mode.
 Timer fires 5 minutes after the workflow starts, whenever that is --> Relative mode.

54. What is the Difference between Filter and Router T/R?

The following are the differences between Filter T/R and Router T/R:

Filter T/R:
- Single condition
- Single target
- Rejected rows cannot be captured

Router T/R:
- Multiple conditions
- Multiple targets
- The default group captures rejected rows

55. What is a Repository Manager?

It is a GUI-based administrative client that allows performing the following administrative tasks:

 Create, edit, and delete folders.
 Assign users access to the folders with read, write, and execute permissions.
 Back up and restore repository objects.

56. What is Rank Transformation in Informatica?


This is a type of active T/R which allows you to find either the top performers or the bottom
performers.

Rank T/R is created with the following types of the port:

1. Input Port (I)


2. Output Port (O)
3. Rank Port (R)
4. Variable Port (V)

57. What is meant by Informatica PowerCenter Architecture?

The following components get installed:

 Power Center Clients


 Power Center Repository.
 Power Center Domain.
 Power Center Repository Service (PCRS)
 Power Center Integration Service (PCIS)
 Informatica administrator.

Mapping is nothing but an ETL Application.

58. What is Workflow Monitor?

1. It is a GUI-based client application that allows users to monitor ETL objects running on an
ETL server.
2. It collects runtime statistics such as:

o No. of records extracted.
o No. of records loaded.
o No. of records rejected.
o Fetch session log.
o Throughput.

3. Complete information can be accessed from the workflow monitor.

4. For every session, one log file is created.

59. If Informatica has its own scheduler why using a third-party scheduler?
Clients use various applications (mainframes and Oracle Apps use the Tivoli scheduling tool);
integrating different applications and scheduling them is very easy using third-party
schedulers.

60. What is Workflow Manager?

It is a GUI-based client that allows you to create the following ETL objects.

 Session
 Workflow
 Scheduler

Session:

 A session is a task that executes mapping.


 A session is created for each Mapping.
 A session is created to provide runtime properties.
 A session is a set of instructions that tells the ETL server to move the data from source to
destination.

Workflow:

Workflow is a set of instructions that tells how to run the session tasks and when to run the
session tasks.

61. What is Informatica PowerCenter?

A data integration tool that combines the data from multiple OLTP source systems, transforms
the data into a homogeneous format and delivers the data throughout the enterprise at any
speed.

It is a GUI-based ETL product from Informatica Corporation, which was founded in 1993 in
Redwood City, California.

There are many products in the Informatica portfolio:

 Informatica Analyzer.
 Lifecycle Management.
 Master Data Management.

Informatica PowerCenter is one of the products of Informatica Corporation.

Using Informatica PowerCenter we do the extraction, transformation, and loading.

62. What is a Dimensional Model?

Data Modeling:

 It is the process of designing the database to fulfill the business requirement
specifications.
 A data modeler (or database architect) designs the warehouse database using a GUI-based
data modeling tool called "ERwin".
 ERwin is a data modeling tool from Computer Associates (CA).

Dimensional modeling consists of the following types of schemas designed for a data warehouse:

o Star schema.
o Snowflake schema.
o Galaxy schema.

A schema is a data model that consists of one or more tables.

63. How does Rank transformation handle string values?

Rank transformation can return the strings at the top or the bottom of a session sort order.
When the Integration Service runs in Unicode mode, it sorts character data in the session using
the selected sort order associated with the Code Page of IS which may be French, German, etc.
When the Integration Service runs in ASCII mode, it ignores this setting and uses a binary sort
order to sort character data.

64. Is a sorter an active or passive transformation?

The Sorter is an active transformation because, when configured for distinct output rows, it
discards duplicates on the sort key and consequently changes the number of rows.

65. Mention the types of transformations available in Informatica.

The following are the types of transformations available in Informatica:

 Source Qualifier Transformation
 Rank Transformation
 Lookup Transformation (reusable or non-reusable)
 Router Transformation
 Aggregator Transformation
 Joiner Transformation
 Sequence Generator Transformation
 Transaction Control Transformation
 Expression Transformation
 Normalizer Transformation
 External Procedure Transformation

66. What is the difference between active and passive transformation?

Active transformations are those that can change the number of rows as data passes through
them, i.e., the number of input rows and output rows may differ. Passive transformations keep
the number of rows the same for any number of input and output rows passed through them.

67. What are the output files created by the Informatica server at runtime?

The output files created by the Informatica server at runtime are listed below:

 Informatica Server log: created in the Informatica home directory, containing all the error
messages and status updates.
 Session log file: For each session, a session log file stores the data into the log file about
the ongoing initialization process, SQL commands, errors, and more.
 Session detail file: It contains load statistics for each target in mapping, including data
about the name of the table, no of rows written or rejected.
 Performance detail file: It includes data about session performance.
 Reject file: Rows of data not written to targets.
 Control file: Information about target flat-file and loading instructions to the external
loader.
 Post-session email: Automatically delivers session run data to designated recipients.
 Indicator file: It contains a number to indicate whether the row was marked for insert,
delete or reject, and update.
 Output file: Informatica server creates a target file based on the details entered in the
session property sheet.
 Cache file: It automatically builds, when the Informatica server creates a memory cache.

68. What is the difference between static cache and dynamic cache?

The following are the differences between static cache and dynamic cache:

Static Cache:
- The default cache; we cannot insert or update the cache.
- Handles multiple matches.
- Suitable for both relational and flat-file lookup types.
- Relational operators like =, >=, <= can be used.
- Used for both connected and unconnected lookup transformations.

Dynamic Cache:
- We can insert or update data in the lookup cache and pass the data to the target.
- Doesn't handle multiple matches.
- Suitable for relational lookup types.
- Only the = operator is used.
- Used only for connected lookup transformations.

69. Can you tell what types of groups does router transformation contains?

Router transformation contains the following types of groups:

1. Input group
2. Output group:

Further, the output group contains two types:


1. User-defined groups
2. Default group

70. How do you differentiate stop and abort options in a workflow monitor?

The below table will detail the differences between the stop and abort options in a workflow
monitor:

Stop:
- Stops reading data from the source but continues processing and writing the already-read
data to the targets.
- Allows other tasks to keep running.
- Stops sharing resources from the processes.

Abort:
- Turns off the running task completely.
- Waits up to a time-out period of 60 seconds for processing to complete; after that, the
process is killed and the session gets terminated.

71. Is it possible to store previous session logs in Informatica?

 Yes, it is possible. If a session is running in timestamp mode, the current session log will
automatically not be overwritten.
 Go to Session Properties --> Config Object --> Log Options.
 Select the properties as follows:
 Save session log by --> Session Runs.
 Save session log for these runs --> change the number to the number of log files you want
to save (the default is 0).
 If you want to save all of the log files created by every run, then select the option
Save session log for these runs --> Session TimeStamp.

72. What do you know about Data-Driven sessions?

 In Informatica, data-driven is the property that decides how the data needs to be
handled when the mapping includes an Update Strategy transformation.
 By mentioning DD_INSERT, DD_DELETE, or DD_UPDATE in the update strategy
transformation, we can execute data-driven sessions.
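
A sketch of a data-driven update strategy expression (the port names here are hypothetical;
the DD_* constants are built into the transformation language):

IIF(ISNULL(target_customer_key), DD_INSERT,
    IIF(src_status = 'CLOSED', DD_DELETE, DD_UPDATE))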

73. What is a Mapplet in Informatica?

A reusable object created in the Mapplet Designer is called a mapplet. It includes a
collection of transformations that allows you to reuse transformation logic in different
mappings.

74. What is the difference between Mapping and Mapplet?

The following are the difference between Mapping and Mapplet:

Mapping:
- A set of sources, targets, and transformations.
- Not reusable; developed with different transformations.

Mapplet:
- A collection of transformations.
- Reusable across mappings.

75. List the transformations used for SQL override.

The below-listed transformations are used for SQL override:


 Source Qualifier
 Lookup
 Target

76. State the differences between SQL override and Lookup override.

The differences between SQL override and Lookup override are listed below:

SQL Override:
- Limits the number of rows that enter the mapping pipeline.
- The query is entered manually.
- Supports any kind of join by writing the query.

Lookup Override:
- Limits the number of lookup rows, avoiding table scans and saving lookup time.
- An "Order By" clause is added by default.
- Supports only non-equi joins.

77. What is a shared cache?

A shared cache is a static lookup cache shared by various lookup transformations in the
mapping. Using a shared cache reduces the amount of time needed to build the cache.

78. Explain code page compatibility.

Compatibility between code pages is used for accurate data movement when the
Informatica Server runs in the Unicode data movement mode. There won't be any data loss if
the code pages are identical. Otherwise, one code page must be a superset or subset of the other.

79. Define Expression transformation?

In Informatica, Expression transformation is a passive transformation that allows performing
non-aggregate calculations on the source data, meaning you can perform calculations on a
single row. Using this transformation, you can also test data through conditional statements
before passing it to the target table or another transformation.

80. What is Aggregator transformation?

Aggregator transformation in Informatica is an active transformation that allows you to perform
calculations like sum, average, etc. We can perform aggregate operations over a group of rows,
and it stores the intermediate values and records in a temporary placeholder called the
aggregator cache memory.
81. What do you know about filter transformation?

Filter transformation in Informatica is an active transformation that changes the number of rows
passed through it. It allows the rows to pass through it based on specified filter conditions and
drops rows that don't meet the requirement. The data can be filtered based on one or more
terms.

82. Why is union transformation active?

In Informatica, Union transformation is an active transformation because it combines two or
more data streams into one. Although the total number of rows passing into the Union equals
the number of rows passing out of it, and the rows from the input streams are preserved in the
output stream, the position of the rows is not preserved, which is why it is treated as active.

83. What is the use of incremental aggregation in Informatica?

Incremental aggregation usually gets created when a session gets created through the
execution of an application. This aggregation allows you to capture changes in the source data
for aggregating calculations in a session. If the source changes incrementally, you can capture
those changes and configure the session to process them. It will allow you to update the target
incrementally, rather than deleting the previous load data and recalculating similar data each
time you run the session.

84. What does reusable transformation mean?

Reusable transformations are used numerous times in mappings. A reusable transformation is
stored as metadata, separate from any mapping that uses it. If any changes are made to the
reusable transformation, all the mappings where the transformation is used get invalidated.

85. How does update strategy work in Informatica?

The update strategy is the active and connected transformation that allows to insert, delete, or
update records in the target table. Also, it restricts the files from not reaching the target table.

86. Differentiate Informatica and Datastage.

Both Informatica and Datastage are powerful ETL tools. Still, the significant difference between
both is Informatica forces you to organize in a step-by-step process. In contrast, Datastage
provides flexibility in dragging and dropping objects based on logic flow.

Informatica:
- Dynamic partitioning
- Supports flat-file lookups
- Service-oriented architecture
- Step-by-step data integration solution

Datastage:
- Static partitioning
- Supports hash files, lookup file sets, etc.
- Client-server architecture
- Project-based integration solutions

87. Explain transaction control transformation.

Transaction Control in Informatica is an active and connected transformation that allows
committing or rolling back transactions during mapping execution. A transaction is a collection
of rows bound by commit or rollback rows. A transaction is defined based on a variation in the
number of input rows. Commit or rollback operations ensure data availability.

The built-in variables available in this transformation are:

 TC_CONTINUE_TRANSACTION
 TC_COMMIT_BEFORE
 TC_COMMIT_AFTER
 TC_ROLLBACK_BEFORE
 TC_ROLLBACK_AFTER

88. How will you load a flat file by using incremental load?

Typically by tracking the last-loaded value (for example, a maximum date) in a mapping
variable, filtering the source rows against it, and advancing it with SETMAXVARIABLE, as
sketched under question 18.

89. What is constraint-based load ordering?

Constraint-based load ordering is a session property. When it is enabled, the Integration
Service loads the targets in a pipeline according to their primary key-foreign key
relationships, loading parent tables before child tables.

90. What is a delimited file?

In a delimited file, each and every column is separated by a delimiter such as a comma (,), a
tab, or a tilde (~) symbol. For example: 101,John,Sales.

88. What are the transformations that use a cache for performance?

Ans: Aggregator, Lookup, Joiner, and Rank.

91. Difference between Union and Union all Transformation in informatica ?


Union is a set operation; the tables must have the same structure for performing a union.

In SQL, UNION will remove duplicates when applied to same-structure tables, while UNION ALL
will not remove duplicates.

But in Informatica, the Union transformation acts as UNION ALL; that is, Union will not remove
duplicates.
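
For reference, the SQL behavior looks like this (using the emp and emp1 tables from the SQL
section below):

SELECT deptno FROM emp UNION SELECT deptno FROM emp1;      -- duplicates removed
SELECT deptno FROM emp UNION ALL SELECT deptno FROM emp1;  -- duplicates kept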

92. How did you use SETMAXVARIABLE in Informatica for incremental load?

See the mapping-variable sketch under question 18: SETMAXVARIABLE advances the variable to
the maximum value seen during the run, and the saved value filters the next extraction.

93. What are caches?

94. Tell me some of the Dimension & fact tables in your project

Fact table:
-----------

Transactional fact tables (see https://dwbi1.wordpress.com/2019/02/20/transactional-fact-tables/
- a transactional fact table stores one row per transaction, and its grain is the same as the
source table).

Dimension:
==========

Account_dim
User_dim
Cust_dim
date_dim
product_dim
region_dim
branch_dim
store_dim
employee_dim

92. Why is the unconnected lookup transformation used?

1. Lookup reusability:

An unconnected lookup can be reused in different pipelines in the mapping; we can call it
from an Expression transformation multiple times in different pipelines, e.g.
:LKP.LKPTRANS(lookup_column).

It is easy to use since it has no pipeline (physical) connection like a connected lookup.

2. Conditional lookup:

If, out of some millions of records, around 10,000 records are missing the product
description, we can use an unconnected lookup to fill it in:

IIF(ISNULL(PROD_Description), :LKP.LKPTRANS(Prod_id), PROD_Description)

This means: if PROD_Description is null, call the lookup, passing Prod_id; otherwise keep the
existing product description. Similar conditions can also be written, for example:

IIF(ISNULL(name), :LKP.lkptrans(ssn), name)
93. What is the difference between static, dynamic, shared, and persistent cache?

1. Static cache - The cache is deleted after the completion of the session; it is built once
and is not updated while the session runs.
2. Dynamic cache - Used with SCDs; it does not allow duplicates from source to target. When
the dynamic cache is enabled, a row inserted into the target is also inserted into the
lookup cache for that particular record; if the same record comes from the source again,
the cache identifies it and the duplicate is not allowed. Enabling the dynamic cache adds
the NewLookupRow port, so there is no need to check for new records manually.
3. Persistent cache - The cache files are saved to disk and reused across session runs
instead of being rebuilt every time.
4. Shared cache - A static cache shared by multiple lookup transformations in the mapping
(see question 77).

94. What is performance tuning?

Have you implemented it in your project?

The goal of performance tuning is to optimize session performance by eliminating
performance bottlenecks.

To tune session performance, first identify a performance bottleneck and eliminate it, and
then identify the next performance bottleneck, until you are satisfied with the session
performance.

Performance bottlenecks can occur in the source and target, the mapping, the session,
and the system.
Source bottlenecks:-

Inefficient query or small database network packet sizes can cause source bottlenecks.

To identify a source bottleneck when the source is a relational table, put a Filter
transformation in the mapping just after the source qualifier, and make the filter condition
false.

Without filter total time = time taken by (source + transformations + target load)

Now because of filter, total time = time taken by source.

Target bottlenecks:-

Small database checkpoint intervals, small database network packet sizes,

Or problems during heavy loading operations can cause target bottlenecks.

If the target is a relational table, then substitute it with a flat file and run the session.

If the time taken now is very much less than the time taken for the session to load to
table,

Then the target table is the bottleneck.

Mapping bottlenecks:-

A complex mapping logic or a not well written mapping logic can lead to mapping
bottleneck.

With mapping bottleneck, transformation thread runs slower causing the reader thread to
wait

For free blocks and writer thread to wait blocks filled up for writing to target.

Session bottlenecks:-

If you do not have a source, target, or mapping bottleneck, you may have a session
bottleneck.

Small cache size, low buffer memory, and small commit intervals can cause session
bottlenecks.

System bottlenecks

Optimization:

1. Caching the lookup table:


When caching is enabled the informatica server caches the lookup table and queries the
cache during the session.

When this option is not enabled, the server queries the lookup table on a row-by-row
basis.

If your mapping contains multiple lookups that look up on the same lookup table,

It is suggested you share the cache in order to avoid performing caching multiple times.

2. Optimizing the lookup condition :

Whenever multiple conditions are placed, the condition with equality sign should take
precedence.

3. Lookup override:

You can reduce the processing time if you use the lookup SQL override properly in the lookup
transformation.

You can use the lookup SQL override to reduce the amount of data that you look up.

This also helps in saving the cache space.

Suppress the generated ORDER BY statement by appending two dashes (--).

Remove all ports not used downstream or in the SQL override.
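
As a sketch (table and column names hypothetical): the Integration Service appends its own
ORDER BY to the override, so ending the override with two dashes comments the generated clause
out and leaves only the ORDER BY you wrote:

SELECT PROD_ID, PROD_DESC
FROM PRODUCTS
ORDER BY PROD_ID --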

4. Indexing the lookup table:

The cached lookup table should be indexed on the ORDER BY columns. The session log contains
the ORDER BY statement.

For an un-cached lookup, since the server issues a SELECT statement for each row passing into
the lookup transformation, it is better to index the lookup table on the columns used in the
condition.

5. Replace large lookup tables with joins in the source qualifier when possible.

(Always take the table with the smaller number of records as the lookup table.)

Optimize filter transformation:

You can improve efficiency by filtering early in the data flow.

Use the source qualifier to filter the data. You can also use a source qualifier SQL override
to filter the records, instead of using a Filter transformation.

Replace multiple filters with a Router.

Source qualifier transformation:

Bring only the required columns from the source to the source qualifier.

Avoid using order by clause inside the source qualifier sql override.

Optimize Aggregator transformation:

1. Group by simpler columns, preferably numeric columns.
2. Use sorted input; sorted input decreases the use of aggregate caches.
3. Use it as early as possible.
4. Filter the data before aggregating.
5. Limit the number of ports used in the Aggregator transformation.

Optimize Sequence Generator transformation:

1. Try creating a reusable Sequence Generator transformation and use it in multiple mappings.

Optimize Expression transformation:

1. Minimize aggregate function calls.
2. Replace common sub-expressions with local variables (variable ports).
3. Use operators instead of functions.

Optimizing Joiner transformation:

1. It is recommended to assign the table with the smaller number of records as the master
while using the Joiner transformation.
2. It is also recommended to perform the join in the source qualifier using SQL override, as
performing joins in the database is sometimes faster than performing them in Informatica.
3. Additionally, pass sorted data to the Joiner transformation.
4. Perform normal joins when possible.

Common sources of problems:

• Too many transformations
• Unused links between ports
• Too many input/output or output ports connected out of Aggregator, Rank, or Lookup
transformations
• Unnecessary data-type conversions

96. What are reusable transformations?

Changes to a reusable transformation that you make through the Transformation Developer are
immediately reflected in all instances of that transformation.

While this feature is a powerful way to save work and enforce standards, you risk invalidating
mappings when you modify a reusable transformation.

Mappings can contain reusable and non-reusable transformations. Non-reusable transformations
exist within a single mapping; reusable transformations can be used in multiple mappings.

Which files are created during a session?

Error logs, bad files, workflow logs, and session logs.

A standalone command task can be used anywhere in the workflow to run shell commands.

97. What do session logs contain? Explain.

Session logs contain information about the tasks that the Integration Service performs during
a session, plus the load summary and transformation statistics:

- Allocation of heap memory
- Execution of pre-session commands
- Creation of SQL commands for reader and writer threads
- Start and end times for target loading
- Errors encountered during the session and general information
- Execution of post-session commands
- Load summary of reader, writer, and DTM statistics
- Integration Service version and build number

Complex Queries in SQL ( Oracle )

These questions are the most frequently asked in interviews.

1. To fetch ALTERNATE records from a table. (EVEN NUMBERED)


select * from emp where rowid in (select decode(mod(rownum,2),0,rowid, null) from
emp);

2. To select ALTERNATE records from a table. (ODD NUMBERED)


select * from emp where rowid in (select decode(mod(rownum,2),0,null ,rowid) from
emp);

3. Find the 3rd MAX salary in the emp table.


select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp e2 where
e1.sal <= e2.sal);

4. Find the 3rd MIN salary in the emp table.

select distinct sal from emp e1 where 3 = (select count(distinct sal) from emp e2 where
e1.sal >= e2.sal);

5. Select FIRST n records from a table.


select * from emp where rownum <= &n;

6. Select LAST n records from a table


select * from emp minus select * from emp where rownum <= (select count(*) - &n from
emp);
7. List dept no. and dept name for all the departments in which there are no employees.

select * from dept where deptno not in (select deptno from emp);

Alternate solution: select * from dept a where not exists (select * from emp b where
a.deptno = b.deptno);

Alternate solution: select empno, ename, b.deptno, dname from emp a, dept b where
a.deptno(+) = b.deptno and empno is null;

8. How to get the 3 max salaries?

select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b where
a.sal <= b.sal) order by a.sal desc;

9. How to get 3 Min salaries ?


select distinct sal from emp a where 3 >= (select count(distinct sal) from emp b where
a.sal >= b.sal);

10. How to get the nth max salary?

select distinct sal from emp a where &n = (select count(distinct sal) from emp b
where a.sal <= b.sal);

11. Select DISTINCT RECORDS from emp table.


select * from emp a where rowid = (select max(rowid) from emp b
where a.empno=b.empno);

12. How to delete duplicate rows in a table?


delete from emp a where rowid != (select max(rowid) from emp b
where a.empno=b.empno);

13. Count the number of employees department-wise.


select count(EMPNO), b.deptno, dname from emp a, dept b where
a.deptno(+)=b.deptno group by b.deptno,dname;
14. Suppose there is annual salary information provided by emp table. How to fetch
monthly salary of each and every employee?

select ename,sal/12 as monthlysal from emp;

15. Select all records from the emp table where deptno = 10 or 40.

select * from emp where deptno=10 or deptno=40;

16. Select all records from the emp table where deptno=30 and sal>1500.

select * from emp where deptno=30 and sal>1500;

17. Select all records from emp where the job is not SALESMAN or CLERK.

select * from emp where job not in ('SALESMAN','CLERK');

18. Select all records from emp where ename is 'BLAKE', 'SCOTT', 'KING', or 'FORD'.

select * from emp where ename in ('BLAKE','SCOTT','KING','FORD');

19. Select all records where ename starts with 'S' and its length is 6 characters.

select * from emp where ename like 'S_____';  (an 'S' followed by five underscores)

20. Select all records where ename may be any number of characters but must end with 'R'.

select * from emp where ename like '%R';

21. Count MGR and their salary in emp table.

select count(MGR),count(sal) from emp;


22. In the emp table, select comm+sal as total salary.

select ename,(sal+nvl(comm,0)) as totalsal from emp;

23. Select salaries greater than ANY salary below 3000 from the emp table.

select * from emp where sal > any (select sal from emp where sal < 3000);

24. Select salaries greater than ALL salaries below 3000 from the emp table.

select * from emp where sal > all (select sal from emp where sal < 3000);

25. Select all employees ordered by deptno, and by sal in descending order.

select ename,deptno,sal from emp order by deptno, sal desc;

26. How can I create an empty table emp1 with the same structure as emp?

Create table emp1 as select * from emp where 1=2;

27. How to retrieve records where sal is between 1000 and 2000?

Select * from emp where sal >= 1000 and sal <= 2000;

28. Select all records where the deptno of both the emp and dept tables matches.

select * from emp where exists (select * from dept where emp.deptno = dept.deptno);

29. If there are two tables emp and emp1, and both have common records, how can I
fetch all the records but common records only once?

(Select * from emp) Union (Select * from emp1);

30. How to fetch only common records from the two tables emp and emp1?

(Select * from emp) Intersect (Select * from emp1);

31. How can I retrieve all records of emp that are not present in emp1?

(Select * from emp) Minus (Select * from emp1);

32. Count the total salary deptno-wise where more than 2 employees exist.

SELECT deptno, SUM(sal) AS totalsal
FROM emp
GROUP BY deptno
HAVING COUNT(empno) > 2;
