
Informatica MDM Interview Preparation 

 
MDM Batch Process  
 

 
Batch Process 
Informatica MDM Hub provides a sequence of steps and options to load the data, maintain data lineage for tracking and other metadata for effective processing, and finally consolidate the data to build the Best Version of the Truth (BVT). Let's look at the batch processes at a high level and understand the features they offer. 
 
Land Process 
Land Process is external to Informatica MDM Hub and is executed using external batch processes or external applications that directly populate landing tables in the Hub Store.  

 
FIGURE 1: LAND PROCESS 
  
 
Stage Process 
Stage Process transfers source data from a landing table to the source-specific staging table associated with a base object. The movement of data from the landing table to the staging table is defined using Mappings offered by Informatica MDM Hub. 
 

 
FIGURE 2: STAGE PROCESS 
 
 
Mappings 
Mappings define the movement of data from the source column in the landing table to the target column in the staging table. Data from the landing table is cleansed and standardized before loading into the staging table. The cleansing and standardization can be done within Informatica MDM Hub or outside it using external tools such as Informatica Data Quality (IDQ) or Informatica PowerCenter. 
 
Reject Records 
During the Stage Process, records that have problems are rejected and transferred, along with the reason for rejection, to the Reject Table associated with the Staging Table. If there are multiple reasons for rejection, only the first reason is persisted. 
 
Advantage of this feature 
This feature helps investigate the rejected records after running Stage Process.
As this feature is built into Informatica MDM, it saves considerable
development and implementation time. 
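As a rough illustration of the behaviour described above, the sketch below (plain Python, not Informatica code; the validate callback and the column names REJECT_REASON, PKEY_SRC_OBJECT and LAST_NAME are assumptions made for the example) splits incoming rows into accepted and rejected sets and keeps only the first rejection reason per record.

def stage_with_rejects(rows, validate):
    # Split rows into accepted and rejected; keep only the first reject reason per record.
    accepted, rejected = [], []
    for row in rows:
        reasons = validate(row)          # list of problems for this row, possibly empty
        if reasons:
            rejected.append({**row, "REJECT_REASON": reasons[0]})   # first reason only
        else:
            accepted.append(row)
    return accepted, rejected

def check_row(row):
    problems = []
    if not row.get("LAST_NAME"):
        problems.append("LAST_NAME is null")
    if not row.get("PKEY_SRC_OBJECT"):
        problems.append("missing source key")
    return problems

ok, bad = stage_with_rejects(
    [{"PKEY_SRC_OBJECT": "S1", "LAST_NAME": "SMITH"},
     {"PKEY_SRC_OBJECT": "", "LAST_NAME": ""}],
    check_row)
print(len(ok), bad[0]["REJECT_REASON"])  # 1 LAST_NAME is null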
  
 
Audit Trail 
The Stage Process can maintain a copy of the source data in a table (called the RAW table) associated with each staging table. This is possible only if the audit trail is enabled for the staging table, for a configurable number of stage job runs or a retention period. 

Advantage of this feature 

This feature is useful for auditing purposes, tracking data issues, and tracing missing data back to the source. Informatica MDM Hub provides this feature with an easy configuration, saving considerable development and implementation time. 
Delta Detection 
 
The Stage Process can detect the records in the landing table that are new or updated if delta detection is enabled for a staging table. The comparison for delta detection can be configured on all columns, on a date column, or on a specific set of columns as required. 
 
Advantage of this feature 
This feature is useful when the source system sends a full data set: it limits the volume to changed data only and thus improves the performance of downstream processes. 
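A minimal sketch of the idea follows (plain Python, not Hub internals; the key values, column names and the compare_cols parameter are assumptions made for the example): compare each incoming landing row with the previously landed copy and keep only new or changed keys.

def detect_delta(previous, incoming, compare_cols=None):
    # Return the keys of rows that are new or changed since the last stage run.
    delta = []
    for key, row in incoming.items():
        old = previous.get(key)
        if old is None:
            delta.append(key)                        # new record
            continue
        cols = compare_cols or row.keys()
        if any(row.get(c) != old.get(c) for c in cols):
            delta.append(key)                        # changed record
    return delta

prev = {"C1": {"NAME": "ACME", "CITY": "DALLAS"}}
new = {"C1": {"NAME": "ACME", "CITY": "AUSTIN"},
       "C2": {"NAME": "GLOBEX", "CITY": "OMAHA"}}
print(detect_delta(prev, new, compare_cols=["CITY"]))   # ['C1', 'C2']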
 
 
Load Process 
Load Process moves data from the staging table to the corresponding base object. In addition, the Load Process performs lookups, computes the trust scores used when merging data, and runs validation rules. Each base object has associated tables that capture data lineage, track the history of changes, and store other details.  
 

 
FIGURE 5: LOAD PROCESS 
Reject Records 
During the Load Process, rejected records are transferred, along with the reason for rejection, to the Reject Table associated with the Base Object's staging table for that source system. 
 

Advantage of this feature 

This feature uses a central reject table to maintain reject records for both the Stage and Load processes. Informatica knowledge base article KB 90407 provides good insight into the reject-handling process. 
 
 Tokenize Process 
The Tokenize Process generates match tokens and stores them in a match key (strip) table associated with the base object. The match tokens are subsequently used by the Match Process to identify suspects for matching. 
 

 
FIGURE 6: TOKENIZE PROCESS 
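For intuition only, here is a toy match-key generator in plain Python (the Hub's actual match keys are produced by its fuzzy match engine, not by code like this; the token length and cleanup rules below are invented for the example). Records whose keys collide land in the same bucket and become candidates for the Match Process.

def match_token(name, length=4):
    # Toy match key: uppercase, drop non-letters, keep the leading characters
    # so that similar names fall into the same candidate bucket.
    cleaned = "".join(ch for ch in name.upper() if ch.isalpha())
    return cleaned[:length]

print(match_token("Acme Corporation"))   # ACME
print(match_token("ACME Corp."))         # ACME -> same key, so both become suspects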
 
Match & Consolidate Process 
Matching is the process of identifying whether two records are similar or duplicates of each other, using either exact (deterministic) or fuzzy (probabilistic) matching. 
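As a rough analogy in plain Python (the Hub's actual fuzzy scoring is done by its match engine; difflib is used here only to illustrate the exact-versus-fuzzy distinction, and the names and threshold interpretation are assumptions for the example):

from difflib import SequenceMatcher

def exact_match(a, b):
    # Deterministic: values must be identical after simple normalization.
    return a.strip().upper() == b.strip().upper()

def fuzzy_score(a, b):
    # Probabilistic: return a similarity score between 0 and 1.
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()

print(exact_match("Jon Smith", "John Smith"))            # False
print(round(fuzzy_score("Jon Smith", "John Smith"), 2))  # 0.95 -> likely a suspect pair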
Consolidation is the process of consolidating data from matched records into a single master record once the match pairs, or suspects, have been identified by the Match Process. Records are flagged for auto merge if the suspects are sufficiently similar, or queued for manual merge if the suspects are only likely to be duplicates. 
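The sketch below shows one simplified take on trust-based survivorship (plain Python; the per-column trust numbers and record shape are invented for the example and are not the Hub's actual data structures): for each column, the golden record keeps the value from the source with the highest trust.

def consolidate(records):
    # Build a golden record by keeping, per column, the value from the
    # source whose trust score for that column is highest.
    columns = {c for rec in records for c in rec["values"]}
    golden = {}
    for column in sorted(columns):
        best = max((r for r in records if column in r["values"]),
                   key=lambda r: r["trust"].get(column, 0))
        golden[column] = best["values"][column]
    return golden

crm = {"source": "CRM", "trust": {"NAME": 80, "PHONE": 40},
       "values": {"NAME": "ACME CORP", "PHONE": "555-1111"}}
erp = {"source": "ERP", "trust": {"NAME": 60, "PHONE": 90},
       "values": {"NAME": "ACME CORPORATION", "PHONE": "555-2222"}}
print(consolidate([crm, erp]))   # {'NAME': 'ACME CORP', 'PHONE': '555-2222'}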

 
FIGURE 7: MATCH & CONSOLIDATE PROCESS 
Publish 
The Publish process delivers the best version of the truth to other systems or processes through outbound JMS message queues. 
CLEANSE FUNCTIONS 
Source System Cleanse Maps
One of the primary methods to bring data into the Informatica MDM Hub from
source systems is through the land, stage, and load processes. During the stage
process, the data can be modified by data standardization routines, such as
routines that remove erroneous characters, and other data quality procedures.
This process is referred to as data cleansing.

Cleanse maps define the rules for how to move data from landing tables to
staging tables. These maps are based on one landing table and one staging
table. Each field in the landing table can be directly mapped to a field in the
staging table or the field can be modified by mapping it through one or more
cleanse functions before mapping it to the staging table field. A typical cleanse
routine found on a cleanse map might remove excess spaces (trim) from the
data, or change all of the characters in a string field to uppercase.
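For intuition, a typical trim-and-uppercase routine looks like this in plain Python (illustrative only; inside the Hub this would be configured as cleanse functions on the map rather than written as code):

def cleanse_name(value):
    # Typical cleanse routine: remove excess spaces and uppercase the string.
    return " ".join(value.split()).upper()

print(cleanse_name("  acme   corp  "))   # ACME CORP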

These maps are also called stage maps and are executed using a batch process
by running the stage job associated with the specific staging table used by the
map. The maps can also be called in real-time through calls to the SIF API.

The cleanse maps defined in the MDM Hub sample Operational Reference
Store are classified based on the following systems:
 Legacy
 SFA
 Lookup
 Product
 ODI
Some cleanse maps are simple while others are complex, using multiple
custom cleanse functions.

Cleanse Functions
Informatica MDM Hub comes with a standard set of cleanse functions that
consist of common string manipulation functions, logical operations, data
conversion functions, and pre-built cleanse lists (a specific type of cleanse
function). You can combine these standard functions into custom functions
that perform data manipulation specific to the data cleanse requirements of a
particular source system.

A number of custom cleanse functions have been created as part of the sample
ORS configuration. These functions can be found in the Cleanse Function tool.
They are stored in the Custom, CustomUtilityLibrary, General Processing, and
IDD Cleanse and Validation Library folders. There are also some sample cleanse
lists in the Noise Filters folder.

In addition to these custom functions, the sample ORS contains cleanse function libraries (folders) for third-party data quality tools (for example, Informatica Address Verification) and third-party data service providers. You can access the functionality of these products using special adapters developed on the Informatica MDM Open Cleanse architecture, which allows plugging in third-party data quality tools.

These third-party cleanse adapters can be purchased separately, as needed, from Informatica. For more details, please contact your Informatica Account Manager.

NOTE: The cleanse functions based on these products will not function unless the underlying third-party software and the Informatica MDM cleanse adapter are correctly installed and configured on your system.

Cleanse Address - NA
This cleanse graph function is used to cleanse North American (NA) addresses. It combines the Informatica Address Verification cleanse function with other cleanse functions to create a complex function that is used as a component of the address cleanse maps. The function can also be called as a standalone service through the SIF API.

Parse Phone Number - NA


This cleanse graph function parses phone numbers into the different components of a North
American phone number.

Informatica MDM log files used for troubleshooting 

Solution 

Informatica MDM writes to a number of different log files, depending on the process that is running. If you encounter a problem when running a process, Informatica MDM Support will likely ask you for one or more of the log files to assist with troubleshooting. 
Log files typically requested by Siperian Support 
If you have a failure in Staging, Match or Tokenize, Siperian Support will
usually ask you for 
 Siperian cleanse-match server log, 
 Siperian database debug log 
Support might also ask for 
 SQL Loader logs, 
 Application Server logs (i.e., WebLogic/WebSphere/JBOSS logs) 
If you have a failure in Load, Merge, Auto merge or Unmerge, Siperian
Support will usually ask you for 
 Siperian database debug log 
Support might also ask for 
 Siperian cleanse-match server log and
WebLogic/WebSphere/JBOSS logs (for Auto merge failure occurring
in BuildMatchGroups (BMG) process) 
If you have a failure in a SIF API call or BDD, Siperian Support will usually ask
you for 
 Siperian hub server log, 
 Siperian database debug log 

Support might also ask for 
 Siperian cleanse-match server log, 
 Application Server logs (i.e., WebLogic/WebSphere/JBOSS logs) 
If you have an error reported in the Siperian console, Siperian Support will
usually ask you for 
 Siperian console log, 
 Siperian hub server log 
Support might also ask for 
 Siperian database debug log, 
 Application Server logs (i.e., WebLogic/WebSphere/JBOSS logs) 
  

Queries and Packages Overview 


 
In MDM Hub, a query is a request to retrieve data from the Hub Store. The
request is in the form of an SQL SELECT statement. When you run a query, the
MDM Hub sends the query SQL statement to the database that contains the
Hub Store, and the database returns the results of the query. A package is a
public view of the results of the query. 
 
Generic Queries 
A generic query is a type of query that you define by using a query wizard and
building blocks. No SQL knowledge is required. The Queries tool generates an
SQL SELECT statement from the selected building blocks. The generated
SELECT statement works with all supported databases. 
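To make the idea concrete, the sketch below shows the kind of SELECT a query wizard assembles from building blocks (plain Python; the table and column names C_PARTY, ROWID_OBJECT and so on are illustrative assumptions, not taken from any particular Hub Store):

def build_select(table, columns, criteria=None):
    # Assemble a simple SELECT statement from query building blocks.
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    if criteria:
        sql += " WHERE " + " AND ".join(criteria)
    return sql

print(build_select("C_PARTY",
                   ["ROWID_OBJECT", "FIRST_NAME", "LAST_NAME"],
                   ["LAST_NAME = 'SMITH'"]))
# SELECT ROWID_OBJECT, FIRST_NAME, LAST_NAME FROM C_PARTY WHERE LAST_NAME = 'SMITH'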
 
Custom Queries 
A custom query is a type of query where you define your own SQL SELECT
statement. When you want to use database-specific SQL syntax and grammar,
create a custom query. 
 
Query Groups 
A query group is a user-defined container for queries. Use query groups to
organize your queries so that they are easier to find and run. 
TIP: If you create query groups before you create queries, you can select a
group when you create a query. 
 
Packages 

A package is a public view of the results of a query. Data stewards use
packages with the Data Manager and Merge Manager tools. You can also use
packages with external applications. 
  

QUESTIONS 
 
Q1)  What is MDM? 

MDM is an acronym for Master Data Management. It is used to manage the critical data of a business organization by linking it to one single file, called a master file. This file acts as a single point of reference for making important business decisions. When done properly, MDM acts as a central repository for data shared between various departments. 
 
Q2) Describe all the biggest management and technical challenges in
adopting MDM? 

For technical folks in data governance, it is always a challenge to sell the project and secure funding. Management always looks for ROI; they require MDM to be tied to quantifiable benefits that business leaders recognize, such as dollar amounts around ROI. 
Return on investment (ROI) is a mathematical formula that investors can use to
evaluate their investments and judge how well a particular investment has performed
compared to others. An ROI calculation is sometimes used with other approaches to
develop a business case for a given proposal 
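For illustration, the calculation usually takes the form ROI (%) = (net gain from the investment / cost of the investment) x 100; for example, an MDM programme that costs $2 million and yields $3 million in benefits has a net gain of $1 million and an ROI of 50%.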
 
Q3) What is the use of hosting such conferences in helping MDM? 

Hosting such conferences is very important. For example, the budgets of the pharmaceutical and financial services industries are increasing. Forrester predicted that by 2010 MDM would be a $6 billion market, a sixty percent growth rate over the $1 billion MDM market of the previous year. Gartner also forecasted that seventy percent of the 2,000 largest global companies would have an MDM solution by the year 2010. 
  
Q4) What is data warehousing? 
 Data Warehousing (DW) is a method of gathering and managing data from multiple sources to provide organizations with valuable insights. A typical data warehouse is mainly used to integrate and analyze data from multiple sources. Data warehousing is the central source for BI tools and for visualizing data.
 Data Warehousing is associated with various components and technologies that enable organizations to use their data in a systematic way. It stores all the business information in an electronic format for further analysis rather than for transaction processing. The data warehouse transforms the data into understandable information and makes it available to business users. 
Q5) Explain Various stages of Data Warehousing? 

There are four fundamental stages of Data Warehousing. They are: 

 Offline Operational Databases: This is the first stage, in which a data warehouse system is developed by copying the data of the operational system onto an offline server. This process does not impact or disturb the actual performance of the operational system.

 Offline Data Warehouse: In this stage, the operational data is updated into the warehouse on a regular basis, such as daily, weekly or monthly, and the data is stored in an integrated, report-oriented way.

 Real-Time Data Warehouse: In this stage, the data warehouse is updated whenever an event or transaction happens. A transaction or event could be an order, a booking, a delivery, and so on.

 Integrated Data Warehouse: In this stage, the transactions and activity generated by the warehouse pass back through the operational system and are helpful in the daily functioning of the business.

Q6) What is Informatica PowerCenter? 

Informatica PowerCenter is an enterprise extract, transform, and load (ETL) tool used to build data warehouses for an organization. It is a mature product from Informatica Corporation that loads data into a centralized target such as a data warehouse. Informatica PowerCenter extracts data from multiple data sources, transforms it, and loads that data into target files or tables. It provides the foundation for major data integrations with external parties. 
 
Q7) What is mapping? 

Data mapping is the process of mapping fields from data sources to a target file or location. There are multiple data mapping tools available that help developers map data from a source file to a target file. 
 

Q8) What is Mapplet? 

A Mapplet is a reusable object that contains a group of transformations and allows us to reuse the transformation logic in different mappings. 
 
Q9) List the different components of Informatica PowerCenter? 

Following is the list of components available in Informatica PowerCenter. 


1.  PowerCenter Repository 
2.  PowerCenter Client  
3.  Integration Service  
4.  Data Analyzer 
5.  PowerCenter Repository Reports 
6.  PowerCenter Domain  
7.  Administration Console 
8.  Repository Service 
9.  Web Services Hub  
10. Metadata Manager 
 
  
Q10) What is Data Mining? 

Data mining is the process of analyzing huge sets of data to find the hidden, valuable insights in them. It allows users to find previously unknown patterns and relationships between various elements in the data. The insights extracted through data mining help in fraud detection, marketing, scientific discovery, and so on. Other names for data mining are knowledge extraction, knowledge discovery, information harvesting, and data/pattern analysis. 

 
Q11) What is Fact Table? 

In data warehousing, a fact table contains the metrics, measures or facts about a business process. The fact table is located at the center of a snowflake schema or star schema, surrounded by multiple dimension tables. A fact table typically contains two kinds of columns: those that contain facts and those that are foreign keys to dimension tables. 

Q12) Name the foreign key columns in dimension and fact table? 
The Dimension Table's foreign keys are the primary keys of entity tables. The Fact Table's foreign keys are the primary keys of dimension tables. 
 

Q13) What is the Dimension Table? 


It is a table in the star schema of a data warehouse. While building Data
Warehouses dimensional data models use dimension tables and facts. The
dimension table is a compilation of hierarchies, categories, and logic. 
 

Q14) List the various methods of loading dimension tables? 

Following are the two different paths to load data in dimension tables: 

1. Conventional (slow): Before loading data into a dimension table, all the keys and constraints are validated against the data. This process maintains data integrity but is time-consuming. 

2. Direct (fast): In this process, all the constraints and keys are disabled before loading the data into dimension tables. The constraint and key validation is done once the data loading is complete. If any set of data is found to be invalid or irrelevant, that data is skipped from the index and from all future processes as well. 
  

Q15) List the objects that are not allowed to be used in Mapplet? 

Following are the objects you can’t use in Mapplet: 

1. COBOL source definitions 
2. Normalizer transformations 
3. Joiner transformations 
4. Pre- or post-session stored procedures 
5. Non-reusable Sequence Generator transformations 
6. XML source definitions 
7. Target definitions 
8. IBM MQ source definitions 
9. PowerMart 3.5-style Lookup functions 
 

Q16) Explain the different ways to migrate from one environment to another
in Informatica? 

Following are the different ways to migrate to different environments in Informatica: 

1. By exporting the repository and deploying it into a new environment 
2. By copying objects/folders 
3. By exporting every mapping to XML and deploying it in the new environment 
4. By using deployment groups in Informatica 

 

Q17) Explain the difference between mapping variable and mapping parameter? 

1. A mapping variable is dynamic in nature and changes across sessions. The Integration Service saves the value of a mapping variable in the repository on successful completion of every session, and that saved value is used the next time the session runs.

2. A mapping parameter, unlike a mapping variable, holds a static value. You define the parameter before executing a session, and the value you give remains the same even after successful completion of the session. While executing the session, PowerCenter validates the value from the parameter and keeps the same value till the end of the session. Whenever you run the session, its value is read from the parameter file.
 

Q18) What are the different ways to delete the duplicate records from
Informatica? 

1. By selecting the Select Distinct option in the Source Qualifier 
2. By overriding the SQL query in the Source Qualifier 
3. By using an Aggregator transformation and grouping by all fields 
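As a language-neutral illustration of what the distinct option or grouping on all fields achieves, the plain-Python sketch below keeps the first row seen for each key combination (the row shape and column names are invented for the example):

def distinct(rows, key_cols):
    # Keep only the first row seen for each combination of key columns.
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

rows = [{"CUST_ID": 1, "NAME": "ACME"},
        {"CUST_ID": 1, "NAME": "ACME"},
        {"CUST_ID": 2, "NAME": "GLOBEX"}]
print(distinct(rows, ["CUST_ID", "NAME"]))   # two distinct rows remain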
 

Q19) What are the various repositories that can be created using the
Informatica Repository Manager? 

1. Standalone Repository: This is an individual repository that functions on its own and is not related to any other repository.

2. Global Repository: This is a centralized repository in a domain. It can hold objects that are shared across the different repositories of the domain. The objects are shared using global shortcuts. 

3. Local Repository: This is a repository that resides within a domain. It can be connected to a global repository and use the objects in the global repository's shared folders through global shortcuts. 
 

Q20) Describe the data movement modes available in Informatica Server? 

We have two data movement modes available in Informatica. PowerCenter decides how to handle character data based on the data movement mode, which you can select in the server configuration. Following are the data movement types: 
 Unicode Mode and ASCII Mode 

 OLAP (Online Analytical Processing) is an application that gathers, manages, presents and processes multidimensional data for management and analysis purposes. 
  

Q21) What is OLAP & OLTP? 

OLAP (Online Analytical Processing) is a powerful technology that works behind the scenes to support many business intelligence applications. An OLAP application gathers, manages, transforms and presents multidimensional data for analysis purposes. 

OLTP (Online Transaction Processing) consists of online data and normalized tables. It is designed to store operational data on a continuous basis. It supports day-to-day operations rather than data analysis, and OLTP systems store data one transaction at a time. 
 

Q22) What are the various LOCKs used in Informatica MDM? 

Exclusive Lock: This lock allows only a single user to make changes to the underlying ORS and blocks other users from modifying metadata in the ORS as long as the exclusive lock exists. 

Write Lock: This lock allows multiple users to make changes to the underlying metadata at the same time. 
 

Q23) In Informatica MDM, what is the expiration module of the automatic lock? 

The hub console refreshes the current connection every 60 seconds. Users can release a lock manually. A lock is released automatically when a user switches to another database while holding the lock. When a user terminates the hub console, the lock expires within a minute. 
 

Q24) List the tools that do not need lock-in Informatica-MDM? 

Following are the tools that do not need lock: 

1. Data Manager
2. Merge Manager
3. Hierarchy manager
4. Audit Manager
  

Q25) What are the tools in Informatica MDM that need Lock? 

To make configuration changes to the database of MDM hub master there are
multiple tools that need LOCK. They are: 

1. Message Queues 
2. Users 
3. Databases 
4. Tool Access 
5. Security Providers 
6. Repository Manager 

Q26) List the tables that can be linked to the staging data? 

We have multiple tables that can be integrated with the staging data in MDM.
They are: 

1. Raw Table 
2. Staging Table 
3. Landing Table 
4. Rejects Table 
 

Q27) What is Schema in Informatica MDM? 


The schema is the data model used in a Siperian Hub implementation. In general, the Siperian Hub does not require any specific schema; the schema is defined for each implementation and is independent of the Hub. 
  

Q28) Explain about various components of Siperian Hub in Informatica
MDM? 

The Siperian Hub consists of various components, each of which has been designed to address specific problems. The following are the main components of the Siperian Hub. 

Master Reference Manager (MRM): It works to create the most accurate records by performing tasks such as data cleansing, matching, consolidation, and merging. 

Hierarchy Manager: It builds and manages data and describes the relationships between various records. 

Activity Manager: It performs functions such as master data synchronization and data event evaluation. 
 

Q29) Explain the components of Informatica Hub Console? 

Following are the components of the Informatica Hub Console: 

Design Console: This component is used to configure the solution during deployment and allows ongoing configuration as needs change. 

Data Steward Console: This component is used to review consolidated data and matched data queued for exception handling. 

Administration Console: This component is used to assign role-based security and perform various database administration activities. 
 

Q30) Define the Informatica MDM term base object? 


A base object in MDM is used to define core business entities such as products, employees, customers, and accounts. The base object acts as an endpoint for consolidating data from various systems. The Schema Manager is the only way to define base objects; they cannot be configured directly in the database. 
  

Q31) What do you know about Activity Manager in Informatica MDM? 

Informatica Activity Manager (AM) synchronizes master data, examines data events, and delivers unified views of activity and reference data from varied sources. 

The Activity Manager provides the following features: 

1. It facilitates combining the master data that resides in the Informatica hub with the analytical and transactional data of other systems. 

2. It tracks data modifications in the Informatica MDM hub and in other transactional applications, and any changes made to the data are synchronized across all other systems. 
 

Q32) Explain about Hierarchy Manager (HM) in Informatica MDM? 

The Hierarchy Manager helps you manage the hierarchy data associated with the records you manage in MRM. The applications that provide data to MRM also store relationship data across the master data. Managing these data relationships is highly complex because each application is different and has its own hierarchy. In the same way, every data mart and data warehouse is developed to reflect the relationships needed for specific purposes. 

Q33) Explain various data loading stages in MDM?


 
The following are the various stages in which data is stored into hub stores in a
sequential process. 
1. Land 
2. Stage 
3. Load 
4. Match 
5. Consolidate 
 

SET-2

Informatica MDM is one of the most popular Master Data Management tools; it allows management and organization of data through a single unified platform. The data management software is safe and reliable. Informatica MDM gives insight into relationships with customers, products, and related data, streamlining data management to drive towards success. 
Below is a curated list of Informatica MDM interview questions most frequently asked by interviewers in an Informatica MDM interview process, covering topics from beginner to advanced level. 

Most Frequently Asked Informatica MDM Interview Questions 


 Explain about MDM? 
 Explain to us about various fundamental phases of data
warehousing? 

 Define Mapplet? 
 Explain about OLAP and OLTP? 
 What is Data Warehousing? 
 Define Informatica PowerCenter 
 What is Mapping? 
 Name various objects that can’t be used in a mapplet 
 What is OLTP? 
 What is OLAP? 
 
  
1. Explain about MDM? 
Ans: Master Data Management (MDM) is a methodology that allows an organization to link all its essential data into a single file, called a master file. This file serves as a standard point of reference that helps the organization make crucial business decisions. Master Data Management also acts as a network for sharing data among various teams and firms. 
 

2. What does the term MDM mean? 


The term MDM (Master Data Management) refers to a comprehensive method
that helps an enterprise to link its entire critical data to a single, master file
serving as a common point of reference. Informatica’s Master Data
Management simplifies data sharing across departments and members.  
 

3. List out different components of PowerCenter? 
Ans: The following are the different components of PowerCenter:  
 Metadata Manager 
 PowerCenter Domain 
 Repository Service 
 Administration Console 
 Integration Service 
 PowerCenter Repository Reports 
 PowerCenter Client 
 Web Services Hub 
 Data Analyzer 
 PowerCenter Repository 

4. Explain to us about Data Warehousing? 


Ans: A data warehouse is the primary method of gathering and managing information from various sources to assist firms with helpful insights. Using a data warehouse, you can analyze and integrate data from various sources. A data warehouse is connected with different technologies and components that allow firms to utilize their data in a well-organized process. It collects all the information in digital form for further interpretation, moulds the data into an understandable form, and makes it usable for business users.  
 

5. Explain to us about various fundamental phases of data warehousing? 


Ans: Following are the four fundamental phases of data warehousing. They are:  
 Offline Operational Databases  
 Offline data warehouse 
 Real-time data warehouse 
 Integrated data warehouse 
Now, let us look at each phase of the data warehouse in more detail.  
Offline Operational Databases: This is the initial stage of the data warehouse, which evolved from copying the data of the operational system onto an offline server. This process doesn't impact the performance of the operational system.  
Offline Data Warehouse: In this phase, the data warehouse is updated from the operational data regularly, such as daily, weekly, or monthly, and the data is stored in a homogeneous, report-oriented way.   
Real-time Data Warehouse: In this phase, the data warehouse is updated every time the operational system executes a transaction or event.   
Integrated Data Warehouse: In this last phase, the transactions and activities generated by the warehouse pass back through the operational system and assist the organization in its daily business functioning.  
 
6. What are the most significant technical and management challenges in
adopting MDM? 
 
Ans: Technical folks always face a challenge when selling the project and securing the funds. Management is actively looking for ROI; they need MDM to show quantifiable advantages and profits for their business. 
 
7. What is meant by Dimensional Modelling? 

Ans: Dimensional modelling uses two types of tables, which are distinct from third normal form. In this model, fact tables hold the business measurements and dimension tables hold their context.  
 
 
8. What is meant by dimension table? 
Ans: A dimension table is a compilation of logic, categories, and hierarchies, which a user can traverse through hierarchy nodes. Dimension tables contain the textual descriptions of the measurements stored in fact tables.  

9. Explain various methods to load the data in dimension tables? 


Ans: There are two methods to load the data in dimension tables. Following are the two methods: 
 Conventional (Slow): Before storing data in the dimension table, all constraints and keys are validated against the data. This maintains data integrity and is a time-consuming process.  
 Direct (Fast): In this process, all constraints and keys are disabled before loading the data into the dimension table. Once the loading of the data into the dimension table is complete, it is validated against the constraints and keys. In this process, if any set of data is invalid or irrelevant, the data is skipped from the index and all future operations.  
10. Define fact tables? 
Ans: In a data warehouse, a fact table consists of two kinds of columns: those holding the metrics and measurements of business processes, and foreign keys to the dimension tables. It sits at the center of the star or snowflake schema, surrounded by dimension tables. 
11. Explain the term Mapping? 
Ans: Mapping represents the flow of data between sources and targets. It is a set of source and target definitions linked by transformation objects that define the data transformation rules.  
12. Define Mapplet? 
Ans: A Mapplet is a reusable element that consists of a set of transformations, allowing that transformation logic to be reused in multiple mappings.  
13. Explain to us about Transformation? 
Ans: A transformation is a repository element that generates and modifies data. In a mapping, a transformation represents the operations that the Integration Service performs on the data. The data passes through transformation ports that are linked within the mapping.  
 
14. What is Data Mining? 
Ans: Data Mining is the process of analyzing the vast amount of data to find
valuable hidden insights and compiling it into useful data. It allows the users to
discover unknown relationships and patterns between various elements in the
data. The useful insights extracted from the data might help you in scientific
discovery, marketing, fraud detection, and many more.  
 
15. List out various objects that cannot be used in the Mapplets? 
Ans: Following are the various objects that cannot be used in mapplets:  
 COBOL source definition 
 Normalizer transformations 
 Joiner transformations 
 Post or pre-session stored procedures 
 Non-reusable Sequence Generator transformations 
 XML source definitions 
 Target definitions 
 IBM MQ source definitions 
 PowerMart 3.5-style Lookup functions 
16. What are the foreign columns in fact and dimensional tables? 

Ans: In a fact table, the foreign keys are the primary keys of the dimension tables, and in a dimension table, the foreign keys are the primary keys of the entity tables.   
 
17. Explain different ways used in Informatica to switch one environment to
another? 
Ans: Following are the different ways used in Informatica to move from one environment to another: 
 By copying folders/objects 
 By exporting the repository and deploying it into a new environment 
 By exporting every mapping to XML and importing it into the new environment 
 By using Informatica deployment groups 
 
18. Differentiate Mapping variables and Mapping parameters? 
Ans:  
Mapping Variable: It is dynamic and its value changes during the session. PowerCenter reads the initial value of the mapping variable and stores the latest value in the repository after the completion of each session, and that saved value is used when you next run the session.  
Mapping Parameter: It is static, and you need to define it before running the session. The value you give remains the same even after the session completes. While running the session, the Integration Service validates the value from the parameter file and keeps the same value until the end of the session.  
19. Explain various ways to eliminate duplicate records from Informatica? 
Ans: Following are the different ways to delete the duplicate records from
Informatica:  
 By selecting the Select Distinct option in the Source Qualifier 
 By overriding the SQL query in the Source Qualifier 
 By using an Aggregator and grouping by all fields 
20. How to find invalid mappings in a folder? 
Ans: Using the following query, you can find invalid mappings in a folder: 
SELECT MAPPING_NAME FROM REP_ALL_MAPPINGS 
WHERE SUBJECT_AREA = 'YOUR_FOLDER_NAME' 
AND PARENT_MAPPING_IS_VALID <> 1 
 
21. Explain different repositories that can be created using the Informatica
Repository Manager? 

Ans: Here are the various types of repositories that you can create in
Informatica.  
 Standalone Repository: It is a single repository that operates individually and is not associated with any other repository.  
 Global Repository: It is a centralized repository in a domain that holds objects shared across the various repositories of the domain.  
 Local Repository: A local repository resides within a domain. It can connect to the global repository through global shortcuts and use the objects in the global repository's shared folders. 
 
22. Explain different data movement modes that are available in Informatica
Server? 
Ans: Data movement modes provide a set of instructions to handle the process
of character data. Based on that, you can determine the data movement from
Informatica server configuration settings. Here are the two data movement
modes: 
 ASCII Mode 
 Unicode Mode 
23. Explain different types of Locks that are used in Informatica MDM 10.1? 
Ans: Following are the two types of locks used in Informatica MDM 10.1: 
 Exclusive Lock: allows a single user to access and make changes to the underlying metadata. 
 Write Lock: allows multiple users to access and make changes to the underlying metadata at the same time.  
24. List out the tools that do not require Lock-in Informatica MDM? 
Ans: Following are the various tools that do not require a lock: 
 Hierarchy Manager 
 Data Manager 
 Merge Manager 
 Audit Manager 
 
25. List out the tools that require Lock in Informatica MDM? 
Ans: Following are the tools that require a lock to make configuration changes to the MDM Hub Master database in Informatica MDM: 
 Message Queues 
 Users 
 Databases 
 Tool Access 
 Security Providers 
 Repository Manager 

 
 
 
26. Explain about OLAP and OLTP? 
Ans:  
 Online Analytical Processing (OLAP) is a technology that works in the background of an application to support many Business Intelligence (BI) applications. It gathers, manages, and transforms multidimensional data for analysis.  
 Online Transaction Processing (OLTP) is a process that maintains operational data in normalized tables. OLTP applications are designed to modify data and perform the day-to-day operations of the business, one transaction at a time, rather than to analyze the data.  
27. What is the expiration module of the automatic lock in Informatica MDM? 
Ans: The hub console refreshes the current connection every 60 seconds. A lock can be released manually by the user. If a user switches to another database while holding a lock, the lock is released automatically. The lock expires within a minute if the user terminates the hub console. 
 
28. Explain various components of the Informatica hub console? 
Ans: Following are the components of Informatica hub console: 
 Design Console: This element helps in configuring the solution during the deployment stage and allows changes according to the requirements in the ongoing configuration.  
 
 Data Steward Console: This element is used to review consolidated data and also check the data queued for exception handling.  
 
 Administration Console: This component is used to assign various
database administrative activities and role-based security.  
 
 
29. List the tables that can be linked to the staging data? 
Ans: Following are the multiple tables that can be integrated with the staging data in MDM. They are: 
 Raw Table 
 Staging Table 
 Landing Table 
 Rejects Table 
30. Tell us about various loading phases in MDM? 

Ans: Following are the various stages in which data is stored into hub stores in
a sequential process. 
 Land 
 Stage 
 Load 
 Match 
 Consolidate 
31. Tell us about the Informatica PowerCenter? 
Ans: Informatica PowerCenter is a data integration software which is
developed by Informatica Corporation. It is a widely used ETL tool (Extract,
Transform, Load) used to build organization data warehouses. The components of Informatica PowerCenter assist in extracting data from multiple sources, transforming the data according to the business requirements, and loading it into targeted data warehouses. 
 
32. Describe all the biggest management and technical challenges in adopting
MDM? 
Some of the most difficult challenges in adopting MDM include:  
 Model Agility - the selected data management model must be
agile to suit the business operations and for seamless data
integration. 
 Data Governance - a strong data governance strategy must be
used to identify, capture, measure, and rectify data quality issues in
the source system.  
 Data Standards - the data standard set for the master data must
also be suitable for all other data types in the organization. 
 Data Integration - accurate data integration policies must be set to
avoid errors in the data integrations process which may result in loss
of data during the transfer. 
 Data Stewardship - data stewardship is an important step towards
maintaining data quality. 
33. What is Data Warehousing? 
Data warehouses are a trove of information updated periodically. Data
warehouses play a key role in the decision making of any business as it
contains all the data relating to processes, transactions, etc., of a company and
serves as a deciding factor. Data warehousing allows data analysts to perform
analysis, and execute complex queries on structured data sets and also
conduct data mining. Data warehouses help identify the current position of the
business by comparing all factors.  
34. Define Dimensional Modeling? 

Dimensional modeling is an important data structure technique used to optimize data storage in data warehouses. Dimensional modeling includes fact tables, which hold the measurements of the business, and dimension tables, which hold the dimensions and other context for those measurements. 
 
35. Describe various fundamental stages of Data Warehousing? 
The data warehousing stages play an integral role in determining the changes
in data in the warehouse. The fundamental stages of data warehousing are:  
 Offline Operational Databases - it is the first fundamental stage in
data warehousing. The database development of an operational
system to an offline server takes place in this stage by copying the
databases. 
 Offline Data Warehouse - the data warehouses are updated
periodically from the operational systems and the data is stored in a
data structure that is reporting-oriented.  
 Real-time Data Warehouse - data warehouses are updated based
on the criteria of a particular transaction or event. A transaction is
executed every time by an operational system.  
 Integrated Data Warehouse - It is the last stage of data
warehousing. The activity or transaction is reversed back into the
operational system and the generated transactions are ready for an
organization's regular use.  
 
36. Define Informatica PowerCenter. 
Informatica PowerCenter is an ETL (Extract, Transform, and Load) tool used for
enterprise data warehouses. PowerCenter helps extract data from the selected
source, transforms it, and loads into the chosen data warehouse. Informatica
PowerCenter consists of client tools, a server, a repository, and a repository
server as its core components. It executes tasks generated by the workflow
manager and allows mapping using the mapping designer.  
 
37. Name various components of Informatica PowerCenter. 
There are many components that form the foundation of Informatica
PowerCenter. They include:  
 PowerCenter Repository 
 PowerCenter Domain 
 PowerCenter Client  
 Administration Console  
 Integration Service  
 Repository Service  

 Data Analyzer  
 Web Services Hub 
 PowerCenter Repository Reports  
 Metadata manager 
 
38. What is Mapping? 
Mapping can be described as a set of target definitions and sources that are
connected with transformation objects that define data transformation rules.
Mapping represents the flow of data between the targets and sources. 
 
39. What is a Mapplet? 
A Mapplet is a reusable object consisting of transformations which can be
reused in multiple mappings. A Mapplet can be created in the Mapplet
Designer. 
40. What is Transformation? 
Informatica transformations are repository objects capable of reading, modifying, or passing data to defined target structures such as tables, files, etc. Transformations represent a set of rules determining how data flows and is loaded into targets. 
 
41. What is Data Mining? 
Data Mining is also known as knowledge discovery in data (KDD) for the reason
that it involves sorting through and performing complex analysis procedures
on multiple data sets to discover underlying information crucial for business
growth. 
 

42. What is a Fact Table? 


In data warehousing, a fact table is present at the center of a star schema and contains quantitative information relating to the metrics, measurements, or facts of a business process. 
 
43. What is a Dimension Table? 
A dimension table is a part of the star, snowflake, or starflake schema in data warehousing. Dimension tables contain the descriptive attributes of a fact and are connected to the fact table. Dimension tables form an integral component of dimensional modelling.  
 
44. How to connect the foreign key columns in dimension and fact table. 

Each dimension table must contain a primary key that corresponds to a foreign key in the fact table; the fact table therefore contains a foreign key column for each of its related dimension tables. 
 
 
 
 
45.Describe different methods to load dimension tables. 
The methods of loading dimension tables are:  
 Conventional Loading - all the table constraints and keys are
checked before loading the data.  
 Direct Loading - all the constraints are disabled for direct data
loading. This loading process checks table constraints post-data
loading and only indexes the qualified data.  
46. Name various objects that can’t be used in a mapplet. 
Objects that cannot be used in a mapplet include: 
 COBOL source definition 
 Target definitions 
 IBM WMQ source definitions  
 XML source definitions 
 Joiner transformations  
 Normalizer transformations  
 Non-Reusable sequence generator transformations  
 PowerMart 3.5-style Lookup functions 
 Post or pre-session stored procedures  

47. Define different ways used in Informatica to migrate from one environment to another. 
The following are the ways to migrate from one environment to another in
Informatica:  
 The repository can be imported or exported to the new
environment  
 Using Informatica deployment groups  
 Copying folders or objects  
 All mappings can be exported to XML and later imported to a new
environment 
 
48. What are the ways for deleting duplicate records in Informatica? 
Duplicate records can be deleted from Informatica by: 

 Using select distinct in source qualifier  
 Using an Aggregator and grouping by all fields 
 Overriding SQL query in the source qualifier  
 
 
 
 
49. Differentiate between variable and mapping parameters. 
 A mapping parameter holds a constant value before running the
session and maintains the same constant value throughout the
complete session. The value of the mapping parameter can be
updated through the parameter file. 
 A mapping variable does not hold a constant value. The value in a
mapping variable can change through the session. Value of the
mapping variable is stored by the Informatica server in the repository
at the end of each successful session and the same value is used in
the next session.  
50. Describe various repositories that can be generated using Informatica
Repository Manager. 
There are four types of repositories that can be generated using Informatica
Repository Manager: 
 Global Repository - the global repository acts as an information
hub and stores common objects that can be used by multiple
developers through shortcuts. The objects may be operational,
application source definitions, reusable transformations, mapplets,
and mappings.  
 Local Repository - local repositories are usually used in the case of
development. A local repository facilitates creating shortcuts to
objects in shared folders in the global repository. These objects may
include source definitions, lookups, common dimensions, and
enterprise standard transformations. Copies of the objects can also
be created in non-shared folders. 
 Version Control - versioned repositories store multiple versions of
an object and each version acts as a separate object with individual
properties. PowerCenter’s version control feature allows developing,
testing, and deploying metadata into production. 
 Standalone Repository - a standalone repository is not related to
any other repositories and functions by itself.  
 
 

 
51. How to find all the invalid mappings in a folder? 
The invalid mappings in a folder can be found using the below mentioned
query: 
SELECT MAPPING_NAME FROM REP_ALL_MAPPINGS 
WHERE SUBJECT_AREA = 'YOUR_FOLDER_NAME' 
AND PARENT_MAPPING_IS_VALID <> 1 
 
52. Name various data movement modes in Informatica. 
Data movement mode enables the Informatica server to handle the character
data. The data movement modes can be selected in the Informatica server
configuration settings. There are two modes of data movement in
Informatica:  
 ASCII mode and  
 Unicode mode 
 
53. What is OLTP? 
OLTP is the abbreviation of Online Transaction Processing. OLTP involves
capturing, storing and processing data from multiple transactions in real-time.
All the transaction data is stored in a database. 
 
54. Describe the parallel degree of data loading properties in MDM. 
In Informatica, the parallel degree of data loading properties clarifies the
degree of parallelism set on the base object table and other related tables.
Though it does not affect all the batch processes, it has a significant effect on
the performance when used. The use of parallel degree depends on the
number of CPUs on the database and available memory. The default
parallelism value is 1.  
 
55. Explain various types of Lock used in Informatica MDM 10.1. 
Informatica MDM 10.1 has two types of LOCK: 
 Exclusive LOCK - Exclusive LOCK allows only one user to make
changes to the underlying operational reference store.  
 Write LOCK - Write LOCK allows multiple users to make changes to
the underlying metadata, at the same time. 
 
56. What is the expiration module of automatic lock-in Informatica MDM? 
The hub console is refreshed in the current connection every minute i.e., every
60 seconds. A lock can be manually released by a user. If a user switches to

another database while holding a lock, the lock is released automatically. If a
user terminates the hub console, the lock expires after one minute.  
 
57.Name the tool which does not require Lock in Informatica MDM. 
Tools which do not require Lock in Informatica include:  
 Merge manager 
 Audit manager 
 Data manager and  
 Hierarchy manager 
58. Name various tools that require LOCK in Informatica MDM. 
In Informatica, some tools require LOCK to make configuration changes to the
database of the Hub Master in MDM. These tools are: 
 Tool Access 
 Message Queues 
 Security Providers  
 Databases 
 Users, and  
 Repository Manager 
59. Name the tables that are linked with staging data in Informatica MDM. 
The tables linked with staging data in Informatica are: 
 Raw Table 
 Landing Table 
 Rejects table, and  
 Staging Table 
 
60. What is OLAP? 
OLAP (Online Analytical Processing) software performs multidimensional
analysis on large volumes of data. It collects, processes, manages, and presents
data for analysis and management.  
 
61. What are the processes involved in Informatica MDM? 
The data from different sources undergoes complex processing and the
processes in Informatica include: 
 Landing - the data is acquired from the source system and pushed
into the MDM landing tables.  
 Staging - all the data in the landing tables is cleansed,
standardized and then pushed into the MDM staging tables.  
 Load - the data from the staging table is collected and loaded into
the BO table. 

 Tokenization - the tokenization process is used after the
configuration of match rules to generate match tokens.  
 Match - the match process plays an integral role in helping match
the records. 
 Merge or Consolidation - all the records that have been matched
are consolidated during the merge process. 
62. What is a stage process? 
The stage process involves the transfer of source data from the landing tables
to the stage tables. The stage process is completed using the stage mapping
between the landing table and the stage table. Data cleansing and
standardization is also done during the stage process.  
 
