Informatica Fundamentals 5
1. Introduction
Organizations typically run a number of ERP, CRM, SCM and Web application
implementations and are hence burdened with the maintenance of these
heterogeneous environments. To address existing and evolving integration
requirements, organizations need a reliable and scalable data integration
architecture so that individual projects can build value on one another.
Informatica provides a complete range of tools and data services needed to
address the most complex data integration projects.
2. Purpose and Intended Audience
The purpose of this document is to provide an overview of the architecture of
Informatica, its features, how it works, and the advantages it offers
vis-à-vis other data integration tools.
This document is intended as reference material for members of the ETL team,
to give them an initial understanding of the architecture, features and
working of Informatica. The Case Study provided herein should help the reader
gain a good working knowledge of the application.
3. Assumptions:
To follow this document, the reader should have a sound knowledge of Data
Warehousing concepts and some exposure to SQL as a database language.
Knowledge of ODBC and basic networking is essential for installing
Informatica, and knowledge of Unix and its shells is helpful for Unix-based
servers.
4. Informatica in the Data Warehousing Scenario
a) What is a Data Warehouse?
A Data Warehouse is a Subject Oriented, Integrated, Non-volatile, and Time
Variant repository of data that is generally used for querying and analyzing
past trends to support management decisions for the future.
A Data Warehouse can be a relational database, multidimensional
database, flat file, hierarchical database, object database, etc.
Please refer to the following for more information on Data Warehousing concepts:
http://www.dwinfocenter.org/
DWH_Material_Presentation.ppt
Requirement Gathering
The project team gathers the end users' reporting requirements. The
remainder of the project is dedicated to satisfying these requirements.
Data Modeling
The foundation of the data warehousing system is the data model. The first
step in this stage is to build the Logical data model based on the user
requirements and the next step would be to translate the Logical data
model into a Physical data model.
Reporting: Design and develop the reports and make them available to the end
users, thereby bringing value to the Data Warehouse.
Designer: The Designer has five tools that are used to analyze sources,
design target schemas and build the Source to Target mappings. These are:
Source Analyzer: Used to import or create source definitions.
Warehouse Designer: Used to import or create target definitions.
Mapping Designer: Used to create mappings that will be run by the
Informatica Server to extract, transform and load data.
Transformation Developer: Used to develop reusable transformations that can
be used in mappings.
Mapplet Designer: Used to create sets of transformations, referred to as
Mapplets, which can be used across mappings.
c) Informatica Server:
The Informatica Server reads the mapping and session information from the
repository. It extracts data from the mapping sources, holds it in memory,
applies the transformation rules and loads the transformed data into the
mapping targets.
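The extract, transform and load sequence described above can be pictured with a minimal Python sketch. This is only an illustration of the flow, not how the Informatica Server is implemented; the function run_session, the sample rows and the NAME column are hypothetical.

```python
def run_session(source_rows, transformations, target):
    """Illustrative ETL flow: extract into memory, transform, then load."""
    rows = list(source_rows)            # extract: hold the source rows in memory
    for transform in transformations:   # transform: apply each rule in order
        rows = transform(rows)
    target.extend(rows)                 # load: write the result to the target
    return len(rows)


def trim_and_upper(rows):
    """Hypothetical transformation rule: tidy the code and name columns."""
    return [
        {"ISO_CTRY_COD": r["ISO_CTRY_COD"].upper(), "NAME": r["NAME"].strip()}
        for r in rows
    ]


# Hypothetical staging data standing in for a mapping source.
stg_country = [
    {"ISO_CTRY_COD": "in", "NAME": " India "},
    {"ISO_CTRY_COD": "us", "NAME": "United States"},
]

ecif_country = []  # stands in for a mapping target
loaded = run_session(stg_country, [trim_and_upper], ecif_country)
```

In this sketch the list of transformation functions plays the role of the mapping, and the call to run_session plays the role of a session run.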
Connectivity:
Informatica uses a network protocol, native drivers or ODBC for connectivity
between its various components. The connectivity details are as shown in the
diagram above.
6. Setting up Informatica:
Go to Start > Settings > Control Panel
Go to Administrative Tools > Data Sources (ODBC)
7. Case Study
A Transformation is a repository object that generates, modifies, or passes data.
The Transformations provided by the Designer in Informatica are explained
with the aid of a mapping, Map_CD_Country_code. (Explained in blue)
The mapping is present in the cifSIT9i repository of the SIT machine, under the
folder Ecif_Dev_map.
Objective: The mapping Map_CD_Country_code has been developed to extract data
from the STG_COUNTRY table and move it into the ECIF_COUNTRY and
TRF_COUNTRY target tables.
a) Source Definition:
The circled area shows the location of the object that the shortcut references.
In the above example, the object referenced by the shortcut is in the cifSIT9i
repository under the Ecif_dev_def folder, and the object name is STG_COUNTRY.
All fields from the Source are moved into the Source Qualifier.
*For information on the Naming Standard, please refer to the document embedded below:
Informatica_ETL_Naming_Conventions.doc
Note: The naming standards provided in the document are generic standards that
CAN be followed while designing a mapping.
What are the advantages of having a Shortcut?
The following are the main advantages of having a Shortcut:
i. Maintenance: If all instances of an object have to change, the original
repository object is the only object that has to be edited, and all shortcuts
to the object automatically inherit the changes.
ii. Control: Repository users can be restricted to a set of predefined metadata
by asking them to incorporate the shortcuts into their work instead of
developing repository objects independently.
iii. Space: Space can be saved in a repository by keeping a single repository
object and using shortcuts to that object, instead of creating copies of the
object in multiple folders.
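The maintenance advantage can be illustrated by analogy with a reference versus a copy in Python. This is only an analogy and not how the repository stores shortcuts; the dictionary layout and the CTRY_NAME and REGION_COD columns are hypothetical.

```python
import copy

# Hypothetical stand-in for a repository folder holding one source definition.
repository = {
    "Ecif_dev_def": {
        "STG_COUNTRY": {"columns": ["ISO_CTRY_COD", "CTRY_NAME"]},
    }
}

original = repository["Ecif_dev_def"]["STG_COUNTRY"]

shortcut = original                      # a shortcut points at the original object
copied = copy.deepcopy(original)         # a copy is an independent object

# Edit the original definition once.
original["columns"].append("REGION_COD")

shortcut_columns = shortcut["columns"]   # sees the new column automatically
copied_columns = copied["columns"]       # still has the old definition
```

The shortcut picks up the change without any further work, while every independent copy would have to be edited separately.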
For information on creating and working with Shortcuts, refer to the Informatica
Designer Help.
b) Source Qualifier (SQ_Shortcut_To_STG_COUNTRY):
i. The Source Qualifier is an Active transformation.
ii. The differences between an Active and a Passive transformation are as
given below:
Active Transformation                     Passive Transformation
An Active Transformation can change the   A Passive Transformation does not change
number of rows that pass through it.      the number of rows that pass through it.
Ex.:                                      Ex.:
Advanced External Procedure               Expression
Aggregator                                External Procedure
ERP Source Qualifier                      Input
Filter                                    Lookup
Joiner                                    Output
Normalizer                                Sequence Generator
Rank                                      Stored Procedure
Source Qualifier                          XML Source Qualifier
Router
Update Strategy
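The distinction can be shown with a small Python sketch, under the assumption that a list of dictionaries stands in for the rows flowing through a mapping. The functions filter_rows and expression are hypothetical stand-ins for a Filter and an Expression transformation, not Informatica APIs.

```python
def filter_rows(rows, predicate):
    """Active: the number of rows coming out may differ from the number going in."""
    return [row for row in rows if predicate(row)]


def expression(rows, compute):
    """Passive: exactly one output row for every input row."""
    return [compute(row) for row in rows]


# Hypothetical input rows.
rows = [
    {"ISO_CTRY_COD": "IN"},
    {"ISO_CTRY_COD": ""},
    {"ISO_CTRY_COD": "US"},
]

# Filter drops the row with an empty code, so the row count changes.
filtered = filter_rows(rows, lambda r: r["ISO_CTRY_COD"] != "")

# Expression adds a derived column but keeps the row count unchanged.
derived = expression(rows, lambda r: {**r, "CODE_LEN": len(r["ISO_CTRY_COD"])})
```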
iii. The ISO_CTRY_COD field from the Source Qualifier is moved to the Lookup
transformation LKP_CTRY_COD, and all the fields, including ISO_CTRY_COD, are
moved to the Expression transformation EXP_COUNTRY.
Operation   Constant     Numeric Value
Insert      DD_INSERT    0
Update      DD_UPDATE    1
Delete      DD_DELETE    2
Reject      DD_REJECT    3
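As a rough illustration of how an Update Strategy expression assigns one of these constants to each row, consider the Python sketch below. The constants mirror the PowerCenter values (DD_INSERT is 0 and DD_REJECT is 3), but the function flag_row and its decision rule are hypothetical and are not taken from the Map_CD_Country_code mapping.

```python
# Update Strategy constants and their numeric values.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row, existing_codes):
    """Hypothetical rule: reject blank codes, update known codes, insert new ones."""
    code = row.get("ISO_CTRY_COD")
    if not code:
        return DD_REJECT
    if code in existing_codes:
        return DD_UPDATE
    return DD_INSERT


# Hypothetical set of country codes already present in the target.
existing = {"IN", "US"}

flag_known = flag_row({"ISO_CTRY_COD": "IN"}, existing)
flag_new = flag_row({"ISO_CTRY_COD": "FR"}, existing)
flag_blank = flag_row({"ISO_CTRY_COD": ""}, existing)
```

In a real mapping the equivalent logic would be written as an expression inside the Update Strategy transformation, typically using the result of a Lookup to decide whether the row already exists.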
Click on Repository > Connect.
Provide the Username.
Expand the Ecif_Dev_map folder.
Select s_Map_CD_Country_code in the right pane, right-click and select
Edit.
The Properties for Sessions window opens.
Please refer to the figure below.
v. The "Treat rows as" option determines the treatment for all rows in the
session. The options provided here are Insert, Delete, Update or Data Driven.
vi. If the mapping for the session contains an Update Strategy transformation,
this field is marked Data Driven by default. If any other option is selected,
the Informatica Server ignores all Update Strategy transformations in the
mapping.
vii. The Data Driven option is selected if records destined for the same table
need to be flagged on occasion for one operation (for example, update), or
for a different operation (for example, reject).
viii. Records can be flagged for reject only with this option.
For more information on the Update Strategy transformation and its other
settings, refer to the Informatica Designer Help.
ix. The Forward Rejected Rows option indicates whether the Update Strategy
transformation passes rejected rows to the next transformation or drops them.
x. By default, the Informatica Server forwards rejected rows to the next
transformation.
xi. The Informatica Server flags the rows for reject and writes them to the
session reject files.
xii. If Forward Rejected Rows is not selected, the Informatica Server drops
rejected rows and writes them to the session log file.
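The effect of the Forward Rejected Rows option can be sketched in Python as follows. This is a simplified model of the behaviour described above, assuming each row travels with its Update Strategy flag; the function names and the in-memory lists standing in for the reject file and session log are hypothetical.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def apply_update_strategy(flagged_rows, forward_rejected_rows=True):
    """Forward rows downstream, or drop rejected rows into the session log."""
    forwarded, session_log = [], []
    for row, flag in flagged_rows:
        if flag == DD_REJECT and not forward_rejected_rows:
            session_log.append(row)        # dropped here and noted in the log
        else:
            forwarded.append((row, flag))  # passed to the next transformation
    return forwarded, session_log


def load_target(forwarded):
    """Write rows to the target; rows still flagged for reject go to the reject file."""
    target, reject_file = [], []
    for row, flag in forwarded:
        if flag == DD_REJECT:
            reject_file.append(row)
        else:
            target.append(row)
    return target, reject_file


# Hypothetical flagged rows: one update and one reject.
rows = [({"ISO_CTRY_COD": "IN"}, DD_UPDATE), ({"ISO_CTRY_COD": ""}, DD_REJECT)]

# Option selected (the default): the rejected row ends up in the reject file.
fwd_on, log_on = apply_update_strategy(rows, forward_rejected_rows=True)
target_on, rejects_on = load_target(fwd_on)

# Option not selected: the rejected row is dropped and written to the session log.
fwd_off, log_off = apply_update_strategy(rows, forward_rejected_rows=False)
target_off, rejects_off = load_target(fwd_off)
```

In both cases the target receives only the row flagged for update; the option changes only where the rejected row is recorded.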
Update Strategy UPD_COUNTRY_CODE updates the target table
Shortcut_to_ECIF_COUNTRY which is a shortcut to the ECIF_COUNTRY
table.
If multiple Informatica mappings write to the same target table, the Sequence
Generator should be used as a reusable object or a shortcut.
If non-Informatica routines write to the same target table, using a trigger or
another database method is recommended.
The document provided below highlights the Best Practices that can be taken into
consideration either while designing mappings or when running sessions.
Informatica_Tuning_Guide.doc
For information on the features in Informatica PowerCenter 6.2, refer to the link below:
http://www.itap.purdue.edu/ea/files/PMPC-62_release%20notes%20for%206.2.pdf
Please refer to the link below for enhancements related to Informatica PowerCenter 7.1:
http://www.csn.no/nyhetsbrev/0402NyhetsbrevInfa_files/whats_new_PC7_dec2003.pdf