Informatica Training


Overview of Informatica
 Overview of Informatica PowerCenter

 Architecture

 Key Development Steps

 A Sample Mapping
Fetching data from different systems, making it coherent, and
loading it into a data warehouse requires extraction, cleansing,
integration, and loading. ETL stands for Extraction,
Transformation, and Load.
RDBMS / Mainframe / Other
• Transaction-level data
• Optimized for transaction response time
• Current
• Normalized or de-normalized data

ETL (Extract, Transform, Load)
• Aggregate data
• Cleanse data
• Consolidate data
• Apply business rules
• De-normalize data

Data Warehouse
• Aggregated data
• Historical data
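
The three ETL stages can be illustrated with a minimal Python sketch. This is not Informatica code: the source rows, the `customer_totals` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3
from collections import defaultdict

# Hypothetical transaction-level rows standing in for an operational source.
SOURCE_ROWS = [
    {"customer": " alice ", "amount": "10.50"},
    {"customer": "BOB", "amount": "4.00"},
    {"customer": "Alice", "amount": "2.25"},
    {"customer": "bob", "amount": None},  # incomplete record
]


def extract(rows):
    """Extract: read the raw rows from the source system."""
    return list(rows)


def transform(rows):
    """Transform: cleanse names, drop incomplete rows, aggregate per customer."""
    totals = defaultdict(float)
    for row in rows:
        if row["amount"] is None:
            continue  # cleansing rule (an assumption): skip rows without an amount
        name = row["customer"].strip().lower()
        totals[name] += float(row["amount"])
    return sorted(totals.items())


def load(conn, aggregated):
    """Load: write the aggregated rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals (customer TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO customer_totals VALUES (?, ?)", aggregated)
    conn.commit()


def run_etl(conn, rows=SOURCE_ROWS):
    """Run extract, transform, and load in order; return the loaded rows."""
    load(conn, transform(extract(rows)))
    return conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` turns four transaction-level rows into two aggregated rows, which mirrors the move from the left-hand column of the diagram to the right-hand one.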
• Informatica is a tool that supports every step of the Extraction, Transformation, and Load process. Nowadays Informatica is
also used as an integration tool.

• Informatica can communicate with all major data sources (mainframe, RDBMS, flat files, XML, VSAM, SAP, etc.) and can
move and transform data between them. It can move huge volumes of data efficiently, often better than
bespoke programs written for one specific data movement. It can throttle transactions (apply big updates in small
chunks to avoid long locks and a full transaction log). It can also join data from two distinct data sources
(even an XML file can be joined with a relational table).
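
The idea of throttling a big update into small chunks can be sketched as follows. This is only an illustration of the technique in plain Python with SQLite, not how Informatica implements it; the `orders` table, the `processed` column, and both function names are assumptions made for the example.

```python
import sqlite3


def make_demo_db(n_rows):
    """Create a hypothetical in-memory table with n_rows unprocessed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, processed INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(n_rows)]
    )
    conn.commit()
    return conn


def update_in_chunks(conn, chunk_size=1000):
    """Flag every unprocessed row, committing after each chunk.

    Returns the number of chunks that actually changed rows.
    """
    chunks = 0
    while True:
        cur = conn.execute(
            "UPDATE orders SET processed = 1 "
            "WHERE id IN (SELECT id FROM orders WHERE processed = 0 LIMIT ?)",
            (chunk_size,),
        )
        changed = cur.rowcount
        conn.commit()  # short transaction: locks are released, the log stays small
        if changed == 0:
            return chunks
        chunks += 1
```

With 2,500 rows and a chunk size of 1,000, the update runs as three short transactions instead of one long one.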

• Some facts and figures about Informatica Corporation:

Founded in 1993, based in Redwood City, California
1,400+ employees; 3,450+ customers; 79 of the Fortune 100 companies
NASDAQ stock symbol: INFA; stock price: $18.74 (09/04/2009)
Revenues in fiscal year 2008: $455.7M
Informatica Developer Networks: 20,000 members

• The important products offered by Informatica Corporation are listed below:

PowerCenter
PowerMart
PowerExchange
PowerCenter Connect
PowerChannel
Metadata Exchange
PowerAnalyzer
SuperGlue
PowerCenter provides an environment that
allows you to load data into a centralized
location, such as a data warehouse or
operational data store (ODS). You can extract
data from multiple sources, transform the data
according to business logic you build in the
client application, and load the transformed data
into file and relational targets.
Sources
• Standard: RDBMS, Flat Files, XML, ODBC
• Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
• EAI: MQ Series, Tibco, JMS, Web Services
• Legacy: Mainframes (DB2, VSAM, IMS, IDMS, Adabas), AS400 (DB2, Flat File)
• Remote Sources

Targets
• Standard: RDBMS, Flat Files, XML, ODBC
• Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
• EAI: MQ Series, Tibco, JMS, Web Services
• Legacy: Mainframes (DB2), AS400 (DB2)
• Remote Targets
The Informatica ETL product, known as Informatica PowerCenter, consists of the following main components:

1. Informatica PowerCenter Client Tools:

These are the development tools installed on the developer's machine. They enable a developer to
• Define the transformation process, known as a mapping (Designer)
• Define run-time properties for a mapping, known as a session (Workflow Manager)
• Monitor the execution of sessions (Workflow Monitor)
• Manage the repository, which is useful for administrators (Repository Manager)
• Report on metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:

The repository is the heart of the Informatica tools. It is a kind of data inventory where all the data related to mappings,
sources, targets, etc. is kept. This is where all the metadata for your application is stored. All the client tools and
the Informatica Server fetch data from the repository. An Informatica client and server without a repository are like a PC without
memory or a hard disk: able to process data, but with no data to process. The repository can be treated as the back end of
Informatica.

3. Informatica PowerCenter Server:

The server is where all execution takes place. It makes physical connections to the sources and targets, fetches the
data, applies the transformations defined in the mapping, and loads the data into the target system.
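
The division of labour between the repository (which holds metadata) and the server (which executes it) can be sketched in a few lines of Python. This is a toy model, not the PowerCenter repository schema or API: the `mapping_fields` table, the `m_customers` mapping, and the transformation names are all hypothetical.

```python
import sqlite3

# Transformation functions this toy "server" knows how to apply (hypothetical names).
TRANSFORMS = {
    "copy": lambda value: value,
    "upper": lambda value: value.upper(),
    "trim": lambda value: value.strip(),
}


def create_repository():
    """Build a toy repository: a relational table holding mapping metadata."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE mapping_fields ("
        "mapping TEXT, source_col TEXT, transform TEXT, target_col TEXT)"
    )
    repo.executemany(
        "INSERT INTO mapping_fields VALUES (?, ?, ?, ?)",
        [
            ("m_customers", "name", "trim", "customer_name"),
            ("m_customers", "country", "upper", "country_code"),
        ],
    )
    repo.commit()
    return repo


def run_mapping(repo, mapping, source_rows):
    """Toy 'server': fetch the mapping's metadata, then apply it to each row."""
    fields = repo.execute(
        "SELECT source_col, transform, target_col FROM mapping_fields "
        "WHERE mapping = ? ORDER BY rowid",
        (mapping,),
    ).fetchall()
    return [
        {target: TRANSFORMS[name](row[source]) for source, name, target in fields}
        for row in source_rows
    ]
```

The engine contains no knowledge of any particular mapping. Asking it to run a mapping that is not stored in the repository yields rows with no columns, which matches the point that a server without a repository has nothing to process.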

4. PowerCenter Domain:
The PowerCenter domain is the primary unit for management and administration within PowerCenter. The Service
Manager runs on a PowerCenter domain. The Service Manager supports the domain and the application services.
Application services represent server-based functionality and include the Repository Service, Integration Service, Web
Services Hub, and SAP BW Service.

5. Administration Console:
The Administration Console is a web-based administration tool you can use to administer the PowerCenter domain.

6. Repository Service:
The Repository Service accepts requests from the PowerCenter Client to create and modify repository metadata and
accepts requests from the Integration Service for metadata when a workflow runs.

7. Integration Service:
The Integration Service extracts data from sources and loads data to targets.

8. Web Services Hub:
Web Services Hub is a gateway that exposes PowerCenter functionality to external clients through web services.

9. SAP BW Service:
The SAP BW Service extracts data from and loads data to SAP BW.

10. Data Analyzer:
Data Analyzer provides a framework to perform business analytics on corporate data. It provides the capability
to extract, filter, format, and analyze corporate information from data stored in a data warehouse, operational data
store, or other data storage models.
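[Figure: architecture diagram. Labeled components: PM Server, Repository Server, Repository Agent, Repository, heterogeneous Sources and Targets, and the client tools (Repository Manager, Designer, Workflow Manager, Workflow Monitor). Labeled connections: Native/ODBC to the sources and targets, TCP/IP between the servers and the client tools, and Native between the Repository Agent and the Repository.]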
 Repository
◦ Consists of a relational database
◦ Contains the information and instructions
required to extract, transform and load data
◦ Logs activity from the workflow manager
◦ Captures metadata
◦ Repositories can be Local or Global
 Server
◦ Engine that drives Informatica processes
◦ Uses information from the repository to process data
◦ Extracts data from sources, performs transformations on the data, then stores the
results into targets
 Sources – Informatica supports:
◦ Relational databases – Oracle, Sybase, DB2, SQL
Server, and Teradata.
◦ Files – Fixed-width and delimited flat files, and XML.
◦ Applications – PowerConnect products can
connect with PeopleSoft, SAP (R/3), Siebel, MQ
Series, and Tibco.
◦ Mainframes – PowerConnect products are also
available for IBM DB2 on MVS.
◦ Other connectivity sources
 Targets – Informatica supports:
◦ Relational databases – Oracle, Sybase, DB2, SQL
Server, and Teradata.
◦ Files – Fixed-width and delimited flat files, and XML.
◦ Applications – SAP BW (with PowerConnect). MQ
Series and Tibco are also supported.
◦ Other connectivity targets and methods, using
ODBC, native drivers, and FTP.
 Repository Manager – Manage the repository:
◦ Connections
◦ Folders
◦ Objects
◦ Users and Groups
 Designer – Build ETL mappings
 Workflow Manager – Build and start workflows to run mappings
 Workflow Monitor – Monitor and start workflows
 PowerCenter has a service-oriented architecture that provides the ability to scale
services and share resources across multiple machines. The PowerCenter domain is
the fundamental administrative unit in PowerCenter. The domain supports the
administration of the distributed services. A domain is a collection of nodes and
services that you can group in folders based on administration ownership. A node is
the logical representation of a machine in a domain. One node in the domain acts
as a gateway to receive service requests from clients and route them to the
appropriate service and node. Services and processes run on nodes in a domain.
 Services for the domain include the Service Manager and a set of application
services.
 Service Manager. A service that manages all domain operations. It runs the
application services and performs domain functions on each node in the domain.
Some domain functions include authentication, authorization, and logging.
 Application services. Services that represent PowerCenter server-based
functionality, such as the Repository Service and the Integration Service. The
application services that run on a node depend on the way you configure the
services.
 The domain is managed through the PowerCenter Administration Console. The
Administration Console consolidates administrative tasks for domain objects such
as services, nodes, licenses, and grids. You access the Administration Console
through a web browser.
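
The relationship between a domain, its nodes, and the gateway can be modeled with a small Python sketch. This is a conceptual toy, not the PowerCenter API: the `Node` and `Domain` classes, the `route` method, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Logical representation of a machine in the domain."""
    name: str
    services: set = field(default_factory=set)
    is_gateway: bool = False


@dataclass
class Domain:
    """A collection of nodes, one of which acts as the gateway."""
    nodes: list

    def gateway(self):
        for node in self.nodes:
            if node.is_gateway:
                return node
        raise LookupError("domain has no gateway node")

    def route(self, service):
        """Return (gateway name, node name) for a client request to `service`."""
        gateway = self.gateway()  # every request enters through the gateway
        for node in self.nodes:
            if service in node.services:
                return gateway.name, node.name
        raise LookupError(f"no node runs {service!r}")
```

For example, in a two-node domain where `node01` is the gateway and runs the Repository Service while `node02` runs the Integration Service, `route("Integration Service")` returns `("node01", "node02")`: the request arrives at the gateway and is forwarded to the node that hosts the service.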
 The PowerCenter Administration Console is the administration tool used
to administer the PowerCenter domain. The Administration Console is
used to perform administrative tasks such as managing logs, user
accounts, and domain objects. Domain objects include services, nodes,
grids, folders, and licenses.
 Domain is the highest object in the Navigator hierarchy.
 Folders in the domain are used to organize objects and to manage
security. Folders can contain nodes, services, grids, licenses, and other
folders.
 Application services are a group of services that represent PowerCenter
server-based functionality. The two major application services are the
Integration Service and the Repository Service.
 The Integration Service is an application service that runs data integration
sessions and workflows.
 The Repository Service is an application service that manages the
repository. It retrieves, inserts, and updates metadata in the repository
database tables.
Key development steps:
1. Create Folder
2. Create Database Connections
3. Create Sources/Target Definitions
4. Develop Mappings/Transformations
5. Debug Mappings
6. Create Sessions
7. Create and Execute Workflows
8. Schedule Workflows
 A Pass-Through mapping inserts all the source rows into the target.
 To create and edit mappings, you use the Mapping Designer tool in the
Designer. The mapping interface in the Designer is component-based.
You add transformations to a mapping that depict how the Integration
Service extracts and transforms data before it loads a target.
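
Outside the Designer, the effect of a pass-through mapping can be approximated in a few lines of Python with SQLite. This sketch only illustrates the behavior (every source row reaches the target unchanged); the table names `src_employees` and `tgt_employees` and both function names are assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Create hypothetical source and target tables with the same structure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_employees (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE tgt_employees (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO src_employees VALUES (?, ?)", [(1, "Ann"), (2, "Raj")]
    )
    conn.commit()
    return conn


def pass_through(conn, source, target):
    """Copy every row from `source` to `target` unchanged; return rows loaded."""
    # Table names cannot be bound as SQL parameters, so accept simple identifiers only.
    for name in (source, target):
        if not name.isidentifier():
            raise ValueError(f"unsafe table name: {name!r}")
    cur = conn.execute(f"INSERT INTO {target} SELECT * FROM {source}")
    conn.commit()
    return cur.rowcount
```

After `pass_through(conn, "src_employees", "tgt_employees")`, the target table holds exactly the rows of the source table, with no filtering or transformation applied.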
In the next session, we will look at every transformation in detail and
do some hands-on exercises.
