Informatica Excel Source

This document provides guidance on using Excel as a source in Informatica and configuring lookup transformations to return multiple rows. It discusses defining a range in Excel to treat as a table, creating an ODBC connection to the Excel file, importing the source definition into Informatica, and setting the lookup policy to "Use All Values" to return multiple matching rows. The document also covers performance tuning techniques for lookup transformations such as caching tables, minimizing cache size, removing unused columns, and using indexes.
Copyright
© Attribution Non-Commercial (BY-NC)

Informatica Excel Source

Last Updated on Monday, 24 September 2012 12:18 Written by Navonil Sarkar

This article explains how to extract data from an Excel file and load it into a target relational database using Informatica.

Excel as a Source
Follow the instructions below to extract data from an Excel file.

Define Range in Excel:
File Name: EXCEL_FILE
Path: PowerCenter8.6.1\server\infa_shared\SrcFiles [File is on ETL Server]

Define a named range that covers the data in the Excel workbook, then save the workbook.

The defined name, DEPT, will be treated as a relational table by Informatica.

System DSN creation: DSN Name: DSN_EXEL_FILE

Workbook selection in DSN

Importing the data definition of Excel in Source Analyzer: Since Informatica treats an MS Excel file as a database, Import from Database needs to be selected.

System DSN created above should be selected. No username/password is required here.

DEPT needs to be selected and OK should be clicked to import the source definition. After the import, data types and precision can be adjusted as needed.

[N.B.: The range name defined in Excel is the table name.] A mapping then needs to be created with the imported source.
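One way to sanity-check the named range outside Informatica is to query it through the same System DSN from any ODBC client. The sketch below is only an illustration under stated assumptions: it assumes the third-party pyodbc package and a host where the Excel ODBC driver and the DSN exist, and the helper names are hypothetical. It shows that the range name DEPT is addressed exactly like a table name.

```python
def build_dsn_connection_string(dsn_name: str) -> str:
    """Return an ODBC connection string that points at a System DSN."""
    return f"DSN={dsn_name}"


def build_range_query(range_name: str) -> str:
    """Return a SELECT that treats an Excel named range as a table."""
    return f"SELECT * FROM {range_name}"


def preview_range(dsn_name: str, range_name: str, limit: int = 5):
    """Fetch the first few rows of the named range.

    Assumes pyodbc is installed; it is imported lazily so the two
    string helpers above work without it.
    """
    import pyodbc  # third-party package, not part of the standard library

    # autocommit=True because the Excel driver does not support transactions.
    conn = pyodbc.connect(build_dsn_connection_string(dsn_name), autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(build_range_query(range_name))
        return cursor.fetchmany(limit)
    finally:
        conn.close()


print(build_dsn_connection_string("DSN_EXEL_FILE"))
print(build_range_query("DEPT"))
```

Calling preview_range("DSN_EXEL_FILE", "DEPT") on the ETL server would return the first rows of the range if the DSN and range are set up as described above.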

Creation of Connection in Workflow Manager: A relational ODBC type connection should be created.

A session-level connection should be created with the details below:
Username: PmNullUser
Pwd: PmNullPasswd
Connection String:

And create a workflow to run the mapping.

Limitations:
1. Manual intervention is required to select the range in Excel.
2. A separate DSN and connection must be created for each Excel workbook.
3. The Microsoft Excel driver must be available under PowerCenter8.6.1\ODBC5.2.

NOTE: Excel 2007 Table as Source


Create Table/ Define Name in MS Excel 2007:

Create the data set in MS Excel sheet 2007

Select the data set

Go to the Insert tab and then click Table.

Create the table and then Name the table

Finally Save the sheet in 2007 format.

What is Active Lookup Transformation


Last Updated on Wednesday, 10 October 2012 05:09 Written by Saurav Mitra

Informatica 9.x allows us to configure the Lookup transformation to return multiple rows. We can therefore retrieve multiple rows from a lookup table, which makes the Lookup transformation an Active transformation.

How to configure a Lookup as Active?


To use this option, we must set the Lookup transformation property "Lookup Policy on Multiple Match" to Use All Values while creating the transformation. Once the transformation is created, we cannot change the mode between passive and active: whenever the Lookup policy on multiple match attribute is set to Use All Values, the property becomes read-only.
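The difference between a passive and an active lookup can be pictured with a small conceptual model. This is a sketch of the semantics only, not Informatica's implementation: the function names, the "first" and "all" policy labels, and the sample rows are all hypothetical, and handling of unmatched input rows is not modeled.

```python
from collections import defaultdict


def build_lookup_index(lookup_rows, key):
    """Group lookup rows by the lookup key, preserving their original order."""
    index = defaultdict(list)
    for row in lookup_rows:
        index[row[key]].append(row)
    return index


def lookup(index, key_value, policy="first"):
    """Return the matching rows for one input row under the given policy.

    "first" models a passive lookup: at most one row comes back.
    "all" models an active lookup: every matching row comes back.
    """
    matches = index.get(key_value, [])
    if policy == "all":
        return list(matches)
    if policy == "first":
        return matches[:1]
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical lookup rows: key 1 matches twice, key 2 matches once.
orders = [
    {"customer_id": 1, "order_id": "A-100"},
    {"customer_id": 1, "order_id": "A-101"},
    {"customer_id": 2, "order_id": "B-200"},
]
index = build_lookup_index(orders, "customer_id")

passive_result = [r["order_id"] for r in lookup(index, 1, policy="first")]
active_result = [r["order_id"] for r in lookup(index, 1, policy="all")]
print(passive_result)
print(active_result)
```

With "first" the input row yields one output row; with "all" it yields one output row per match, which is why the row count can change and the transformation counts as active.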

Implementing a Lookup As Active


Scenario: Suppose we have customer order data in a relational table, and each customer has multiple orders in the table. We can configure the Lookup transformation to return all the orders placed by a customer. The simple mapping used here applies the same idea to departments: we want to return all the employees in each department.

Go to Transformation and click Create. Select Transformation Type as Lookup and enter a name for the transformation.

Next check the option Return All Values on Multiple Match.

Here our source is the DEPT table and the EMP table is used as a lookup. The lookup condition is based on the department number.

In effect, we are trying to achieve the same result as the following SQL select:
SELECT DEPT.DEPTNO, DEPT.DNAME, DEPT.LOC, EMP.ENAME, EMP.SAL
FROM DEPT LEFT OUTER JOIN EMP
ON DEPT.DEPTNO = EMP.DEPTNO
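To see what that select returns, it can be run against a throwaway in-memory database. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the real source database; the sample rows are made up, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT, LOC TEXT);
    CREATE TABLE EMP (ENAME TEXT, SAL INTEGER, DEPTNO INTEGER);
    -- Made-up sample rows: department 40 has no employees.
    INSERT INTO DEPT VALUES (10, 'ACCOUNTING', 'NEW YORK');
    INSERT INTO DEPT VALUES (20, 'RESEARCH', 'DALLAS');
    INSERT INTO DEPT VALUES (40, 'OPERATIONS', 'BOSTON');
    INSERT INTO EMP VALUES ('CLARK', 2450, 10);
    INSERT INTO EMP VALUES ('KING', 5000, 10);
    INSERT INTO EMP VALUES ('SMITH', 800, 20);
    """
)

rows = conn.execute(
    """
    SELECT DEPT.DEPTNO, DEPT.DNAME, DEPT.LOC, EMP.ENAME, EMP.SAL
    FROM DEPT
    LEFT OUTER JOIN EMP ON DEPT.DEPTNO = EMP.DEPTNO
    ORDER BY DEPT.DEPTNO, EMP.ENAME
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

Department 10 appears twice, once per employee, while department 40 appears once with empty employee columns. That is the shape of output the active lookup is meant to reproduce.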

Active Lookup Transformation Restrictions:


1. We cannot return multiple rows from an unconnected Lookup transformation.
2. We cannot enable dynamic cache for an Active Lookup transformation.
3. An Active Lookup transformation that returns multiple rows cannot share a cache with a similar Passive Lookup transformation that returns one matching row for each input row.

Performance Tuning of Lookup Transformations

Naveen, 17 Apr 2011

Lookup transformations are used to look up a set of values in another table. Lookups slow down performance.

1. To improve performance, cache the lookup tables. Informatica can cache all the lookup and reference tables; this makes operations run very fast. (The meaning of cache is given in point 2 of this section and the procedure for determining the optimum cache size is given at the end of this document.)

2. Even after caching, performance can be further improved by minimizing the size of the lookup cache. Reduce the number of cached rows by using a SQL override with a restriction.

Cache: A cache stores data in memory so that Informatica does not have to read the table each time it is referenced. This greatly reduces the time taken by the process. Informatica generates the cache automatically, based either on the marked lookup ports or on a user-defined SQL query.

Example of caching by a user-defined query:
Suppose we need to look up records where employee_id = eno. employee_id comes from the lookup table, EMPLOYEE_TABLE, and eno is the input that comes from the source table, SUPPORT_TABLE. We put the following SQL query override in the Lookup transformation:

select employee_id from EMPLOYEE_TABLE

If there are 50,000 employee_id values, the lookup cache will hold 50,000 rows. Instead of the above query, we put the following:

select e.employee_id from EMPLOYEE_TABLE e, SUPPORT_TABLE s
where e.employee_id = s.eno

If there are 1000 eno values, the lookup cache will hold only 1000 rows. The performance gain materializes only if the number of records in SUPPORT_TABLE is not huge. The goal is to keep the cache as small as possible.

3. In lookup tables, delete all unused columns and keep only the fields that are used in the mapping.

4. If possible, replace lookups with a Joiner transformation or a single Source Qualifier. Note that a Joiner transformation takes more time than a Source Qualifier transformation.

5. If the Lookup transformation specifies several conditions, place the conditions that use the equality operator (=) first in the Conditions tab.

6. The SQL override query of the lookup table will contain an ORDER BY clause. Remove it if it is not needed, or put fewer column names in the ORDER BY list.

7. Do not use caching in the following cases:
- The source is small and the lookup table is large.
- The lookup is done on the primary key of the lookup table.

8. Definitely cache the lookup table columns in the following case:
- The lookup table is small and the source is large.

9. If the lookup data is static, use a persistent cache. Persistent caches save cache files so they can be reused. If several sessions in the same job use the same lookup table, a persistent cache lets those sessions reuse the cache files. With static lookups, the memory cache is then built from the saved cache files instead of from the database, which improves performance.

10. If the source is huge and the lookup table is also huge, use a persistent cache as well.

11. If the target table is the lookup table, use a dynamic cache. The Informatica server updates the lookup cache as it passes rows to the target.

12. Use only the lookups you need in the mapping. Too many lookups inside a mapping will slow down the session.

13. If the lookup table has a lot of data, it will take too long to cache or will not fit in memory. In that case, move those fields to the Source Qualifier and join them with the main table there.

14. If several lookups use the same data set, share the caches.

15. If we are going to return only one row, use an unconnected lookup.

16. All data is read into the cache in the order the fields are listed in the lookup ports. If an index exists that is even partially in this order, loading these lookups can be sped up.

17. If the table used for the lookup has an index (or if we have the privilege to add an index to the table in the database, do so), performance improves for both cached and uncached lookups.
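The effect of restricting the override query (point 2) can be checked with a quick experiment. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for the real database and fills EMPLOYEE_TABLE and SUPPORT_TABLE with generated keys that match the 50,000 and 1000 figures from the example. The DISTINCT keyword is an addition not present in the original query; it keeps the row count from growing if eno values repeat in SUPPORT_TABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TABLE (employee_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE SUPPORT_TABLE (eno INTEGER)")

# Generated sample keys: 50,000 employees, of which 1000 appear in the source.
conn.executemany(
    "INSERT INTO EMPLOYEE_TABLE VALUES (?)", ((i,) for i in range(1, 50001))
)
conn.executemany(
    "INSERT INTO SUPPORT_TABLE VALUES (?)", ((i,) for i in range(1, 1001))
)

full_override = "SELECT employee_id FROM EMPLOYEE_TABLE"
restricted_override = (
    "SELECT DISTINCT e.employee_id "
    "FROM EMPLOYEE_TABLE e, SUPPORT_TABLE s "
    "WHERE e.employee_id = s.eno"
)

# The number of rows each query returns is the number of rows to be cached.
full_rows = len(conn.execute(full_override).fetchall())
restricted_rows = len(conn.execute(restricted_override).fetchall())
print(full_rows, restricted_rows)
conn.close()
```

The restricted query pays for a join against SUPPORT_TABLE while building the cache, which is why the gain depends on that table not being huge.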

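The trade-off between cached and uncached lookups (points 1, 7 and 8) comes down to how often the lookup table is read. The following sketch is a simplified model, not Informatica's caching mechanism: it uses sqlite3 as a stand-in database, hypothetical function names, and made-up rows, and it simply counts the queries issued in each mode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TABLE (employee_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE_TABLE VALUES (?, ?)",
    [(i, f"emp{i}") for i in range(1, 6)],
)

# Hypothetical source rows: the eno value carried by each incoming row.
source_keys = [1, 2, 2, 3, 1, 1]


def uncached_lookup(keys):
    """Query the lookup table once per source row."""
    queries = 0
    results = []
    for key in keys:
        row = conn.execute(
            "SELECT name FROM EMPLOYEE_TABLE WHERE employee_id = ?", (key,)
        ).fetchone()
        queries += 1
        results.append(row[0] if row else None)
    return results, queries


def cached_lookup(keys):
    """Read the lookup table once into a dict, then resolve rows in memory."""
    cache = dict(conn.execute("SELECT employee_id, name FROM EMPLOYEE_TABLE"))
    queries = 1
    return [cache.get(key) for key in keys], queries


uncached_results, uncached_queries = uncached_lookup(source_keys)
cached_results, cached_queries = cached_lookup(source_keys)
print(uncached_queries, cached_queries)
conn.close()
```

Both modes return the same names, but the cached version reads the whole table up front. That pays off when the source is large relative to the lookup table, and it is wasted effort when only a few source rows hit a very large lookup table.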
Read more: http://informaticatutorials-naveen.blogspot.com/2011/04/performance-tuning-oflookup.html Under Creative Commons License: Attribution Non-Commercial