Look Up Transformation
Question
What is the Lookup transformation in Informatica?
Answer
The Lookup transformation looks up values in a relational table, a view, or a flat file. The developer defines the lookup match criteria. PowerCenter Designer has two types of lookup: 1) connected lookup and 2) unconnected lookup. A lookup can use different caches: static, dynamic, persistent, and shared (a dynamic cache cannot be used with an unconnected lookup). Each has its own characteristics. For more details, see the Informatica Help. The rest of this answer assumes you know the basics of Informatica.
The Lookup transformation is passive, and it can be either connected or unconnected. It is used to look up data in a relational table, view, or synonym. The lookup definition can be imported from either source or target tables. For example, suppose we want to retrieve all the sales of the product with ID 10, and the sales data resides in another table called 'Sales'. Instead of using the Sales table as one more source, use a Lookup transformation to look up the data for product ID 10 in the Sales table.
Difference between connected and unconnected Lookup transformations:
1. A connected lookup receives input values directly from the mapping pipeline, whereas an unconnected lookup receives values from a :LKP expression in another transformation.
2. A connected lookup returns multiple columns from the same row, whereas an unconnected lookup has one return port and returns one column from each row.
3. A connected lookup supports user-defined default values, whereas an unconnected lookup does not.
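The three differences can be illustrated outside PowerCenter with a minimal Python sketch. It is an analogy only, not Informatica code: the `SALES` dictionary, its columns, and the functions `connected_lookup` and `unconnected_lookup` are hypothetical names invented for this illustration, and the table is simplified to one row per product.

```python
# Hypothetical in-memory stand-in for the 'Sales' lookup table, keyed by product ID.
SALES = {
    10: {"amount": 250.0, "region": "EMEA"},
    20: {"amount": 75.5, "region": "APAC"},
}


def connected_lookup(row, default_amount=0.0, default_region="UNKNOWN"):
    """Sits in the pipeline: sees every row and returns several columns.

    User-defined defaults are applied when no match is found.
    """
    match = SALES.get(row["product_id"])
    if match is None:
        return {**row, "amount": default_amount, "region": default_region}
    return {**row, **match}


def unconnected_lookup(product_id):
    """Called like a function from an expression.

    Returns a single value (the return port), or None (NULL) on no match.
    """
    match = SALES.get(product_id)
    return match["amount"] if match else None


rows = [{"product_id": 10}, {"product_id": 99}]

# Connected: every row flows through the lookup.
connected_out = [connected_lookup(r) for r in rows]

# Unconnected: the lookup is invoked only when the expression's condition holds.
unconnected_out = [
    unconnected_lookup(r["product_id"]) if r["product_id"] < 50 else None
    for r in rows
]
```

In this sketch the unmatched product 99 receives the default values from the connected lookup, while the unconnected lookup is never called for it because the calling expression's condition is false.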
Example
Unconnected LKPs
3. It does not process each and every row; it returns values based on the expression condition.
4. If no match is found for the LKP condition, the LKP transformation returns NULL.
5. It is a reusable transformation: the same LKP transformation can be called multiple times in the same mapping.
6. It returns only one value.
7. It can use only a static cache.
Performance Considerations for Lookups
Below is a list of performance considerations for lookups in Informatica PowerCenter.
Performance for Lookups
Misconceptions about lookup SQL Indexes
I have seen people suggest adding an index to improve the performance of any SQL. This suggestion is often incorrect, and it is even more so when it comes to indexing the condition port columns of a lookup SQL. Before explaining why, let me describe how a lookup works, using the EMP table of the usual HR schema as an example. EMP has the columns EMPNO, ENAME, and SALARY. Suppose a lookup in an ETL mapping checks for a particular EMPNO and returns ENAME and SALARY. The output ports of the lookup are ENAME and SALARY, and the condition port is EMPNO. Imagine that you are facing performance problems with this lookup and one suggestion is to index the condition port. As suggested (incorrectly), you create an index on the EMPNO column of the underlying database table. In practice, the SQL the lookup executes is going to be this:
select EMPNO, ENAME, SALARY from EMP ORDER BY EMPNO, ENAME, SALARY;
The result of this query is stored in the lookup cache, and each record from the source is then looked up against this cache. The check against the condition port column is therefore done in the Informatica lookup cache, not in the database, so an index created in the database has no effect on it.
You may be wondering whether the same indexing can be replicated in the lookup cache. You don't have to worry about it: PowerCenter creates an "index" cache and a "data" cache for the lookup. In this case the condition port data, EMPNO, is indexed and hashed in the index cache, and the rest, along with EMPNO, is held in the data cache. I hope you now understand why indexing the condition port columns does not improve performance.
Having said that, consider a different kind of lookup, one where caching is disabled. In this kind of lookup there is no cache: every time a row is sent into the lookup, the SQL is executed against the database. In this scenario the database index may help. But if the performance of the lookup is a problem, the cache-less lookup may itself be the problem. I would use a cache-less lookup only if the source has fewer records than the lookup table. Only in this case will indexing the condition ports work. Everywhere else, it is mere luck whether the database picks up the index.
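The difference between the two modes can be sketched in Python with the standard `sqlite3` module. This is an analogy under stated assumptions, not PowerCenter internals: the in-memory SQLite database stands in for the real one, the sample rows are made up, and the functions `build_cache`, `cached_lookup`, and `uncached_lookup` are hypothetical names.

```python
import sqlite3

# An in-memory SQLite database stands in for the real lookup source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER, ENAME TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [(7369, "SMITH", 800.0), (7499, "ALLEN", 1600.0), (7521, "WARD", 1250.0)],
)


def build_cache(connection):
    """Read the whole lookup table once; the database is not queried again."""
    cursor = connection.execute(
        "SELECT EMPNO, ENAME, SALARY FROM EMP ORDER BY EMPNO, ENAME, SALARY"
    )
    # The dict's hash table plays the role of the index cache (condition port);
    # the stored tuples play the role of the data cache (output ports).
    return {empno: (ename, salary) for empno, ename, salary in cursor}


def cached_lookup(cache, empno):
    """The match happens in memory, so a database index on EMPNO is never used."""
    return cache.get(empno)


def uncached_lookup(connection, empno):
    """One query per input row; only here can a database index on EMPNO help."""
    return connection.execute(
        "SELECT ENAME, SALARY FROM EMP WHERE EMPNO = ?", (empno,)
    ).fetchone()


emp_cache = build_cache(conn)
```

Note that the only statement the cached path sends to the database has no WHERE clause on EMPNO, which is why an index on that column cannot speed it up.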
Dynamic Lookups
Dynamic lookups are used to implement slowly changing dimensions. The ability to provide dynamic caching gives Informatica a definite edge over other vendor products. In a dynamic lookup, every time a new record is found (based on the lookup condition), that record is appended to the lookup cache.
Although caching makes data comparison faster, it restricts the lookup to a snapshot of the table or view taken when the cache was created. In some cases, creating the cache takes longer than querying the database directly. Creating a large cache file for every load can be wasted effort.
Dynamic caching is enabled with a check box in the Lookup transformation. Once selected, it cannot be reverted. A dynamic lookup allows the mapping to modify the cache file. For every lookup port, a corresponding associated port must be selected from among the input ports. The value of the associated port is compared with the corresponding column in the cache file. The comparison result is output through NewLookupRow, a default port of a dynamic lookup. Based on this comparison, the data in the cache is modified or the input data is added to the cache. If a port does not need to be compared, tick its Ignore in Comparison check box. The NewLookupRow port gives the following values:
0, if the data matches the cache for the compared ports (the cache is not changed)
1, if the row is not found in the cache (the row is inserted into the cache)
2, if the data differs from the cache for the compared ports (the row is updated in the cache)
Based on these values, the data can be either inserted or updated in the target table. The target must stay in sync with the lookup cache for the lookup to be accurate.
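The comparison logic can be sketched in a few lines of Python. This is a simplified model, not Informatica code: the function `dynamic_lookup`, its parameters, and the sample data are hypothetical. The flag values follow the PowerCenter convention for NewLookupRow (0 for no change, 1 for insert, 2 for update).

```python
NO_CHANGE, INSERT, UPDATE = 0, 1, 2  # NewLookupRow values


def dynamic_lookup(cache, key, values, ignore=()):
    """Compare an input row with the cache and keep the cache in step.

    cache  : dict mapping the lookup-condition key to a dict of cached columns
    values : dict of associated-port values taken from the input row
    ignore : column names excluded from the comparison
    """
    cached = cache.get(key)
    if cached is None:
        # New key: add the row to the cache.
        cache[key] = dict(values)
        return INSERT
    changed = any(
        cached.get(col) != val
        for col, val in values.items()
        if col not in ignore
    )
    if changed:
        # Existing key with different data: update the cached row.
        cached.update(values)
        return UPDATE
    return NO_CHANGE


dyn_cache = {101: {"name": "Ann", "city": "Pune"}}
flags = [
    dynamic_lookup(dyn_cache, 101, {"name": "Ann", "city": "Pune"}),   # same data
    dynamic_lookup(dyn_cache, 101, {"name": "Ann", "city": "Delhi"}),  # changed city
    dynamic_lookup(dyn_cache, 102, {"name": "Raj", "city": "Goa"}),    # new key
]
```

A downstream step would route rows flagged 1 to an insert and rows flagged 2 to an update on the target, which is what keeps the target and the cache in sync.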
Points to be noted
1. Either Update Else Insert or Insert Else Update must be selected in order to update the cached data. This is also needed to get the output 2 on the NewLookupRow port for updates.
2. Primary key columns of the lookup table cannot have any of the input ports as the associated port; instead, a sequence generator named Sequence-ID should be used. This can be used only if the port type is Integer or Small Integer. The sequence is generated from the largest value existing in the cache for the port.
Persistent Cache
A persistent cache is stored on the server and is useful when the data in the lookup table does not change between session runs. The cache is built on the first run and reused in later session runs. If the source data changes later, the Recache from Database check box can be selected to rebuild the cache. For large lookup tables, combining a dynamic lookup with a persistent cache is an excellent performance boost: the cache is made persistent, and the dynamic lookup keeps it updated. This avoids wastefully caching a large chunk of data from the database on every session run. It does, however, require careful handling of the dynamic lookup to avoid comparison errors.
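The build-once, reuse-later behaviour can be modelled with a small Python sketch. It is an analogy only: PowerCenter uses its own cache file format, whereas this sketch uses `pickle`, and the names `load_lookup_cache`, `read_table`, and `db_reads` are hypothetical.

```python
import os
import pickle
import tempfile


def load_lookup_cache(cache_path, read_table, recache=False):
    """Return the lookup cache, rebuilding it only when needed.

    read_table : callable that reads the full lookup table (the expensive step)
    recache    : analogue of the Recache from Database check box
    """
    if not recache and os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)  # reuse the cache saved by an earlier run
    cache = read_table()
    with open(cache_path, "wb") as fh:
        pickle.dump(cache, fh)  # persist the cache for later runs
    return cache


db_reads = []  # records each simulated full read of the lookup table


def read_table():
    db_reads.append(1)
    return {1: "A", 2: "B"}


cache_file = os.path.join(tempfile.mkdtemp(), "lkp_demo.cache")
first_run = load_lookup_cache(cache_file, read_table)                 # builds the cache
second_run = load_lookup_cache(cache_file, read_table)                # reuses the file
third_run = load_lookup_cache(cache_file, read_table, recache=True)   # forced rebuild
```

Only the first and third calls read the table; the second is served entirely from the saved file.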
Shared Cache
Reusing a cache reduces the number of times a cache file is created. Sharing requires the cache structure to be the same.
The cache can be either named or unnamed. An unnamed cache is reused only within a session, whereas a named cache can be reused by many sessions. A dynamic cache cannot be shared.
Uncached Lookup
Although caching avoids repeated database access for the comparison, sometimes the caching process itself is a waste of time. This depends on many factors.
If 50K records must be cached for a mere 5 lookups over a good database connection, caching is wasteful: five database queries return just 5 records instead of 50K. But for 50 or 100 lookups over a database connection with serious delay, caching may be better. This varies case by case. It could be put into a formula, but I have none currently.
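One possible way to express this trade-off is a simple linear cost model, sketched here in Python. It is an assumption made for illustration, not an Informatica formula: it ignores memory limits, sorting, and network variability, and every timing in the usage lines is made up.

```python
def cached_cost(lookup_rows, input_rows, fetch_per_row, probe_per_row):
    """Seconds to read the whole lookup table once, then probe memory per input row."""
    return lookup_rows * fetch_per_row + input_rows * probe_per_row


def uncached_cost(input_rows, query_round_trip):
    """Seconds for one database query per input row."""
    return input_rows * query_round_trip


def break_even_inputs(lookup_rows, fetch_per_row, probe_per_row, query_round_trip):
    """Number of input rows above which caching is cheaper (None if never)."""
    saving_per_row = query_round_trip - probe_per_row
    if saving_per_row <= 0:
        return None
    return lookup_rows * fetch_per_row / saving_per_row


# Illustrative, made-up timings in seconds.
FETCH = 0.0001    # reading one row while building the cache
PROBE = 0.000001  # one in-memory probe

# 50K-row lookup table, 5 lookups, fast connection (5 ms per query).
fast_uncached = uncached_cost(5, 0.005)
fast_cached = cached_cost(50_000, 5, FETCH, PROBE)

# Same table, 50 lookups, slow connection (200 ms per query).
slow_uncached = uncached_cost(50, 0.2)
slow_cached = cached_cost(50_000, 50, FETCH, PROBE)
```

With these assumed timings the uncached lookup is cheaper in the first case and the cached lookup in the second, matching the reasoning above. The break-even point moves with the connection delay, which is why the answer varies case by case.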
What are the different Lookup methods used in Info...
The Lookup transformation has two main types: 1) connected and 2) unconnected.
Connected lookup: 1) It receives values directly from the pipeline. 2) It can use both dynamic and static caches. 3) It returns multiple values. 4) It supports user-defined default values.
Unconnected lookup: 1) It receives values from a :LKP expression. 2) It can use only a static cache. 3) It returns only a single value. 4) It does not support user-defined default values.
Re: How do you create single lookup transformation using multiple tables? Answer #1
Lookup transformation: based on one or more keys, data is retrieved from one or more tables. Create a single Lookup transformation by joining the multiple tables, connecting the keys defined in the Lookup transformation.
Latha
Re: How do you create single lookup transformation using multiple tables? Answer #3
The Lookup transformation has a lookup SQL override query. Use the SQL query to join the tables you look up on. This is similar to what you do in the Source Qualifier.
Kalyan
Re: How do you create single lookup transformation using multiple tables? Answer #4
You cannot join two tables in a lookup. A lookup works only on one underlying table.
Murali Vishnuvajhala
Re: How do you create single lookup transformation using multiple tables? Answer #5
Join the tables in the database itself and then do the lookup, or else override the lookup SQL and look up. I believe this would work.
Bsgsr
Re: How do you create single lookup transformation using multiple tables? Answer #6
Apologies for giving a wrong answer earlier (Answer #4). You can actually join multiple tables in the lookup. Here are the steps:
1. Click on the Lookup transformation.
2. Click on the "Skip" button to the right.
3. A green Lookup transformation will appear without any ports.
4. Put your query in the SQL override.
5. Make sure you specify the columns you selected in your query as ports IN THE SAME ORDER.
6. Your lookup is ready.
NOTE: When you specify the query, make sure you specify column aliases for each column, or else you will get an invalid lookup error at run time.
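The shape of such an override query can be tried out with Python's standard `sqlite3` module. This is a sketch under assumptions, not output from PowerCenter: the DEPT table, the DEPTNO column, the sample rows, and the names `LOOKUP_PORTS` and `OVERRIDE_SQL` are hypothetical, and SQLite only stands in for the real lookup database.

```python
import sqlite3

# In-memory stand-in for the lookup database; tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP  (EMPNO INTEGER, ENAME TEXT, DEPTNO INTEGER);
    CREATE TABLE DEPT (DEPTNO INTEGER, DNAME TEXT);
    INSERT INTO EMP  VALUES (7369, 'SMITH', 20), (7499, 'ALLEN', 30);
    INSERT INTO DEPT VALUES (20, 'RESEARCH'), (30, 'SALES');
""")

# Ports defined on the lookup, in the order they appear in the transformation.
LOOKUP_PORTS = ["EMPNO", "ENAME", "DNAME"]

# Shape of a lookup SQL override: join inside the query, alias every column,
# and keep the select list in the same order as the lookup ports.
OVERRIDE_SQL = """
    SELECT e.EMPNO AS EMPNO,
           e.ENAME AS ENAME,
           d.DNAME AS DNAME
    FROM EMP e
    JOIN DEPT d ON d.DEPTNO = e.DEPTNO
"""

cursor = conn.execute(OVERRIDE_SQL)
returned_columns = [col[0] for col in cursor.description]
lookup_cache = {row[0]: row[1:] for row in cursor}
```

Comparing `returned_columns` with `LOOKUP_PORTS` before deploying the override is a cheap way to catch a port-order or alias mismatch of the kind the note above warns about.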