
Interview Questions Abinitio

The document contains answers to various questions about Ab Initio concepts. Key points include: - The relationship between the EME, GDE, and cooperating system (Ab Initio server). The EME acts as a metadata repository, the GDE is the user environment for designing graphs, and the cooperating system is where the Ab Initio server is installed. - The differences between aggregation and rollup. Rollup is more convenient for summarizing data and can display intermediate results, while aggregation does not support intermediate results. - The types of layouts supported in Ab Initio, including serial and parallel layouts which allow components to run sequentially or in parallel depending on data parallelism.

Question.1 What is the relationship between the EME, the GDE and the Co>Operating System?

Answer: EME stands for Enterprise Meta>Environment, GDE for Graphical Development Environment,
and the Co>Operating System can be thought of as the Ab Initio server. They relate as follows.
The Co>Operating System is the Ab Initio server. It is installed on a particular operating
system platform, which is called the native OS. The EME is a repository, much like the
repository in Informatica: it holds metadata, transformations, database configuration files, and
source and target information. The GDE is the end-user environment in which graphs are developed
(a graph is comparable to a mapping in Informatica). A designer builds graphs in the GDE and
saves them to the EME or to a sandbox. The GDE is on the user side, whereas the EME is on the
server side.

Question.2 What is the use of aggregation when we have rollup? As we know, the Rollup
component in Ab Initio is used to summarize groups of data records. Then where would we use
aggregation?

Answer: Aggregate and Rollup can both summarize data, but Rollup is much more convenient to
use, and it makes it much clearer how a particular summarization is being computed. Rollup also
offers extra functionality, such as input and output filtering of records. Both components
perform the same basic action, but Rollup keeps intermediate results in main memory, whereas
Aggregate does not support intermediate results.

Question.3 What kinds of layouts does Ab Initio support?

Answer: Ab Initio supports serial and parallel layouts, and a graph can use both at the same
time. A parallel layout depends on the degree of data parallelism. If the multifile system is
4-way parallel, a component in a graph can run 4-way parallel, provided its layout is defined
with that same degree of parallelism.

Question.4 How can you run a graph infinitely?

Answer: To run a graph infinitely, the end script of the graph should call the graph's own .ksh
file. For example, if the graph is named abc.mp, its end script should contain a call to
abc.ksh. Each run then starts the next one, so the graph runs indefinitely.

Question.5 How do you add default rules in a transform?

Answer: Double-click the transform parameter on the Parameters tab of the component's
properties. This opens the Transform Editor. In the Transform Editor, open the Edit menu and
select Add Default Rules. Two options are offered: 1) Match Names 2) Wildcard.

Question.6 Do you know what a local lookup is?

Answer: If your lookup file is a multifile that is partitioned and sorted on a particular key,
the lookup_local function can be used instead of the plain lookup function. The search is then
local to one partition, which is determined by the key.
A Lookup File consists of data records that can be held in main memory. This lets a transform
function retrieve the records much faster than it could from disk, and it allows the transform
component to process the data records of multiple files quickly.
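
The idea of a partition-local lookup can be sketched in Python. This is only an illustration under stated assumptions, not Ab Initio code: partition_of is a hypothetical stand-in for the real partitioner, and the function names lookup_local_sketch and lookup_sketch are invented for this example.

```python
def partition_of(key, n_partitions):
    # Hypothetical, deterministic stand-in for a hash partitioner
    # (this is NOT the hash that Ab Initio uses).
    return sum(key.encode("utf-8")) % n_partitions


def build_partitioned_lookup(records, key_field, n_partitions):
    """Split lookup records into one in-memory dict per partition."""
    partitions = [dict() for _ in range(n_partitions)]
    for rec in records:
        key = rec[key_field]
        partitions[partition_of(key, n_partitions)][key] = rec
    return partitions


def lookup_local_sketch(partitions, partition_index, key):
    """Search only one partition's records, as a local lookup does."""
    return partitions[partition_index].get(key)


def lookup_sketch(partitions, key):
    """Search all partitions, as a non-local lookup conceptually does."""
    for part in partitions:
        if key in part:
            return part[key]
    return None
```

The local variant only finds a record when the caller is working in the partition that the key was assigned to, which is why the data flowing through the graph has to be partitioned on the same key as the lookup file.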

Question.7 What is the difference between a lookup file and a lookup, with a relevant
example?

Answer: A Lookup File generally represents one or more serial files (flat files). The amount of
data is small enough to be held in memory, which allows transform functions to retrieve records
much more quickly than they could from disk.

A lookup is a component of an Ab Initio graph in which data can be stored and then retrieved by
using a key parameter. A lookup file is the physical file in which the data for the lookup is
stored.

Question.8 How many components are in your most complicated graph?

Answer: It depends on the type of components you use. As a rule, avoid using very complicated
transform functions in a graph.

Question.9 Explain what a lookup is.

Answer: A lookup is a keyed dataset. It can be used to map values according to the data present
in a particular file (serial or multifile). The dataset can be static or dynamic (dynamic when
the lookup file is generated in a previous phase and used as a lookup file in the current
phase). Sometimes a hash join can be replaced by a Reformat with a lookup, if one of the inputs
to the join contains a small number of records with a short record length. Ab Initio has
built-in functions to retrieve values from the lookup by key.

Question.10 Have you worked with packages?

Answer: Multistage transform components use packages by default. However, a user can also create
his own set of functions in a transform function and include it in other transform functions.

Question.11 Have you used the Rollup component? Describe how.

Answer: If the user wants to group records on particular field values, Rollup is the best way
to do that. Rollup is a multistage transform and it contains the following mandatory
functions:

• Initialize
• Rollup
• Finalize

You also need to declare a temporary variable if you want to get the count for a particular
group. For each group, the initialize function is called once first, then the rollup function is
called for each record in the group, and finally the finalize function is called once after the
last rollup call.
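
The call order described above can be sketched in Python. This is a minimal illustration, not Ab Initio DML: the three functions mirror the stages by name only, and the field names count, total and key are assumptions made for this example.

```python
def initialize():
    """Create the temporary record for a new group (called once per group)."""
    return {"count": 0, "total": 0}


def rollup(temp, record, amount_field):
    """Fold one input record into the group's temporary record."""
    return {
        "count": temp["count"] + 1,
        "total": temp["total"] + record[amount_field],
    }


def finalize(key, temp):
    """Build the output record for a group (called once, after its last record)."""
    return {"key": key, "count": temp["count"], "total": temp["total"]}


def rollup_sketch(records, key_field, amount_field):
    """Group records by key_field and emit one summary record per group."""
    temps = {}
    for record in records:
        key = record[key_field]
        if key not in temps:
            temps[key] = initialize()
        temps[key] = rollup(temps[key], record, amount_field)
    # Groups are emitted in the order in which their first record was seen.
    return [finalize(key, temp) for key, temp in temps.items()]
```

The temporary record is what carries the running count and total between rollup calls, which is the role of the temporary variable mentioned in the answer.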

Question.12 How do you add default rules in a transform?

Answer: Add Default Rules opens the Add Default Rules dialog. Select one of the following:
Match Names: generates a set of rules that copies input fields to output fields with the same
name. Use Wildcard (.*) Rule: generates one rule that copies input fields to output fields with
the same name.

1) If it is not already displayed, display the Transform Editor Grid.

2) Click the Business Rules tab if it is not already displayed.

3) Select Edit > Add Default Rules.

In a Reformat, if the destination field names are the same as, or a subset of, the source field
names, there is no need to write anything in the reformat xfr, as long as the only goal is to
reduce the set of fields or to split the flow into a number of flows. Rules need to be written
only when a real transformation is required.

Question.13 What is the difference between partitioning by key and round robin?

Answer: Partition by key (hash partition) is a partitioning technique used to partition data
when the keys are diverse. If a single key value occurs in large volume, there can be a large
data skew. Even so, this method is used more often for parallel data processing.

Round-robin partitioning is another technique, which distributes the data uniformly across the
destination partitions. The skew is zero when the number of records is divisible by the number
of partitions. A real-life example is how a pack of 52 cards is dealt among 4 players in a
round-robin manner.
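
The two techniques can be compared with a small Python sketch. It is an illustration only: stable_hash is a hypothetical toy hash chosen so that the result is reproducible, not the function Ab Initio uses.

```python
def stable_hash(key):
    # Hypothetical toy hash: sum of the key's UTF-8 bytes (reproducible across runs).
    return sum(str(key).encode("utf-8"))


def round_robin_partition(records, n_partitions):
    """Deal records to partitions in turn, like dealing cards to players."""
    parts = [[] for _ in range(n_partitions)]
    for i, rec in enumerate(records):
        parts[i % n_partitions].append(rec)
    return parts


def key_partition(records, n_partitions, key_func):
    """Send every record with the same key to the same partition."""
    parts = [[] for _ in range(n_partitions)]
    for rec in records:
        parts[stable_hash(key_func(rec)) % n_partitions].append(rec)
    return parts


# 52 cards dealt to 4 players: every hand gets 13 cards, so the skew is zero.
hands = round_robin_partition(list(range(52)), 4)

# 90 records share the key "x" and 10 share "y": key partitioning keeps each
# key together, so one partition receives 90 records and the data is skewed.
skewed = key_partition(["x"] * 90 + ["y"] * 10, 4, lambda rec: rec)
```

Round robin balances the load but scatters records with the same key, so it does not suit components that need all records for a key in one partition. Key partitioning gives that guarantee at the cost of possible skew.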

Question.14 How do you improve the performance of a graph?

Answer: There are many ways to improve the performance of a graph.

1) Use a limited number of components in any one phase

2) Use an optimal max-core value for sort and join components

3) Minimize the number of sort components

4) Minimize the number of sorted join components and, if possible, replace them with in-memory
joins (hash joins)

5) Use only the required fields in sort, reformat and join components

6) Use phasing or flow buffers with merges and sorted joins

7) If both inputs are huge, use a sorted join; otherwise use a hash join with the proper driving
port

8) For large datasets, don't use broadcast as the partitioner

9) Minimize the use of regular expression functions such as re_index in transform functions

10) Avoid repartitioning data unnecessarily

Try to keep the graph running in MFS for as long as possible. For this, the input files should
be partitioned and, if possible, the output file should be partitioned as well.

Question.15 How do you truncate a table?

Answer: From Ab Initio, either use the Run SQL component with the DDL statement "truncate
table", or use the Truncate Table component.

Question.16 Have you ever encountered an error called “depth not equal”?

Answer: When two components are linked together and their layouts do not match, this problem
can occur during compilation of the graph. A solution is to place a partitioning component
between them wherever the layout changes.

Question.17 What function would you use to convert a string into a decimal?

Answer: No specific function is required if the string and the decimal have the same size. A
decimal cast with the size in the transform function will suffice. For example, if the source
field is defined as string(8) and the destination as decimal(8) (say the field name is field1):

out.field1 :: (decimal(8)) in.field1;

If the destination field is smaller than the input, string_substring can be used as follows.
Say the destination field is decimal(5):

out.field1 :: (decimal(5)) string_lrtrim(string_substring(in.field1, 1, 5)); /* string_lrtrim trims leading and trailing spaces */
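
For readers who want to try the same logic outside Ab Initio, here is a rough Python equivalent of the second rule above. It is a sketch under assumptions: the function name string_to_decimal is invented for this example, and Python's Decimal does not enforce a field width the way a DML decimal(5) type does.

```python
from decimal import Decimal


def string_to_decimal(text, width):
    """Keep the first `width` characters, trim spaces, and convert to Decimal."""
    # text[:width] plays the role of string_substring(in.field1, 1, width);
    # .strip() plays the role of string_lrtrim.
    trimmed = text[:width].strip()
    # Decimal raises decimal.InvalidOperation if `trimmed` is not numeric.
    return Decimal(trimmed)
```

For example, string_to_decimal("12345678", 5) keeps only "12345" before converting, so the last three digits are dropped rather than rounded.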

Question.18 What are primary keys and foreign keys?

Answer: In an RDBMS, the relationship between two tables is represented as a primary key and
foreign key relationship. The table that holds the primary key is the parent table and the table
that holds the foreign key is the child table. The requirement is that the two tables have a
matching column.

What is the difference between clustered and non-clustered indices? …and why do you use a
clustered index?

Question.19 What is an outer join?

Answer: An outer join is used to select all the records from a port, whether or not they
satisfy the join criteria.
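
A left outer join can be sketched in Python as follows. This illustrates the concept only and is not how the Ab Initio Join component is implemented. The function name and the (left, right) pair output format are assumptions made for this example.

```python
def left_outer_join(left_records, right_records, key):
    """Return (left, right) pairs; right is None when no record matches."""
    # Index the right-hand input by key so each left record is matched quickly.
    right_by_key = {}
    for rec in right_records:
        right_by_key.setdefault(rec[key], []).append(rec)

    joined = []
    for left in left_records:
        matches = right_by_key.get(left[key])
        if matches:
            for right in matches:
                joined.append((left, right))
        else:
            # Outer-join behaviour: keep the unmatched left record anyway.
            joined.append((left, None))
    return joined
```

An inner join would simply skip the else branch. Keeping unmatched records from the other input instead gives a right outer join.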

Ab Initio Interview Questions:

1. What is the difference between scan and rollup?

2. What is the internal code-level difference between scan and rollup?

3. Can you call an aggregate function within an aggregate function?

4. What are output_index and output_indexes?

5. What is a surrogate key?

6. In how many ways can you generate a surrogate key in an Ab Initio graph?

7. Can you generate a surrogate key in serial and parallel layouts in the same way? Explain.

8. Explain the different parameters of the output table component.

9. What is the difference between API mode and utility mode?

10. Have you ever done any performance improvement in your project? Can you explain how?

11. What are the different rules of thumb for improving performance?

12. If only updates and inserts are coming in a file, how do you incorporate them into a table
without using a join component?

13. What are the different types of join you can do using a join component?

14. How do you do right outer and left outer joins in Ab Initio?

15. What are the differences between join and lookup?

16. How does the use of a lookup improve performance over a join?

17. If a checkpoint is needed after a sort component, why is it recommended to use the
checkpoint sort component instead of placing a checkpoint after a sort component?

18. What is the significance of the driving port in a join component?

19. What is the lookup_local function?

20. In which scenario can you use lookup_local instead of the lookup function?

21. What are vectors in Ab Initio?

22. How can you initialize 1000 vector elements with a certain value?

23. What are checkpoints and phases?

24. What is AB_WORK_DIR? Explain it.

25. What is two-stage routing? When should two-stage routing be used?

26. What is the difference between broadcast and replicate? Can you use broadcast as a replicate
and vice versa?

27. Can you explain data parallelism, pipeline parallelism and component parallelism?

28. Can you give an example of a graph where pipeline and data parallelism occur simultaneously?

29. When is pipeline parallelism broken?

30. When is ablocal used?

31. What is the use of ablocal_expr? Give an example.

32. What is skew? What is the significance of skew?

33. What is component folding?

34. What do you mean by PDL?

35. If you have enabled component folding, how do you run the graph? Explain.

36. What is a deadlock in Ab Initio and why does it happen?

37. What steps do you need to follow to avoid a deadlock situation?

38. What is flow buffering?

39. If you specify 1 or 0 in the select_expr of a filter by expression component, what will happen?

40. How do you test, from the back end, any DML expressions that you will use in the graph?

41. How do you test a dbc file from the back end?

42. Explain some Ab Initio commands.

43. What is the order of execution of parameters in an Ab Initio graph?

44. What are the different types of parameters in Ab Initio?
