
Abinitio Interview Questions

The document provides detailed instructions and solutions for various tasks in Ab Initio, including extracting specific records, partitioning data, and handling duplicates. It also explains the differences between components and functions, such as output index vs. output indexes, and lookup vs. local lookup. Additionally, it covers graph execution, dependency analysis, and the use of tags in project management.
Q: How do you get only the 5th to 7th records from a static text file containing 10 records using Ab Initio?
One solution uses the "Leading Records" and "Filter by Expression" components.
Step1: Feed the input records to the "Leading Records" component.
Step2: Set the "num_records" parameter to 7. The component then passes only the first seven records.
Step3: Feed the output of Leading Records to Filter by Expression, with the expression set to "next_in_sequence() >= 5".
Step4: The output contains records 5 to 7.
Step5: Feed this output to an output file.
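The same two-stage logic can be sketched outside Ab Initio. The following is a Python simulation, not DML; the function name records_5_to_7 and the sample data are hypothetical and only illustrate "take the first 7, then keep sequence numbers of 5 and above".

```python
from itertools import islice

def records_5_to_7(records):
    """Keep records 5..7 (1-based) from an iterable of records."""
    # Leading Records stage: pass only the first 7 records.
    leading = islice(records, 7)
    # Filter by Expression stage: number the records from 1 in arrival
    # order and keep those whose sequence number is at least 5.
    return [rec for seq, rec in enumerate(leading, start=1) if seq >= 5]

sample = [f"record-{n}" for n in range(1, 11)]  # hypothetical 10-record input
selected = records_5_to_7(sample)
print(selected)  # ['record-5', 'record-6', 'record-7']
```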
Q: You have to divide one file of 100 records into 5 files of 20 records each, so that the first 20 records go to the first file, the next 20 to the second, and so on. How?
One solution uses the "Partition by Round-robin" component:
Step1: Feed the input records to the "Partition by Round-robin" component.
Step2: Set the component's "block size" parameter to 20, so that the first 20 records go to the first partition, the next 20 to the second, and so on.
Step3: Connect the output of "Partition by Round-robin" to five different files.
Step4: Each output file will have 20 records.
OR
Step1: Use a Reformat with a global variable "count", initialized to 0, that is incremented by 1 for each record up to 20 and then reset to 1 for the next record.
Step2: Also add a new field, "file_name", whose value changes after every 20th record.
Step3: Feed the Reformat output to the "Write Multiple Files" component, which writes each record to the file named by the "file_name" field added in the Reformat.
Step4: Each output file will have 20 records.
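To see why a block size of 20 puts the first 20 records in the first file, here is a small Python simulation of block-wise round-robin distribution. It is a sketch of the idea only, not the component's implementation; partition_round_robin and its parameters are hypothetical names.

```python
def partition_round_robin(records, partitions, block_size):
    """Deal records to partitions in blocks, cycling through the partitions."""
    outputs = [[] for _ in range(partitions)]
    for position, rec in enumerate(records):
        # Records 0..19 form block 0, records 20..39 form block 1, and so on;
        # block b goes to partition b modulo the number of partitions.
        target = (position // block_size) % partitions
        outputs[target].append(rec)
    return outputs

files = partition_round_robin(range(1, 101), partitions=5, block_size=20)
print([len(f) for f in files])    # [20, 20, 20, 20, 20]
print(files[0][0], files[0][-1])  # 1 20
```

With a block size of 1 the same function deals records one at a time, which is the usual round-robin behaviour and would not keep the first 20 records together.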
Q: What is the difference between output index and output indexes in the Reformat component of Ab Initio?
Questions like this test a basic understanding of the Reformat component.
Output Index:
Suppose the input file has 100 records. In the Reformat parameters you choose the number of outputs you want, and output index decides which single output each input record goes to. The total number of output records stays at 100, the same as the input. Only one transform is applied to a given record, which means an input record can go to only one output port.
Output Indexes:
With output indexes, an input record can be passed to multiple output ports. Multiple transforms can therefore be applied to the same record, and the output record count can be greater than the input record count.

Output Index                                        Output Indexes
Can pass records to different output ports.         Can pass records to multiple output ports.
One record can be passed to one output port only.   One record can be transferred to multiple output ports.
Cannot increase the number of records.              Can increase the number of records, depending on the number of output ports.
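The difference in record counts can be illustrated with a short Python simulation. This is not Reformat itself; the reformat function below and its callback parameters are hypothetical stand-ins for a routing function that returns one port number versus one that returns a list of port numbers.

```python
def reformat(records, num_outputs, output_index=None, output_indexes=None):
    """Route each record to one port (output_index) or several (output_indexes)."""
    outputs = [[] for _ in range(num_outputs)]
    for rec in records:
        if output_indexes is not None:
            targets = output_indexes(rec)  # a list of port numbers
        else:
            targets = [output_index(rec)]  # exactly one port number
        for port in targets:
            outputs[port].append(rec)
    return outputs

records = list(range(1, 11))
# One port per record: even values to port 0, odd values to port 1.
single = reformat(records, 2, output_index=lambda r: r % 2)
# Several ports per record: everything to port 0, values above 5 also to port 1.
multi = reformat(records, 2, output_indexes=lambda r: [0, 1] if r > 5 else [0])
single_total = sum(len(port) for port in single)
multi_total = sum(len(port) for port in multi)
print(single_total, multi_total)  # 10 15
```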

Q: How can we run an Ab Initio graph infinitely?
One possible solution is to use the end script of the graph. Within the end script we can call the graph or the graph's ksh. This way, when the graph finishes, it starts again and so runs indefinitely.
OR
We can use a plan. In the plan we add a subplan, and inside it a graph task that calls the graph. The subplan is then configured as a "while loop" with the condition "True". With the condition always true in the subplan properties, the process runs in a loop again and again.
Q: What is the difference between lookup and lookup_local?
Both the lookup and lookup_local functions look up data in a lookup file based on keys.
The lookup_local function is used when the lookup file is partitioned.
It is appropriate only when you look up on the same key on which the lookup file is partitioned.
Q: What is the function of lookup_last in Ab Initio?
The lookup_last function returns the last record in the lookup file that matches the lookup keys.
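The "first match versus last match" distinction can be sketched in Python. This is a simulation over an in-memory list, not the DML functions; the function bodies, the sample data, and the choice to return None when nothing matches are all assumptions made for illustration.

```python
def lookup(lookup_file, key, value):
    """Return the first record whose key field equals value, else None."""
    for rec in lookup_file:
        if rec[key] == value:
            return rec
    return None

def lookup_last(lookup_file, key, value):
    """Return the last record whose key field equals value, else None."""
    match = None
    for rec in lookup_file:
        if rec[key] == value:
            match = rec  # keep overwriting so the last match wins
    return match

# Hypothetical lookup data with a duplicate key.
cities = [
    {"cust_id": 1, "city": "Boston"},
    {"cust_id": 2, "city": "LA"},
    {"cust_id": 1, "city": "Chicago"},
]
first_city = lookup(cities, "cust_id", 1)["city"]
last_city = lookup_last(cities, "cust_id", 1)["city"]
print(first_city, last_city)  # Boston Chicago
```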
Q: How do you calculate the number of vowels in a string using an Ab Initio graph?
One solution is as follows:
Step1: Feed the input records to the "Redefine Format" component.
Step2: With Redefine Format, read the string one letter at a time using a DML such as "string(1)". Each letter is then output as a separate record, so the number of records equals the number of letters in the string.
Step3: Feed this output to a "Filter by Expression" component that keeps only vowels, with an expression such as: in.letter member [vector "A", "E", "I", "O", "U"] (here "letter" stands for the string(1) field).
Step4: The output contains only vowel records. Feed it to a Rollup keyed on that same field and use the count function.
Step5: The Rollup output gives each vowel and its count.
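The pipeline above (split into letters, filter vowels, count per vowel) can be mirrored in a few lines of Python. This is a simulation, not a graph; count_vowels is a hypothetical name, and upper-casing each letter before the membership test is an added assumption so that lowercase vowels are counted too.

```python
from collections import Counter

VOWELS = {"A", "E", "I", "O", "U"}

def count_vowels(text):
    """Return a count per vowel, treating upper and lower case alike."""
    letters = list(text)  # Redefine Format stage: one record per letter
    # Filter by Expression stage: keep only vowels (upper-cased first).
    vowels = [ch.upper() for ch in letters if ch.upper() in VOWELS]
    # Rollup stage: group by letter and count.
    return Counter(vowels)

counts = count_vowels("Ab Initio")
total = sum(counts.values())
print(dict(counts), total)  # {'A': 1, 'I': 3, 'O': 1} 5
```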
Q: How do you reverse a string using Ab Initio?
There are many ways to solve this interview question, but Ab Initio also provides a simple one: the slice option.
Suppose the input string is "Account".
You can reverse it with the slice syntax as below:
in.data[::-1]
In the above statement, "in.data" holds the string being reversed. The output will be "tnuoccA".
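For comparison, the same reversal can be written in Python, once with slice notation and once with an explicit loop of the kind you would fall back on where slicing is not available. Both functions are illustrative Python, not DML, and their names are hypothetical.

```python
def reverse_with_slice(text):
    """Reverse a string with a negative-step slice."""
    return text[::-1]

def reverse_with_loop(text):
    """Reverse a string by walking the positions from last to first."""
    reversed_chars = []
    for position in range(len(text) - 1, -1, -1):
        reversed_chars.append(text[position])
    return "".join(reversed_chars)

print(reverse_with_slice("Account"))  # tnuoccA
print(reverse_with_loop("Account"))   # tnuoccA
```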
Q: What are the roles of the Co-Op, or Co>Operating System, in Ab Initio?
Below are the roles of the Co-Op:
Manages and runs Ab Initio graphs and controls ETL processes.
Provides Ab Initio extensions to the operating system.
Handles ETL processing along with monitoring and debugging.
Handles metadata management and interaction with the EME.
Q: What is an ICFF file in Ab Initio and when would you use one?
An ICFF, or Indexed Compressed Flat File, is a kind of lookup file that can store a very high volume of data while still providing quick access to individual records.
Advantages:
# A normal lookup file is limited in how much data it can hold, but an ICFF can hold a high volume of data without overloading physical memory.
# It requires less disk space than a database.
# High transaction speed.
# High performance compared to a database.
Working of ICFF:
An ICFF is actually a combination of two files: a data file and an index file.
The data file stores the data in blocks, and the index file contains pointers back to the individual data blocks.
During a lookup operation most of the compressed lookup data stays on disk, and the graph loads only the small index file into memory.
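The "small index in memory, compressed blocks elsewhere" idea can be shown with a toy model in Python. This is not the ICFF file format: build_icff, icff_lookup, the use of JSON and zlib, and the choice of storing each block's first key in the index are all assumptions made purely to illustrate the principle.

```python
import bisect
import json
import zlib

def build_icff(records, key, block_size):
    """Return (data_blocks, index): compressed blocks plus each block's first key."""
    ordered = sorted(records, key=lambda r: r[key])
    data_blocks, index = [], []
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]
        index.append(block[0][key])  # small entry kept in memory
        data_blocks.append(zlib.compress(json.dumps(block).encode("utf-8")))
    return data_blocks, index

def icff_lookup(data_blocks, index, key, value):
    """Find a record with the given key, decompressing only one block."""
    pos = bisect.bisect_right(index, value) - 1  # last block starting at or before value
    if pos < 0:
        return None
    block = json.loads(zlib.decompress(data_blocks[pos]).decode("utf-8"))
    for rec in block:
        if rec[key] == value:
            return rec
    return None

# Hypothetical data with fixed-length, non-null keys.
customers = [{"id": f"{n:04d}", "name": f"cust-{n}"} for n in range(1, 101)]
blocks, index = build_icff(customers, "id", block_size=10)
found = icff_lookup(blocks, index, "id", "0042")
print(len(index), found)  # 10 {'id': '0042', 'name': 'cust-42'}
```

Only the 10-entry index is searched directly; a lookup touches a single compressed block rather than the whole data set.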
ICFF Keys:
# Should have a fixed length
# Should not be null
Q: How do you create an ICFF file?
An ICFF is created with the "Write Block Compressed Lookup" component of Ab Initio.
This component has one input and two outputs.
An ICFF is a combination of two files: a data file and an index file.
Input data is fed to the input port. On the output side, the data file is connected to the output data port and the index file to the out index port.
The data is then written to the data file and the indexes to the index file.
The DML of the data file is the DML of the data; the index file has a DML of void.
You have to configure the "Write Block Compressed Lookup" component with the keys on which you want to create the ICFF.

Q: How do you convert 2-way partitioning to 8-way partitioning?
This can be done with a partitioning component and an all-to-all flow.
Step1: Feed the incoming 2-way partitioned data to a partitioning component that fits your requirement. If random distribution is fine, use "Partition by Round-robin"; if it must be key based, use "Partition by Key".
Step2: Note that the layout of the new partitioning component should be the layout of the new 8-way partitioned path.
Step3: Enable the "all to all" flow on the output flow. The rest of your components should also use the 8-way layout. In this way we achieve the desired result.
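A key-based repartition from 2 ways to 8 ways can be simulated in Python as follows. This is a sketch of the concept, not of the component; partition_by_key, the CRC32 hash, and the sample fields "id" and "cust" are hypothetical choices.

```python
import zlib

def partition_by_key(partitions_in, ways, key):
    """Redistribute records from any number of input partitions into `ways` outputs."""
    outputs = [[] for _ in range(ways)]
    for part in partitions_in:  # all-to-all: every input can feed every output
        for rec in part:
            # A deterministic hash of the key picks the target partition,
            # so equal keys always land in the same partition.
            target = zlib.crc32(str(rec[key]).encode("utf-8")) % ways
            outputs[target].append(rec)
    return outputs

# Hypothetical 2-way input: even ids in one partition, odd ids in the other.
two_way = [
    [{"id": n, "cust": n % 10} for n in range(0, 100, 2)],
    [{"id": n, "cust": n % 10} for n in range(1, 100, 2)],
]
eight_way = partition_by_key(two_way, ways=8, key="cust")
print(len(eight_way), sum(len(p) for p in eight_way))  # 8 100
```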
Q: You have records with the fields item-id, item-name, purchase-1 and purchase-2. You have to create two records if purchase-2 is a non-zero value, otherwise only a single record, with the help of the GDE in Ab Initio. How?
This can be done with a Normalize component in Ab Initio.
Step1: Feed the incoming data to the Normalize component.
Step2: Decide the length for the Normalize component. We can use if-else within the length function: if purchase-2 is a non-zero value the length is 2, otherwise 1. This means a record with a non-zero purchase-2 produces 2 records, and any other record produces only one.
Step3: The output of the Normalize gives the desired result.
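The length-then-emit behaviour can be simulated in Python. This is not the Normalize component; the normalize function, the underscore field names, the single "purchase" output field, and the sample items are all hypothetical.

```python
def normalize(records):
    """Emit one output record per purchase, skipping a zero purchase_2."""
    out = []
    for rec in records:
        # Length function: 2 outputs if purchase_2 is non-zero, else 1.
        length = 2 if rec["purchase_2"] != 0 else 1
        # Normalize function: called once per index from 0 to length - 1.
        for index in range(length):
            amount = rec["purchase_1"] if index == 0 else rec["purchase_2"]
            out.append({
                "item_id": rec["item_id"],
                "item_name": rec["item_name"],
                "purchase": amount,
            })
    return out

items = [
    {"item_id": 1, "item_name": "pen", "purchase_1": 10, "purchase_2": 5},
    {"item_id": 2, "item_name": "ink", "purchase_1": 7, "purchase_2": 0},
]
normalized = normalize(items)
print([(r["item_id"], r["purchase"]) for r in normalized])  # [(1, 10), (1, 5), (2, 7)]
```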
Q: How do you get the top two transactions per date using Ab Initio?
In this case we have two columns, "date" and "transaction amount".
Step1: First use the "Sort" component to sort the data on both date and amount: date first, then amount (descending).
Step2: Feed the output to a Scan component with the "date" field as key. Here we can use an extended Scan with a variable rank that is incremented (rank = rank + 1) for each record within the key (the date). As the records are already sorted in descending order of amount, the highest amount gets rank 1, and so on.
Step3: Within the Scan component, use the "output_select" function with "rank <= 2". This gives the desired result.
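The sort, running rank, and rank filter can be simulated in Python. This is an illustration of the logic, not Scan itself; top_two_per_date and the sample transactions are hypothetical.

```python
def top_two_per_date(transactions):
    """Return the two largest transactions for each date, with their rank."""
    # Sort stage: date ascending, then amount descending.
    ordered = sorted(transactions, key=lambda t: (t["date"], -t["amount"]))
    result = []
    current_date, rank = None, 0
    for txn in ordered:
        if txn["date"] != current_date:  # new key group: reset the rank
            current_date, rank = txn["date"], 0
        rank += 1                        # Scan stage: rank = rank + 1
        if rank <= 2:                    # output_select: keep rank <= 2
            result.append({**txn, "rank": rank})
    return result

txns = [
    {"date": "2024-01-01", "amount": 100},
    {"date": "2024-01-01", "amount": 300},
    {"date": "2024-01-01", "amount": 200},
    {"date": "2024-01-02", "amount": 50},
]
top = top_two_per_date(txns)
print([(t["date"], t["amount"], t["rank"]) for t in top])
# [('2024-01-01', 300, 1), ('2024-01-01', 200, 2), ('2024-01-02', 50, 1)]
```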
Q: How do you remove duplicate records of the distance between two cities using Ab Initio?
In this case we have two records, one for "Boston to LA" and another for "LA to Boston". Both records describe the same pair of cities, so we want to keep only a single record and remove the duplicate.
Step1: First use a "Reformat" component with a vector variable as below:
let string("")[] vec1 = []
Append city1 and city2 to this vector using the vector_append function.
Then sort it with vec1 = vector_sort(vec1), and assign vec1[0] to out.city1 and vec1[1] to out.city2.
This gives Boston in city1 and LA in city2 for both records.
Step2: Feed the Reformat output to "Sort" on city1 and city2.
Step3: Feed the sorted output to Dedup on the same keys as the Sort.
The output of Dedup gives the desired result.
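The "normalize the pair, sort, then dedup" approach can be sketched in Python. This is a simulation of the three stages, not graph code; dedup_city_pairs and the distance values in the sample are hypothetical placeholders.

```python
def dedup_city_pairs(records):
    """Treat (A, B) and (B, A) as the same pair and keep one record per pair."""
    normalized = []
    for rec in records:
        # Reformat stage: order the two cities alphabetically.
        city1, city2 = sorted([rec["city1"], rec["city2"]])
        normalized.append({"city1": city1, "city2": city2, "distance": rec["distance"]})
    # Sort stage: order by the normalized pair.
    normalized.sort(key=lambda r: (r["city1"], r["city2"]))
    # Dedup stage: keep the first record of each run of equal keys.
    deduped = []
    for rec in normalized:
        pair = (rec["city1"], rec["city2"])
        if not deduped or (deduped[-1]["city1"], deduped[-1]["city2"]) != pair:
            deduped.append(rec)
    return deduped

routes = [
    {"city1": "Boston", "city2": "LA", "distance": 100},   # placeholder distances
    {"city1": "LA", "city2": "Boston", "distance": 100},
    {"city1": "NYC", "city2": "Boston", "distance": 50},
]
unique_routes = dedup_city_pairs(routes)
print([(r["city1"], r["city2"]) for r in unique_routes])
# [('Boston', 'LA'), ('Boston', 'NYC')]
```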
Q: What is the difference between the "vector_bsearch" and "vector_bsearch_all" functions in Ab Initio?
vector_bsearch uses a binary search to check for the presence of an element. If the element is present, it returns only the first index of the element.
Its syntax is: vector_bsearch(vector_element, search-element, {key})
vector_bsearch_all also uses a binary search, but it returns the first and last index of the element, which reveals duplicate occurrences.
Its syntax is: vector_bsearch_all(vector_element, search-element, {key})
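The "first index" versus "first and last index" distinction can be illustrated with Python's bisect module on a sorted list. These are Python analogues, not the DML functions; the names bsearch_first and bsearch_range, and the -1 and None results for a missing element, are assumptions of this sketch.

```python
from bisect import bisect_left, bisect_right

def bsearch_first(sorted_values, target):
    """Return the first index of target in a sorted list, or -1 if absent."""
    pos = bisect_left(sorted_values, target)
    if pos < len(sorted_values) and sorted_values[pos] == target:
        return pos
    return -1

def bsearch_range(sorted_values, target):
    """Return (first, last) indexes of target in a sorted list, or None if absent."""
    first = bisect_left(sorted_values, target)
    last = bisect_right(sorted_values, target) - 1
    return (first, last) if first <= last else None

values = [1, 3, 3, 3, 7, 9]
print(bsearch_first(values, 3))  # 1
print(bsearch_range(values, 3))  # (1, 3)
print(bsearch_range(values, 4))  # None
```

Both helpers rely on the list already being sorted, which is the precondition of any binary search.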
Q: How many ways parallel should you run in Ab Initio?
The number of ways (2-way, 4-way, 8-way or more) you should run depends on the following:
# Number of CPUs available.
# Number of disk mounts available.
# How much memory is available.
Depending on these factors, the admin can choose the specific degree of partitioning within the project.
Q: What is dependency analysis in Ab Initio?
It is a process or utility provided by Ab Initio by which the journey of data can be traced from start to end.
The EME can examine the project and the data transformations across all processes, i.e. from graph to graph and from component to component.
Q: How to check complexity of a graph or any other object in Ab Initio?
air project analyze-dependencies "relative-path-of-object" -complexity
OR
air project analyze-dependencies "relative-path-of-object" -complexity-details
OR
air sandbox validate-graph "sandbox-path-of-object" -complexity
Q: What is the behaviour of Ab Initio components with a null key?
A) A Join component with a null key results in a Cartesian product.
B) A Rollup component with a null key results in a single record in the output.
C) A Scan component with a null key outputs all the records.
D) A Sort component with a null key outputs all the records as is.
E) A Dedup Sort component with a null key and keep unique results in zero records; with keep first or keep last it gives a single record.
F) A Merge component with a null key results in failure of the process.
Q: What are the common vector functions in Ab Initio?
A) vector_append(vector_element, value)
It adds an element to an existing vector.
list=[vector 1,2,3,4]
vector_append(list, 6) -> [vector 1,2,3,4,6]
B) vector_avg(vector_element)
It gives the average value of the vector elements.
list=[vector 1,2,3,4,5]
vector_avg(list) = 3
C) vector_bsearch(vector_element, search_element)
If the element is found, it returns the index of the element.
list=[vector 1,2,3,4,5]
vector_bsearch(list, 3) -> 2, as 3 is at index 2 counting from 0
D) vector_concat(vector1, vector2)
It combines two vectors.
list=[vector 1,2,3,4,5], list1=[vector 6,7,8]
vector_concat(list, list1) -> [vector 1,2,3,4,5,6,7,8]
Q: We have an Ab Initio plan with multiple tasks executed in sequential order. After some tasks complete, the plan fails. Now we want to rerun or restart the plan from the point where it failed and skip the successfully completed tasks. How?
We can use the "plan-admin skip-on-plan-restart" utility for this. Below is the syntax:
plan-admin abc.rec skip-on-plan-restart taskname
After the above command, run your plan or plan pset again:
air sandbox run plan.pset
abc.rec is the recovery file of the failed plan.
taskname is the task you want to skip when running the pset again. You can get it by running the plan-admin dumpjob command on the recovery file.
The plan-admin skip command should be run when the plan is not running.
Q: How do you check all the objects locked by a particular user?
air lock show -user 'user-id'
Q: How do you identify all the objects impacted by a particular object?
air object uses 'object-eme-path'
Q: How do you check the latest version of an object?
air -branch 'branch-name' object versions 'object-eme-path'
Q: How do you check the difference between two versions of an object?
air -branch 'branch-name' object changed 'object-eme-path' -version1 'version-number' -version2 'version-number' -diff
If the objects are on two different branches:
air object changed 'object-eme-path' -branch1 'branch-name' -version1 'version-number' -branch2 'branch-name' -version2 'version-number' -diff
Q: Important "air tag" commands in Ab Initio.
Creating a project-only tag:
air tag create "tag-name" "project-path" -project-only
Creating an object-level-only tag:
air tag create "tag-name" -file "file-name-with-path" -exact
List all the objects to be tagged in a file and then use the above command, which creates a tag containing only those objects.
Add an object to an existing tag:
air tag create "tag-name" -add "object-path" -exact
Change the version of an object in an existing tag:
air tag change-version "tag-name" "object-path" -version "version-number" -exact
The above command changes to a specific version number. To change to the latest EME version, use the command below:
air tag change-version "tag-name" "object-path" -exact
