Abinitio Interview Questions

This document discusses various m_ commands used to operate on the multifile system in Ab Initio. Some key commands listed include m_ls to list files and directories, m_dump to view file contents, m_touch to create empty files, m_db to access database info, and m_mkfs to create file systems. It suggests running ls m_* in the $AB_HOME/bin directory to see the full set. A list of over 50 m_ commands is also provided.

Uploaded by

Prasad Koorapati
Copyright
© All Rights Reserved

M_commands

m_ls –> lists files and directories in the multifile system.
m_dump –> views file contents without using the GDE.
m_touch –> creates an empty multifile.
m_db –> accesses the relevant database information.
m_mkfs –> creates a multifile system.
m_mkdir –> creates a directory in the multifile system.
m_cat –> displays multifile data.
m_eval –> evaluates Ab Initio expressions and functions.
m_kill –> kills an Ab Initio process.
m_password –> encrypts a password.
m_mp –> executes an Ab Initio graph.

Ab Initio developed wrapper commands to operate on the multifile system. Each one puts an m_ prefix on its Unix equivalent, although some of them have no Unix counterpart. The easiest way to see them all, I would suggest, is to run ls m_* in your $AB_HOME/bin directory.

I have done the same; here is the displayed list:

m_aggregate
m_attach
m_broadcast
m_cat
m_catalog_add
m_catalog_delete
m_catalog_export
m_catalog_import
m_chmod
m_cleanup
m_cleanup_du
m_cleanup_rm
m_cp
m_db
m_db_cleanup
m_db_dump
m_db_env
m_db_layout
m_dbload
m_dbunload
m_df
m_du
m_dump
m_dump_helper
m_dump_helper_sas
m_env
m_eval
m_expand
m_kill
m_ls
m_lscatalog
m_merge
m_mkcatalog
m_mkdir
m_mkfile
m_mkfs
m_mv
m_partition
m_reformat
m_rm
m_rmcatalog
m_rmdir
m_rmfs
m_rollback
m_rtel
m_run_sas
m_sas_dump
m_sas_dump_v1
m_sas_gen_dml
m_sas_transform
m_select
m_sort
m_touch
m_view_errors
m_wc

1. How can you view the data in a multifile?

You can use the m_dump command to view the data in a properly formatted layout:

m_dump <DML Name> <File Name>

2. If I am loading a file of 1 million records and the graph fails after loading 10,000 records, what will happen if we use the rollback command?
When rollback is used and the graph has checkpoints, you are asked whether to re-run the entire graph or continue from the last successful checkpoint. Every checkpoint holds the details needed for this.
(Or)
m_rollback rolls the graph back to its last successful checkpoint, undoing the records loaded after that point.

3. What is the difference between a “look-up” file and “look is up” in Ab Initio?
4. What will be the skew for input file -> partition by key -> partition by round robin -> output file?

5. What is the difference between force_error and force_abort?

force_error can be called by the developer to exit by throwing a user-specified error.
E.g.: if (age < 18) force_error("Age not suitable for Voting")

In the above case the graph fails with exit code 3, and the error log file contains the message:
Age not suitable for Voting

force_abort, by contrast, just fails the graph without any message, also with exit code 3 for failure.
6. How does a deadlock occur?..
It is the condition in which the graph stops processing because of a mutual dependency in the data.

For example:

Consider a Concatenate component with three inputs. Say the first input receives 20 million records, the second receives 1,000 records, and the third receives 500 records. Even though records have arrived at the second and third input ports, Concatenate does not process them until it has received all the input at its first port. So the graph stops processing until the first input port receives all its data. This condition is called DEADLOCK.

This is now minimised (not prevented) by “automated flow buffering”, which provides more workspace in network resource allocation so that processing is faster.

Automated flow buffering is available from version 1.8.


7. How can you force the optimizer to use a particular index?
8. How do you add default rules in a transformer?
9. How is a transaction file different from a sort file?
10. You can ask about - Meta Pivot, Leading Records, Read & Write Multiple Files components? How to ..
Meta Pivot splits records by data fields (columns), converting each input record into a series
of separate output records. There is one separate output record for each field of data in the
original input record. Each output record contains the name and value of a single data field
from the original input record.
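
The Meta Pivot behaviour described above can be pictured with a small Python sketch. This is only an illustration of the idea, not Ab Initio code: the function name meta_pivot, the output field names "name" and "value", and the sample record are all assumptions made for the example.

```python
def meta_pivot(record):
    """Return one {name, value} output record per field of the input record."""
    return [{"name": name, "value": value} for name, value in record.items()]


# Hypothetical three-field input record.
pivoted = meta_pivot({"id": 7, "city": "Pune", "amount": 120.5})
for row in pivoted:
    print(row)
```

One input record with three fields becomes three output records, each holding a single field name and its value.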

Leading Records copies a specified number of records from its in port to its out port, counting from the first record in the input file. We can specify the number of records to copy.

Read Multiple Files reads records from the in port, calls a transform function to get the name of a local file from each input record, reads records from those files, and writes these records to the out port.

Write Multiple Files reads input records and derives a local filename from each single record.
Each input record is written to the designated local file, passing through an optional
transform function.
11. What is decoding and what is NVL?
Decoding evaluates an expression in the manner of an if-then-else structure.

NVL supplies a replacement value for null values.

Ex: NVL(comm, 0)
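
As a rough illustration of the two ideas, here is a Python sketch that mimics how the SQL functions NVL and DECODE behave. The Python helpers nvl and decode are hypothetical stand-ins written for this example; None plays the role of NULL.

```python
def nvl(value, replacement):
    """Return replacement when value is null (None), otherwise value."""
    return replacement if value is None else value


def decode(expr, *args):
    """If-then-else lookup: args are search1, result1, ..., plus an optional default."""
    if len(args) % 2:            # odd count: the last argument is the default
        pairs, default = args[:-1], args[-1]
    else:
        pairs, default = args, None
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if expr == search:
            return result
    return default


print(nvl(None, 0))                                            # null commission becomes 0
print(decode("I", "A", "Active", "I", "Inactive", "Unknown"))  # picks the matching branch
```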
12. What do you mean by a continuous graph? What is its significance?..
We have a continuous graph in which a Generate Records component runs in batch mode, and we need to write everything to the Multipublish queue with a Throttle in between. The requirement is this: if we have 1,000 records and the graph fails partway through, say at the 500th record, the first 499 records should already be loaded into the Multipublish queue, and when we restart the graph it should start from the 500th record, not from the 1st record...

13. What is the significance of the record-required indicator of the Join component? How is it used in Ab Initio graphs?
inner Join
Record-required-0 = True
Record-required-1 = True

Outer Join

Record-required-0 = False
Record-required-1 = False

An inner join outputs only records matched on both inputs, so both indicators must be set to True; the reverse holds for a full outer join.

or

When we specify an explicit join, the user decides which inputs must supply a record for output to be produced. Here is the detailed meaning:

Left Outer Join


Record-required-0 = True
Record-required-1 = false

Right outer Join


Record-required-0 = False
Record-required-1 = True

Full Outer Join
Record-required-0 = False
Record-required-1 = False

Inner Join
Record-required-0 = True
Record-required-1 = True
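
The effect of the two indicators can be sketched in Python. This is a simplified model, not the Join component itself: it assumes that an indicator set to True means that input must supply a record for the key, and the function names and the sample data are invented for the illustration.

```python
def join(in0, in1, key, record_required0, record_required1):
    """Join two lists of dicts on key; a required input must supply a record."""
    out = []
    keys = sorted({r[key] for r in in0} | {r[key] for r in in1})
    for k in keys:
        left = [r for r in in0 if r[key] == k]
        right = [r for r in in1 if r[key] == k]
        if (record_required0 and not left) or (record_required1 and not right):
            continue  # a required side has no record for this key
        for l in left or [None]:
            for r in right or [None]:
                out.append((l, r))
    return out


def joined_keys(pairs, key="id"):
    """Key values that reached the output, taken from whichever side is present."""
    return [(l or r)[key] for l, r in pairs]


in0 = [{"id": 1}, {"id": 2}]
in1 = [{"id": 2}, {"id": 3}]
print("inner      ", joined_keys(join(in0, in1, "id", True, True)))
print("left outer ", joined_keys(join(in0, in1, "id", True, False)))
print("right outer", joined_keys(join(in0, in1, "id", False, True)))
print("full outer ", joined_keys(join(in0, in1, "id", False, False)))
```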

14. What is the difference between developing daily and monthly Ab Initio graphs?
We can manage the daily and monthly execution of graphs by creating jobs in the scheduling system.
Those jobs can be configured according to the execution date and time.
We can also write scripts and create jobs for them that determine when a particular job is triggered by the scheduler, whether daily or monthly.
15. How can you generate a surrogate key? How is it used in an Ab Initio graph?
You can also make use of the Assign Keys component. Otherwise:
next_in_sequence() - for serial files
(next_in_sequence() * number of partitions) + this_partition() - for multifiles
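
To see why the multifile formula yields unique keys, here is a small Python simulation. It assumes the per-partition sequence starts at 1 and partition numbers start at 0; the function surrogate_keys and the sizes used are hypothetical.

```python
def surrogate_keys(records_per_partition, num_partitions):
    """Apply (sequence * number of partitions) + partition number in every partition."""
    keys = []
    for partition in range(num_partitions):                # stands in for this_partition()
        for seq in range(1, records_per_partition + 1):    # stands in for next_in_sequence()
            keys.append(seq * num_partitions + partition)
    return keys


keys = surrogate_keys(records_per_partition=3, num_partitions=4)
print(sorted(keys))
print("all unique:", len(set(keys)) == len(keys))
```

Each partition steps through its own sequence, yet no two partitions can produce the same key because the partition number fills the gap left by the multiplication.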
16. What is dedup with unique-only?..
To explain with an example: suppose you have 10 records whose dedup key values are 10, 10, 20, 30, 30, 30, 30, 40, 40, 50. On deduping with unique-only, you will get 20 and 50 in your output. The remaining records come out of the dup port.
(Or)
keep parameter of the Dedup Sorted component

(choice, required)

Specifies which records the component keeps to write to the out port. Choose one of the following options:

first: keeps the first record of a group
last: keeps the last record of a group
unique-only: keeps only records with unique key values

The component writes the remaining records of each group to the dup port.

Default is first.
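
The three keep options can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not the component's implementation; the function dedup_sorted is hypothetical and, like the real component, expects input already grouped by key.

```python
from itertools import groupby


def dedup_sorted(records, key, keep="first"):
    """Split key-sorted records into (out, dup) lists according to keep."""
    out, dup = [], []
    for _, group in groupby(records, key=key):
        group = list(group)
        if keep == "first":
            out.append(group[0])
            dup.extend(group[1:])
        elif keep == "last":
            out.append(group[-1])
            dup.extend(group[:-1])
        elif keep == "unique-only":
            (out if len(group) == 1 else dup).extend(group)
        else:
            raise ValueError(f"unknown keep option: {keep}")
    return out, dup


values = [10, 10, 20, 30, 30, 30, 30, 40, 40, 50]
out, dup = dedup_sorted(values, key=lambda v: v, keep="unique-only")
print("out:", out)
print("dup:", dup)
```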

17. How can you run a component only under certain conditions?

For this we can make the component conditional. Go to File > Preferences > Conditional; a Condition tab is then added to the properties of the component. Add the condition there to make the component conditional.

18. I have two files. The first file contains a, b, c, d as rows and the second file contains 1, 2, 3, 4 as rows. How do we produce a single a1b2c3d4, and four different rows a1, b2, c3, d4?
We can use the Fuse component.
Or
We can use the Partition by Round-robin component with block size 1.
Or
As another method, we can use the Interleave component, which gathers data in round-robin fashion.
Or
Use the Fuse component to join the two files, which gives this output:
col1 col2
a 1
b 2
c 3
d 4
Then in a Reformat do out.col :: string_concat(col1, col2)
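
The Fuse-then-concatenate approach pairs the n-th record of one file with the n-th record of the other. A minimal Python sketch of that pairing, with the two files represented as hypothetical in-memory lists:

```python
file1 = ["a", "b", "c", "d"]
file2 = ["1", "2", "3", "4"]

# Pair records positionally, then concatenate the two columns of each pair.
rows = [col1 + col2 for col1, col2 in zip(file1, file2)]
single = "".join(rows)

print(rows)
print(single)
```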
19. What is the difference between a .dbc and a .cfg file?
.dbc file: used to establish the connection between the database server and the Ab Initio server.

.cfg file: where all the environment variables are declared for the application and to support multiple environments.
20. How can you count the total number of records in a specified partition of a multifile?
m_cat <mfs file name> | wc -l
gives the record count in the mfs file, provided every record is on its own line.
or
m_wc .dmlfile `partitions multifile`
21. How can we remove the temp files of a failed job if the .rec file is not available? How to use in abinitio graph..
If the .rec file is not present, check the path where the temporary files are created for files ending in .job, or named filename.dat.lock, that were created by your ID, and delete them.
Alternatively, use the m_cleanup command to clean up all temporary files created after the job failed.
22. Can you run a pset using air sandbox run?
Yes, we can use the air sandbox run <psetname> Unix command to run the pset from the Unix environment.
Or
Yes, we can run using a pset:
air sandbox run <psetname.input.pset>
Ex: air sandbox run test.input.pset
23. In MFS, a developer developed with a 2-way layout, but the support team is supporting 4-way on the same records. How is that possible?
Yes, it is possible. The multifile system depth of your system depends on the number of CPUs you have. If you have 1 CPU, the maximum depth of your MFS partition can be 4. In that case you can also create 2-way partitions.
24. EmpId RollNo RollNo2
A t1 se
A1 se tm
I want the output as

A Emp
A TL
A1 SE etc
How can I get this output?
25. What is the ABLOCAL expression and where do you use it in Ab Initio?
We use ABLOCAL(expression) to improve SQL query performance by supplying the name of the large table in the expression. This makes it the driving table.
ABLOCAL is used both for parallel unloads and for determining the driving table in complex queries.
If you use an SQL SELECT statement to specify the source
for Input Table, and if the statement involves a complex
query or a join of two or more tables in an unload, Input
Table may be unable to determine the best way to run the
query in parallel. In such cases, the GDE may return an
error message suggesting you use ABLOCAL(tablename) in the
SELECT statement to tell Input Table which table to use as
the basis for the parallel unload.

To do this, you would put an ABLOCAL(tablename) in the
appropriate place in the WHERE clause in the SELECT
statement, and specify the name of the "driving table"
(often the largest table, but see below) as a single
argument.

When you run the graph, Input Table will replace the
expression "ABLOCAL(tablename)" with the appropriate
parallel query condition for that table.

For example, suppose you want to join two tables,
customer_info and acct_type, and customer_info is the
driving table. You would code the SELECT statement as
follows:

select * from acct_type, customer_info
where ABLOCAL(customer_info) and
customer_info.acctid = acct_type.id

Note that when using an alias for a table, you must tell
ABLOCAL(tablename) the alias name as well.

select * from acct_type, customer_info custinfo
where ABLOCAL(customer_info custinfo) and
custinfo.acctid = acct_type.id
26. I have 10,000 records. I loaded 4,000 records today and need to load records 4,001 - 10,000 the next day. How is this done in Type 1 and how in Type 2?
Simply take a Reformat component and put next_in_sequence() > 4000 in the select parameter.
27. What is the difference between the abinitiorc and .abinitiorc files?..
.abinitiorc is a user configuration file that resides in the user's home directory:

$HOME/.abinitiorc

whereas abinitiorc is a system configuration file that is set up by the system administrator:

$AB_HOME/config/abinitiorc

28. Output for Sort and Dedup Sorted with a NULL key?

Whenever we sort a set of records with a NULL key, the component treats all the records as one group, and the data comes out in input order. Dedup Sorted likewise treats the records as one group, and by default its output is the first record.
E.g. input records:
1,XYZ,100;
2,ABC,700;
5,JJJ,400;
7,KKK,500;
With a NULL key, the Sort component gives this output:
1,XYZ,100;
2,ABC,700;
5,JJJ,400;
7,KKK,500;
Dedup Sorted gives this output:
1,XYZ,100;

Or
The result depends on whether you specify the first or the last record, since the records are not sorted.

ex:
Input
ABC
----------
1ad
4fh
6yg
7gu
2io

Output
a: if selected first record in dedup
ABC
----------
1ad

b: if selected last record in dedup


ABC
----------
2io

Headers and trailers are processed in the same way.

Or
Output for Sort when the key is passed as {}

Ans - The Sort component does not perform any ordering on the data; it passes the input to the output unchanged.

Output for Dedup Sorted when the key is passed as {}

Ans - 1. If your keep parameter is first, it gives the first record of the input.
2. If your keep parameter is last, it gives the last record of the input.
3. If your keep parameter is unique-only, it gives zero records in the output, unless the input file contains only one record, in which case it gives that one record.
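
The empty-key behaviour follows from every record belonging to the same single group. A Python sketch of that reasoning, using the sample records from the first answer; the helper dedup_empty_key is hypothetical, and Python's stable sort stands in for the Sort component.

```python
records = ["1,XYZ,100", "2,ABC,700", "5,JJJ,400", "7,KKK,500"]

# With an empty key every record compares equal, and a stable sort keeps input order.
sorted_records = sorted(records, key=lambda record: ())


def dedup_empty_key(records, keep):
    """All records form one group, so keep decides what survives."""
    if keep == "first":
        return records[:1]
    if keep == "last":
        return records[-1:]
    if keep == "unique-only":
        return list(records) if len(records) == 1 else []
    raise ValueError(f"unknown keep option: {keep}")


print(sorted_records)
for option in ("first", "last", "unique-only"):
    print(option, dedup_empty_key(records, option))
```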

29. What are .air-project-parameters and .air-sandbox-overrides? What is the relation between them?
.air-project-parameters
Contains the parameter definitions of all the parameters
within a sandbox. This file is maintained by the GDE and
the Ab Initio environment scripts.

.air-sandbox-overrides
This file exists only if you are using version 1.11 or a
later version of the GDE. It contains the user's private
values for any parameters in .air-project-parameters that
have the Private Value flag set. It has the same format as
the .air-project-parameters file.

When you edit a value (in GDE) for a parameter that has the
Private Value flag checked, the value is stored in the
.air-sandbox-overrides file rather than the
.air-project-parameters file.
30. How do you create a project (public, private, common, client), and what are the differences between them?
Common objects: objects that can be used in any of the sandboxes within that workspace.
Private objects: objects that are used only within that sandbox.
31. How do we know whether a database is connected in Ab Initio?
We need to execute the following command in Unix:

m_db test <dbcfile path>

Ex: m_db test /home/userid/sandboxpath/db/abc.dbc


32. If I am loading a file of 1 million records and the graph fails after loading 10,000 records, what will happen if we use the rollback command?
When rollback is used and the graph has checkpoints, you are asked whether to re-run the entire graph or continue from the last successful checkpoint. Every checkpoint holds the details needed for this.
m_rollback rolls the graph back to its last successful checkpoint, undoing the records loaded after that point.
33. How do you connect a mainframe to Ab Initio?
There is a utility for this, i.e. the cobol-to-dml utility, or

34. One file contains header, body, and trailer records, with the header in a single row and the trailer in a single row as well. How do we segregate the header, trailer, and body records, and once they are segregated, how do we reverse the body data, i.e. if I have 10 body records, the 10th record should be the first record, the 9th record should be the second, etc.?

35. How can you run a graph continuously without using continuous components?..
One approach is to call the graph's own deployed script from its end script.

For example:
Suppose we have a graph my_graph.mp and its deployed script my_graph.ksh in $AI_RUN.

In the end script of the graph we can call the deployed script as given below:

if [ $mpjret -eq 0 ]; then
    $AI_RUN/my_graph.ksh
else
    echo "Graph Failed"
fi

36. Hello, I need help on passing a parameter to an Oracle stored procedure. I am an Oracle DBA, need to us..
Use the Run SQL component:
exec proc-name (parameters)
exec proc-name ('$DATE')
exec proc-name ('20080101')
37. Which components break pipeline parallelism in a graph?
All components that must wait for all of their input records before producing output break pipeline parallelism, for example Sort, Rollup, and Scan. Sort, for instance, changes the order of the data (ascending or descending), so it cannot emit anything until it has read every record.
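
The difference between a streaming stage and a blocking one can be seen with Python generators. This is only an analogy for pipeline parallelism, not Ab Initio behaviour: the stage names and the trace function are invented, and the event log simply records the order in which work happens.

```python
def trace(use_sort):
    """Run source -> [sort] -> reformat and return the order of events."""
    events = []

    def source():
        for value in (3, 1, 2):
            events.append(f"read {value}")
            yield value

    def sort_stage(stream):
        # sorted() must consume the whole stream before yielding anything.
        yield from sorted(stream)

    def reformat(stream):
        for value in stream:
            events.append(f"reformat {value}")
            yield value

    stream = source()
    if use_sort:
        stream = sort_stage(stream)
    list(reformat(stream))  # drain the pipeline
    return events


print("streaming:", trace(use_sort=False))
print("with sort:", trace(use_sort=True))
```

Without the sort, each record is reformatted as soon as it is read; with the sort in between, every read finishes before the first reformat starts.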
38. What is meant by fencing in Ab Initio?
In the software world, fencing means controlling jobs on a priority basis. In Ab Initio it actually refers to customized phase breaking. A well-fenced graph means that no matter what the source data volume is, the process will not get caught in deadlocks. It effectively limits the number of simultaneous processes.
39. How many types of project can we create in Ab Initio? I mean public, private, etc.?
We can create two types of project in Ab Initio:
1. Private - specific to the application and dependent on the user.
2. Common - to be included in every user sandbox.
Common and public are the same thing; the terms are used interchangeably.
40. What is the difference between a checkpoint and a phase? Say I am loading a file (containing 1 lakh records) and m.

Phases divide the graph into parts that execute one after the other, to reduce complexity and counter deadlocks.
Checkpoints are like intermediate nodes that save the data to disk permanently. We have to delete that data manually if we have checkpoints. If we have a successful checkpoint, we can always roll back and rerun the graph from that point in case of a failure.
41. What is the difference between API mode and utility mode?
In API mode, data inserted into the database obeys all of the database's constraints. API mode interacts with the database on a per-record basis, which means that every record is handled by the database. Utility mode, by contrast, disables the constraints and then inserts the data into the database.
42. How can you run a graph continuously without using continuous components?
One approach is to call the graph's own deployed script from its end script.

For example:
Suppose we have a graph my_graph.mp and its deployed script my_graph.ksh in $AI_RUN.

In the end script of the graph we can call the deployed script as given below:

if [ $mpjret -eq 0 ]; then
    $AI_RUN/my_graph.ksh
else
    echo "Graph Failed"
fi

-----------------
Though this approach mimics the style of a continuous graph, it cannot replace the benefits of one. It can be further enhanced to handle rollback and failures by using the "trap" command with a user-defined Unix script or function.
43. How do you migrate code from development to QA and from QA to production?
List all your objects first.
Then check the code in to the EME.
Create a tag for the objects to be migrated.
Now share the tag and the object list with the migration team, and they will do the migration for you.
Note: developers generally never migrate code themselves.
44. What is the difference between partitioning by key and by round robin?
1) Partition by key needs a key, whereas round robin does not.
2) Round robin always tries to distribute the records equally, whereas partition by key does not.
(Or)
Partitioning by key distributes the data into the multifile partitions according to the value of the key field(s) in the input, while partitioning by round robin distributes data equally among the partitions irrespective of any key field. Round robin ensures an equitable distribution of data among the partitions, while partitioning by key may lead to an inequitable distribution.
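
The contrast can be illustrated with a short Python sketch. It is not how Ab Initio assigns partitions internally: the CRC32-based hash, the function names, and the deliberately skewed sample keys are all assumptions chosen for the example.

```python
import zlib


def partition_by_key(records, key, n):
    """Send each record to the partition chosen by a hash of its key."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[zlib.crc32(str(key(rec)).encode()) % n].append(rec)
    return parts


def partition_round_robin(records, n):
    """Deal records out to the partitions in turn, ignoring their content."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


# Hypothetical skewed input: key "A" dominates.
records = ["A"] * 6 + ["B"] * 2 + ["C", "D"]
by_key = partition_by_key(records, key=lambda r: r, n=4)
round_robin = partition_round_robin(records, 4)

print("by key     :", [len(p) for p in by_key])
print("round robin:", [len(p) for p in round_robin])
```

By key, all six "A" records land in one partition, so that partition is at least three times the size of the round-robin ones; round robin stays balanced but scatters equal keys across partitions.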
45. How does MAXCORE work?
The MAXCORE parameter applies when we use the "In memory: Input need not be sorted" option. It is the maximum memory, in bytes, that the component uses per partition before spilling data to disk. This parameter can be seen in Join, Sort, and Rollup. The default max-core for a Sort component is 10485760 bytes (10 MB).
