How To Create Buildops in DataStage
Buildops:
You can define a Build stage to provide a custom operator that can be executed from
a DataStage parallel job stage. Among the things you specify when defining it is the
code that is executed at the end of the stage (after all records have been processed).
When you have specified the information, and request that the stage is generated, DataStage
generates a number of files and then compiles these to build an operator which the stage executes.
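The generated files include an operator definition file (.opd), a header file (.h), a source
file (.C) and an object file (.O); the ones produced for this example are listed near the end.
Buildops also:
Support RCP
Provide immediate access to the Orchestrate C++ classes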
Properties of Buildops:
Parallel only
Must have at least one input interface and one output interface
Interfaces are static
Partitioning type is Same
The tool consists of a GUI portion and the Unix command buildop.
The BuildOp GUI helps to create the operator definition file (.opd file) and calls the buildop
command to generate the operator executable file.
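Source:
Consider the following file definition (metadata) for your input file (Source):
[Source table definition not preserved; the code in this article reads the columns fno and sno.]
Target:
[Target table definition not preserved; the code in this article writes the columns ans_add, ans_mul, ans_div, ans_exp, org_a, org_b and ans_max.]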
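Let's assume that we want to apply the following transformation rules to the source file in order to
populate our target file:
ans_add: the sum of fno and sno
ans_mul: the product of fno and sno
ans_div: the quotient of fno and sno
ans_exp: the exponential function applied to fno and sno
org_a, org_b: the original values of fno and sno
ans_max: the greater of fno and sno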
- Select build
General Tab:
Type the stage type name (this is the name you will give to your buildop)
Supply the category (the folder where your buildop will reside within stages on your palette)
Supply a short description (this will help others understand what this buildop does)
Execution mode is Parallel by default
Class Name is populated automatically and is the same as your stage type name
Creator tab:
Here you can supply the author name and version number.
Properties Tab:
For example, if you are doing lookups with your buildop, you can specify the key
value property here.
(A) Interfaces:
(1) Input:
The Input tab is used to define the input links coming in to the buildop.
Port Name: you can supply your own port names here (compare port names with the link
names in existing stages, say the Merge stage, which has two input links, master and update).
Auto Read: if you set this to true, you don't need to use macros to read the
input records (discussed later). If you set it to false, you need to use
macros to read the input records.
Table Name: select the table definition (metadata) using the Table Name column. Our
calculator source file metadata is saved under the Saved\Tst_buildop\a directory, so we can
select that here.
RCP: you can enable RCP by selecting true, or leave it false.
(2) Output:
The Output tab is used to define the output links coming out of the buildop.
Port Name: you can supply your own port names here (compare port names with the link
names in existing stages, say the Merge stage, which has two output links, output and reject).
Auto Write: if you set this to true, you don't need to use macros to write the
output records (discussed later). If you set it to false, you need to use
macros to write the output records.
Table Name: select the table definition (metadata) using the Table Name column. Our
calculator target file metadata is saved under the Saved\Tst_buildop\ans directory, so we can
select that here.
RCP: you can enable RCP by selecting true, or leave it false.
(3) Transfer:
The Transfer tab is used to specify how data is transferred from input to output. In our sample
buildop we take one record from the source and write one record to the target, so our input will
be ina (source) and our output will be ans (target). Set the Auto Transfer property to true;
otherwise you have to use transfer macros while coding.
(B) Logic:
(1) Definitions:
All #include and #define statements go under the Definitions tab.
(compare this with our C++ header files from training)
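As a rough illustration of the kind of content the Definitions tab holds, here is a small standalone C++ sketch. MAX_OF and raiseTo are hypothetical names invented for this example; the article does not show what its own Definitions tab contains.

```cpp
#include <cmath>   // an #include of the kind that belongs in Definitions

// A #define of the kind that belongs in Definitions (hypothetical macro).
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))

// Hypothetical helper that relies on the header included above.
inline double raiseTo(double base, double exponent) {
    return std::pow(base, exponent);
}
```

Code in the later tabs could then use MAX_OF or raiseTo without repeating the include or the macro.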
(2) Pre-Loop:
Any code specified in this tab is executed first when the buildop runs, before any records are processed.
In our example we need to read the first record from our source ina; for this task we can use the
buildop macro readRecord(input name).
In our example, readRecord(ina.portid_) reads the first available record from our source file,
where ina is our source (input) name.
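The read-ahead pattern (one read in Pre-Loop, then one more read at the end of every loop pass) can be tried outside DataStage. The following is a minimal standalone C++ sketch, not buildop code: FakeInput, its readRecord and inputDone members, and processAll are hypothetical stand-ins that only imitate the macro behaviour as this article describes it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an input port with Auto Read set to false.
// It is NOT the DataStage API; it only imitates the read-ahead behaviour.
struct FakeInput {
    std::vector<int> rows;   // pretend each int is one record
    std::size_t next = 0;    // index of the next unread row
    int current = 0;         // the record currently available to the code
    bool done = false;       // set when a read finds no more rows

    // Imitates readRecord(): load the next row, or flag end of input.
    void readRecord() {
        if (next < rows.size()) {
            current = rows[next++];
        } else {
            done = true;
        }
    }

    // Imitates inputDone(): true once a read has run past the last row.
    bool inputDone() const { return done; }
};

// Walks the input using the Pre-Loop / Per-Record split and returns the
// records in the order they were seen.
std::vector<int> processAll(FakeInput& in) {
    std::vector<int> seen;
    in.readRecord();              // Pre-Loop: fetch the first record
    while (!in.inputDone()) {     // Per-Record: loop until input is exhausted
        seen.push_back(in.current);
        in.readRecord();          // fetch the next record before looping again
    }
    return seen;
}
```

Without the first read, the loop would start with no record loaded; without the read at the bottom, it would never advance.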
(3) Per-Record:
Actual Code for our Calculator: (refer to Explanation in next section for details)
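while(!inputDone(ina.portid_))
{
    // Arithmetic rules (operand order for the division is assumed)
    ans.ans_add = ina.fno + ina.sno;
    ans.ans_mul = ina.fno * ina.sno;
    ans.ans_div = ina.fno / ina.sno;
    // ans.ans_exp is also set here; its statement is not shown in this listing
    // Keep the original input values
    ans.org_a = ina.fno;
    ans.org_b = ina.sno;
    // Write the greater of the two values to ans_max
    if (ina.fno > ina.sno)
    {
        ans.ans_max = ina.fno;
        writeRecord(ans.portid_);
    }
    else
    {
        ans.ans_max = ina.sno;
        writeRecord(ans.portid_);
    }
    // Read the next record before testing inputDone again
    readRecord(ina.portid_);
}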
prefix[indent] = '\0';
cout << "event id " << prefix;
if (logDetail->eventId < 0)
cout << "unknown";
else
cout << logDetail->eventId << endl;
switch(logDetail->type)
{
case DSJ_LOGINFO:
cout << "INFO";
break;
case DSJ_LOGWARNING:
cout << "WARNING";
break;
case DSJ_LOGFATAL:
cout << "FATAL";
break;
case DSJ_LOGREJECT:
cout << "REJECT";
break;
case DSJ_LOGSTARTED:
cout << "STARTED";
break;
case DSJ_LOGRESET:
cout << "RESET";
break;
case DSJ_LOGBATCH:
cout << "BATCH";
break;
case DSJ_LOGOTHER:
cout << "OTHER";
break;
default:
cout << "Successful execution of Buildop";
break;
}
cout << endl;
cout << "Message" << prefix;
}
Explanation:
while(!inputDone(ina.portid_))
This statement uses the macro inputDone to test for the end of the input, so the while loop
processes the input records in the source until there are no more records.
The addition statement adds the values in the source columns fno and sno and assigns the
result to the output field ans_add.
The multiplication statement multiplies the values in the source columns fno and sno and
assigns the result to the output field ans_mul.
The division statement divides the values in the source columns fno and sno and assigns the
result to the output field ans_div.
The exponential statement applies the exponential function to the values in the source columns
fno and sno and assigns the result to the output field ans_exp.
ans.org_a = ina.fno;
ans.org_b = ina.sno;
These statements copy the original input values to the output columns org_a and org_b.
The if/else statements compare the values in the sno and fno columns of the source and assign
the greater of the two values to the output column ans_max.
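The calculator rules can be checked outside DataStage with a small standalone C++ sketch. It is not buildop code: InRec, OutRec, calculate and runCalculator are hypothetical names, the int column types are an assumption (the table definitions are not shown), the division order fno / sno is assumed, and ans_exp is left out because its statement does not appear in the listing.

```cpp
#include <cassert>
#include <vector>

// Hypothetical mirror of the source columns (types assumed to be int).
struct InRec {
    int fno;
    int sno;
};

// Hypothetical mirror of the target columns, without ans_exp.
struct OutRec {
    int ans_add;
    int ans_mul;
    int ans_div;
    int org_a;
    int org_b;
    int ans_max;
};

// Applies the per-record rules to one input record.
OutRec calculate(const InRec& in) {
    // The article shows no handling for a zero divisor, so this sketch
    // simply requires sno to be non-zero.
    assert(in.sno != 0);
    OutRec out;
    out.ans_add = in.fno + in.sno;
    out.ans_mul = in.fno * in.sno;
    out.ans_div = in.fno / in.sno;   // assumed order: fno divided by sno
    out.org_a = in.fno;
    out.org_b = in.sno;
    out.ans_max = (in.fno > in.sno) ? in.fno : in.sno;
    return out;
}

// Stand-in for the while loop: one output record per input record.
std::vector<OutRec> runCalculator(const std::vector<InRec>& source) {
    std::vector<OutRec> target;
    for (const InRec& rec : source) {
        target.push_back(calculate(rec));
    }
    return target;
}
```

For an input record with fno = 6 and sno = 3, the sketch produces ans_add = 9, ans_mul = 18, ans_div = 2 and ans_max = 6.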
The rest of the code is DataStage API code, which writes the error messages to the DataStage
Director job log (this needs advanced knowledge of DataStage API coding).
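The switch in the logging part of the listing turns a log entry type into a printable label. A minimal standalone sketch of that idea follows. LogKind and logKindLabel are hypothetical names, and the real DSJ_LOG* constants from the DataStage API header are deliberately not reproduced. Unlike the listing, the sketch falls back to "UNKNOWN" for an unrecognized value, which is a choice made for this example only.

```cpp
#include <string>

// Hypothetical local copy of the log entry kinds handled in the switch.
enum class LogKind { Info, Warning, Fatal, Reject, Started, Reset, Batch, Other };

// Returns the label that the switch in the listing prints for each kind.
std::string logKindLabel(LogKind kind) {
    switch (kind) {
        case LogKind::Info:    return "INFO";
        case LogKind::Warning: return "WARNING";
        case LogKind::Fatal:   return "FATAL";
        case LogKind::Reject:  return "REJECT";
        case LogKind::Started: return "STARTED";
        case LogKind::Reset:   return "RESET";
        case LogKind::Batch:   return "BATCH";
        case LogKind::Other:   return "OTHER";
    }
    // Reached only for a value outside the enumerators above.
    return "UNKNOWN";
}
```

Returning the label instead of printing it keeps the mapping separate from the output and makes it easy to check on its own.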
That's it. Your buildop is ready. Now you need to generate the underlying code to make the
buildop ready to execute; to do this, click the Generate button on the GUI.
When you click the Generate button, the GUI creates the necessary files under the applicable
project directory on the Unix box (where the DataStage engine resides), including:
(1) Ingx_Calculator.opd
(2) Ingx_Calculator.h
(3) Ingx_Calculator.C
(4) Ingx_Calculator.O
Clicking the Generate button gives you a confirmation dialog, which confirms that
your buildop has been created and is ready to use.
In your palette you will be able to see the buildop you've just created.
Inside the buildop stage you will see the following mappings that you can define:
Log entries:
Output: