Cob2.Close of Business-Batch - Job.control Errors r13


1. Welcome to the Close Of Business BATCH.JOB.CONTROL & Errors learning unit. This
learning unit will help you understand COB, the different stages of COB, and the
applications in T24 associated with COB.

COB2.Close Of Business BATCH.JOB.CONTROL & Errors-R13 1


After completing this learning unit/course, you will be able to:

Understand the internal working of COB
Understand the working of tSM & tSA
Visualize failure scenarios
Understand the types of errors generated during COB



What you have seen so far:

1. The various stages of COB
2. The fields in the BATCH record
3. tSM and tSA
4. TSA.SERVICE
5. TSA.WORKLOAD.PROFILE
6. TSA.PARAMETER
7. Running COB in phantom and interactive mode
8. Date Change
9. The NS module
10. Monitoring COB using enquiries
11. EB.EOD.ERROR



1. Default Printer specifies the default printer to which all output of the process is
directed. If no printer name has been specified in the field PRINTER.NAME for an
individual job, the printer defined here is used. If this field is left blank and no
printer has been specified for the individual jobs, the default SYSTEM printer
defined in DE.FORM.TYPE is used.

2. Printer Name holds the name of the printer to which the output of the job is
sent. If left blank, the printer specified in the Default Printer field is used.

3. Data holds the input parameters for the job specified in JOB NAME. It usually
contains ENQUIRY.REPORT IDs or REPGEN IDs. In this screen shot, when you
see ENQ followed by EU.FX.PL.TODAY, you might think that this is the name of an
enquiry. It is not: it is the ID of an ENQUIRY.REPORT record, and it should be
specified in this manner in the DATA field.

4. Job Message: If the job results in an error, the error message is stored in
this field. The error message is cleared once the job error has been corrected
and the job's Job Status changes to 2.



1. In T24, most of the COB routines are multi-threaded (not OS multi-threading, but a
simulated multi-threading).

2. A routine is broken up into 3 parts: LOAD, SELECT and RECORD.
   1. LOAD routine - initialization of common variables and parameters
   2. SELECT routine - selects IDs from a file
   3. RECORD routine - contains the actual processing logic

3. The LOAD and RECORD routines are executed by all tSAs, while the SELECT routine is
executed by only ONE tSA. The main job of the SELECT routine is to select all the
IDs that need to be processed. Now, imagine a scenario where interest needs
to be accrued for 1000 accounts as part of a COB job. Would you want every tSA to
perform its own select to get the list of account IDs, and each of them to accrue
interest for all the accounts? Of course not. It is enough if one agent performs
the select and stores the selected IDs in a common file, so that all agents can share
and process the IDs. This is why only one agent performs the select
while the others wait until the select is complete.

4. Breaking one long routine into logical parts and executing it with multiple tSAs
is faster than executing it as a whole with only one tSA.



1. As you know, a multi threaded routine comprises 3 parts. Each part is a separate
routine that does a specific task.
For example: a sample routine ROUTINE1 is made up of three
parts, ROUTINE1.LOAD, ROUTINE1.SELECT and ROUTINE1 (this is called the
record routine).
2. Once the tSAs are started, each of the tSAs executes the .LOAD routine.
3. Then only 1 of the tSAs executes the .SELECT routine while the others wait.
The .SELECT routine populates a LIST file. This list file
contains the actual list of IDs to be processed from the database (the result of the
select statement inside the .SELECT routine).
4. Once the LIST is ready, all the tSAs pick up IDs from the list file and
process them one by one by calling the record routine, i.e. the routine name
itself, ROUTINE1 (without any suffix).

Q. Now, how is this different from a single threaded routine?
Ans. In a single threaded routine, there is just one routine that has
the logic of .LOAD, .SELECT and execution built into it. Therefore, only 1 tSA
processes the whole routine. In a multi threaded routine, the
logic is split across the 3 separate routines, and multiple tSAs can execute
it at the same time, thereby decreasing the total time taken to execute the routine
and increasing throughput.
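
The LOAD / SELECT / RECORD split can be illustrated with a small Python simulation. This is only a sketch of the idea, not T24 code: the three `routine1_*` functions, the in-memory queue standing in for the LIST file, and the use of Python threads as "agents" are all assumptions made for illustration.

```python
import threading
from queue import Queue, Empty

# Hypothetical stand-ins for ROUTINE1.LOAD, ROUTINE1.SELECT and ROUTINE1.
def routine1_load(common):
    common["rate"] = 2                      # initialise common variables

def routine1_select():
    return [f"ACC{i}" for i in range(1, 11)]  # IDs that need processing

def routine1_record(common, account_id, results):
    results[account_id] = common["rate"]    # placeholder for real processing

def run_job(agent_count):
    list_file = Queue()                     # plays the role of the shared LIST file
    select_lock = threading.Lock()
    select_done = threading.Event()
    select_runs = []                        # which agent(s) ran the SELECT
    results = {}

    def agent(agent_no):
        common = {}
        routine1_load(common)               # every agent runs LOAD
        with select_lock:                   # the others wait here during the select
            if not select_done.is_set():    # only the first agent runs SELECT
                select_runs.append(agent_no)
                for key in routine1_select():
                    list_file.put(key)
                select_done.set()
        while True:                         # every agent runs RECORD
            try:
                key = list_file.get_nowait()
            except Empty:
                break
            routine1_record(common, key, results)

    threads = [threading.Thread(target=agent, args=(n,))
               for n in range(1, agent_count + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return select_runs, results
```

Calling `run_job(3)` starts three simulated agents; whichever agent reaches the lock first performs the select, and all three then drain the shared list.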



1. How will I know if a routine is single threaded or multi threaded?
Any job (routine) that is executed during COB has an entry in the
application PGM.FILE. The ID of the record in this application is the name of
the job, and the field TYPE is set to B.

2. If the field BATCH.JOB contains a NULL value or @BATCH.JOB.CONTROL,
then the routine is multi threaded and has 3 parts.
For e.g.: AZ.CYCLE.DATES.LOAD, AZ.CYCLE.DATES.SELECT and
AZ.CYCLE.DATES
NOTE that the .LOAD and the .SELECT routines need not have an entry in
PGM.FILE; they are automatically picked up by the T24 system while executing
the routine.

3. If the field BATCH.JOB for that particular job contains the name of a routine (which can be
the same as the name of the job), then the routine is single threaded. In other
words, if the field BATCH.JOB is neither NULL nor
@BATCH.JOB.CONTROL, then the routine is single threaded.
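
The rule above can be written as a small check. This is a sketch in Python rather than T24 code; the function names are hypothetical, and the handling of a leading `@` on a single threaded routine name is an assumption based on how the BATCH.JOB field is described later in this unit.

```python
def is_multi_threaded(batch_job):
    """True when the PGM.FILE BATCH.JOB value marks a multi threaded job."""
    value = (batch_job or "").strip()
    return value in ("", "@BATCH.JOB.CONTROL")

def routine_parts(job_name, batch_job):
    """List the routines the framework would call for this job."""
    if is_multi_threaded(batch_job):
        # .LOAD and .SELECT are derived from the job name, not from PGM.FILE
        return [job_name + ".LOAD", job_name + ".SELECT", job_name]
    # Single threaded: one routine, named in BATCH.JOB (assumed '@' prefix removed)
    return [batch_job.strip().lstrip("@")]
```

For example, `routine_parts("AZ.CYCLE.DATES", "")` lists the three parts named above, while a job whose BATCH.JOB holds a routine name yields just that one routine.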



1. The first agent (tSA) to be started is called the tSM (T24 Service Manager). The tSM monitors one or more
agent processes called tSA (T24 Service Agent). Every time an agent is started, it first checks if it is the
tSM. If not, it continues to work as an agent.
2. The tSA invokes a routine called S.JOB.RUN, which in turn invokes a routine EB.SORT.BATCH only ONCE.
EB.SORT.BATCH sorts the COB jobs in A, S, R, D, O order.
3. Assume that today is 04-Jan-2010 and the table here contains sample records from the BATCH application.
EB.SORT.BATCH fetches all the processes from the BATCH application whose PROCESS.STATUS field
is not equal to 2, i.e. it picks up all the processes that are not complete. It then sorts the jobs based on the
Batch Stage and the Frequency. Once EB.SORT.BATCH returns the list of jobs to be processed,
S.JOB.RUN invokes a core T24 subroutine called BATCH.JOB.CONTROL, about which you will learn in the
next few slides.
4. In what order do you think EB.SORT.BATCH will give the jobs to each tSA for execution?
5. Each job is given to a tSA in the form of a dynamic array containing the details that you see on the screen
now:
PROCESS.NAME:'_':JOB.NAME:'_':RTN:'_':JOB.DATA:'_':COMPANY.ID:'_':NEXT.RUN.DATE:'_':ACTIVATION.FILE:'_':BATCH.STAGE:'_':JIDX:'_':BATCH.PROCESS.STATUS
E.g.: BNK/AC.START.OF.DAY_ACCOUNT.DEBIT.LIMIT.UPD_BATCH.JOB.CONTROL_ _GB0010001_ _ _D110_1_0
1. PROCESS.NAME - BNK/AC.START.OF.DAY (@ID of the BATCH record)
2. JOB.NAME - ACCOUNT.DEBIT.LIMIT.UPD (JOB NAME in the above BATCH record)
3. RTN - BATCH.JOB.CONTROL (contents of the field BATCH.JOB in the PGM.FILE entry for the above
JOB.NAME)
4. JOB.DATA - (contents of the field JOB.DATA for this job in the BATCH record; empty here)
5. COMPANY.ID - GB0010001
6. NEXT.RUN.DATE - (not needed for a job with FREQUENCY D)
7. ACTIVATION.FILE - (empty here)
8. BATCH.STAGE - D110
9. JIDX - 1 (multi value position of the above job in the BATCH record)
10. BATCH.PROCESS.STATUS - 0
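
To make the layout of this job string concrete, here is a Python sketch that splits the example into named fields and orders jobs by stage. It is an illustration only: `parse_job` and `stage_sort_key` are hypothetical names, the sketch assumes no field value itself contains an underscore, it assumes a stage is one letter followed by digits, and it ignores the Frequency part of the real sort.

```python
FIELD_NAMES = [
    "PROCESS.NAME", "JOB.NAME", "RTN", "JOB.DATA", "COMPANY.ID",
    "NEXT.RUN.DATE", "ACTIVATION.FILE", "BATCH.STAGE", "JIDX",
    "BATCH.PROCESS.STATUS",
]

def parse_job(descriptor):
    """Split an underscore-delimited job string into a field dictionary."""
    parts = [p.strip() for p in descriptor.split("_")]
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))

STAGE_ORDER = "ASRDO"   # the A, S, R, D, O order described above

def stage_sort_key(job):
    """Order by stage letter, then stage number, then position in the record."""
    stage = job["BATCH.STAGE"]
    return (STAGE_ORDER.index(stage[0]), int(stage[1:]), int(job["JIDX"]))

example = ("BNK/AC.START.OF.DAY_ACCOUNT.DEBIT.LIMIT.UPD_BATCH.JOB.CONTROL"
           "_ _GB0010001_ _ _D110_1_0")
job = parse_job(example)
```

Here `job["BATCH.STAGE"]` is `"D110"` and the three empty fields come back as empty strings.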



1. In the previous screen, each of the agents was given a list of jobs to execute according to
the Batch Stage. Now let's understand how each tSA processes these jobs.
2. First, the JOB.PROGRESS field in the F.TSA.STATUS file is updated to 5 (meaning that the tSA
is now going to invoke a core T24 routine called BATCH.JOB.CONTROL).
BATCH.JOB.CONTROL is the heart of the COB process and takes care of the entire
execution logic. This routine is invoked once per job. The tSA first executes the .LOAD
routine. In our example the name of the job is JOB5, so the name of the load routine
is JOB5.LOAD. The LOAD routine initializes all the common variables required by
this routine.
3. Since all the tSAs execute the same job, they need to know where to pick up the contract IDs
in order to process them. The contract IDs are all stored in a LIST FILE.



1. What is a LIST FILE?
A LIST file is one that holds all the IDs of the contracts selected from the database.

2. How did the tSA know which LIST file to use?
Within BATCH.JOB.CONTROL, there is logic to find a free LIST file.

3. This routine loops through F.LOCKING with @ID starting at F.JOB.LIST.1,
and returns the first list file name that is available.

4. If this LIST file does not exist, then BATCH.JOB.CONTROL creates the LIST file.



1. Before deciding which LIST FILE to use, the tSA updates the JOB.PROGRESS field in
F.TSA.STATUS to 3. The value 3 means that the tSA is managing a record in a file
called F.BATCH.STATUS.
What is F.BATCH.STATUS?
This is a file that contains the status of a particular job (within a batch record). A
record ID in this file is the FLAG.ID, i.e. ProcessName-JobName-Multivalue
Position of the job in the Batch record.

2. All the tSAs try to read and lock the record with @ID equal to the FLAG.ID in F.BATCH.STATUS.
One tSA succeeds while the other tSAs wait for the lock to be released. The first time,
this record does not exist in F.BATCH.STATUS. A jBASE command is used to read and
lock the record; this command locks a record even if the record does not exist.

3. The tSA that has acquired the lock on the F.BATCH.STATUS record with @ID equal to the FLAG.ID
performs the list file allocation (refer to the next slide for the list file allocation logic).

4. The value in the second position of the record is checked. If it is not PROCESSING, then the
SELECT has to be performed. A value of PROCESSING signifies that the select is over
and the processing of contract IDs should begin. Since the .SELECT has not yet been
executed for this job, this position is empty.



Step 1: Read a record in F.LOCKING with @ID equal to Process.Name-Job.Name-
Job.Position. Process.Name is the name of the BATCH record, Job.Name is the job
within the process, and Job.Position is the position of the job in the BATCH record.
Together these are called the FLAG.ID (P.N-J.N-M.Vpos). If the record exists, read
the list file name that has been assigned to the job and continue with the select logic.
Step 2: Start a transaction block.
Step 3: If the record does not exist, read and lock a record in F.LOCKING with @ID
equal to Process.Name-Job.Name-Job.Position. The first time, this record does not exist
in F.LOCKING. A jBASE command is used to read and lock the record. This command
locks a record even if the record does not exist. Therefore, the tSA that came in
first obtains the lock on the F.LOCKING record and retains it until the transaction
ends.
Step 4: Get the name of the LIST FILE that the .SELECT routine will populate.
Step 5: The name of the file in this case is F.JOB.LIST.<Seq No>, e.g. F.JOB.LIST.2. If
the LIST FILE does not exist, the routine automatically creates it, thereby
ensuring that the file physically exists.
A record is created in F.LOCKING with @ID equal to the LIST FILE name. The content of
this record is the FLAG.ID. This stops other threads from picking up the same LIST
FILE to populate data into it.
Another record is created in F.LOCKING with @ID equal to the FLAG.ID. The
content of this record is the name of the LIST FILE. This publishes
to the other agents that this particular LIST FILE is to be used for this job.
Step 6: End the transaction block.
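
The allocation steps can be sketched in Python with a dictionary standing in for F.LOCKING and a set standing in for the files on disk. This is an illustrative model only: the function name is hypothetical, and the locking and transaction handling of the real routine are left out.

```python
def allocate_list_file(locking, existing_files, flag_id):
    """Return the LIST file name for flag_id, allocating one if needed.

    locking        : dict modelling F.LOCKING (record ID -> content)
    existing_files : set of LIST file names that physically exist
    """
    # Step 1: another agent may already have allocated a file for this job
    if flag_id in locking:
        return locking[flag_id]

    # Steps 4-5: find the first F.JOB.LIST.<seq> not claimed in F.LOCKING
    seq = 1
    while f"F.JOB.LIST.{seq}" in locking:
        seq += 1
    list_file = f"F.JOB.LIST.{seq}"

    existing_files.add(list_file)    # create the file if it does not exist
    locking[list_file] = flag_id     # stops other jobs from taking this file
    locking[flag_id] = list_file     # tells other agents which file to use
    return list_file
```

Two different jobs receive two different list files, and a second agent asking for the same FLAG.ID gets the name that was already published.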



5. Step 1: Start a transaction block.
6. Step 2: Update the JOB.PROGRESS field in F.TSA.STATUS to 2, which stands for
selecting contracts to populate into the LIST FILE record.
Execute the JOB5.SELECT routine. This routine takes care of populating the
contracts in the LIST FILE record based on the logic described earlier.
7. Step 3: The .SELECT routine can be called multiple times, based on the values in a
common variable CONTROL.LIST. The values here are separated by FM. For each
value a select is performed. The contents of CONTROL.LIST are written to
F.BATCH.STATUS: the record in F.BATCH.STATUS is updated with the
value of CONTROL.LIST, with the first line containing
<Control List Value> VM PROCESSING VM MAX.LIST.ID VM KEYS.PROCESSED. MAX.LIST.ID is a
variable that contains the total number of records in the LIST FILE.
KEYS.PROCESSED is a variable that contains the total contracts present in the LIST
FILE.
8. Step 4: End the transaction block.
Once the transaction block ends, the lock that was held on F.BATCH.STATUS by the
first tSA is released. The other tSA that was waiting for the lock on
F.BATCH.STATUS now gets the lock on the record in F.BATCH.STATUS with
@ID equal to FLAG.ID. However, this tSA now finds that the content of the record
is already set to PROCESSING, which means that some other tSA has already
completed executing the .SELECT routine, so it continues to the next step, i.e.
executing the record routine for each contract.
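
The F.BATCH.STATUS record layout described above can be modelled as a delimited string. The Python sketch below is an assumption-laden illustration: it uses the usual jBASE delimiter values (field marker 254, value marker 253), the function names are hypothetical, and the exact record content in a live system may differ.

```python
FM = chr(254)   # field marker
VM = chr(253)   # value marker

def build_batch_status(control_list, max_list_id, keys_processed):
    """Build the record written after the .SELECT for the first control value."""
    control_list = list(control_list) or [""]   # a job may have no control values
    head = VM.join([control_list[0], "PROCESSING",
                    str(max_list_id), str(keys_processed)])
    return FM.join([head] + control_list[1:])

def select_is_done(record):
    """Check the second value position of the first line for PROCESSING."""
    if not record:
        return False            # record does not exist yet: select still needed
    values = record.split(FM)[0].split(VM)
    return len(values) > 1 and values[1] == "PROCESSING"
```

An agent that obtains the lock after the select would see `select_is_done(record)` return true and go straight to processing.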



A question which you should try and answer at this juncture
Q. What is the output of a select statement?
A. A list of contract ids that are selected from the database.
These contract ids are written into a record of a LIST FILE. During the selection
process, there is a call to a core T24 routine called BATCH.BUILD.LIST. This
routine decides how to write data into the LIST FILE



1. Let's assume the select routine retrieved 2 contract IDs. It called
BATCH.BUILD.LIST, which in turn created 2 records in the LIST FILE, one for
each contract ID.

2. For e.g.: The multithreaded routine that we have written is FT.PROCESS.EOD.
Therefore, this routine has three parts: FT.PROCESS.EOD.LOAD,
FT.PROCESS.EOD.SELECT and FT.PROCESS.EOD.

The FT.PROCESS.EOD.SELECT routine fetches a list of FT contract IDs
from the database. These contract IDs are written into a LIST FILE
record. The diagrams above show you the content of each LIST FILE record.



1. Conceptually, the LIST FILE resulting from the previous select looks like
this.



1. To improve performance, a group of contracts can be clubbed together and written into the
LIST FILE delimited by VMs (Value Markers). This process is called bulking.

2. The client might want to bulk records in a particular job to improve
performance. In that case, the client can open the PGM.FILE entry for that job and enter
the number of contracts to bulk in the field BULK.NO. It is also possible to specify
the bulk number in the parameter to BATCH.BUILD.LIST. However, the value
configured in PGM.FILE takes precedence. The value in this field must be
numeric and between 1 and 200, as the maximum bulk limit is 200.

3. Till now we have seen that every row in the LIST record has only 1 contract ID in
it. A row in a LIST record can hold up to a maximum of 200 contract IDs.
Each contract in the row is delimited by VMs (Value Markers). In this example of a
multi threaded routine CLEAR.WF, the bulk count has been set to 3, so the list file
record has 3 contracts separated by a VM.
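
Bulking amounts to chunking the selected IDs and joining each chunk with value markers. The Python sketch below shows the idea under stated assumptions: `build_list_rows` is a hypothetical helper rather than the BATCH.BUILD.LIST routine, and the value marker is taken to be character 253 as in jBASE.

```python
VM = chr(253)     # value marker
MAX_BULK = 200    # maximum bulk limit quoted above

def build_list_rows(contract_ids, bulk_no=1):
    """Group contract IDs into rows of at most bulk_no IDs, VM-delimited."""
    if not 1 <= bulk_no <= MAX_BULK:
        raise ValueError("BULK.NO must be between 1 and 200")
    return [VM.join(contract_ids[i:i + bulk_no])
            for i in range(0, len(contract_ids), bulk_no)]
```

With seven contract IDs and a bulk number of 3, this yields three rows holding 3, 3 and 1 IDs; with the default of 1, each row holds a single ID.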



1. Conceptually, the LIST FILE resulting from the previous select looks like
this if the bulk number has been set to 3.



Continuing from the previous slides, a tSA has completed the .SELECT routine.
The other tSA, which was waiting for the lock on F.BATCH.STATUS to be released, now
obtains it. It finds that the value in the F.BATCH.STATUS record has been
set to PROCESSING, which means that the .SELECT routine has been
executed. So, both the tSAs are ready to execute the record routine.

1. STEP 1: Both the tSAs check whether the content of F.BATCH.STATUS is set to
PROCESSED. If not, there are contract IDs to be processed. The tSA
updates the JOB.STATUS field in the F.TSA.STATUS record to 1. This value
stands for processing contracts.
2. STEP 2: Initialise a variable that will hold all the LIST records to be processed.
The name of this variable is FULL.LIST.
3. STEP 3: Set the JOB.PROGRESS field in F.TSA.STATUS to 4, which means
selecting from the LIST FILE.
4. STEP 4: The tSA that executed the select stores all the LIST record IDs,
separated by FM (field markers). All the other tSAs, which did not execute the
.SELECT routine, perform a select on the LIST FILE, extract the LIST record
IDs and store them in a variable. This is done to improve performance. Every
select statement executed is an I/O on the database, so the tSA that executed the
.SELECT routine does not need to execute another select statement to pick up the LIST
records from the LIST FILE.



This slide explains the working of one tSA only.
The tSA now has to extract its portion of the work and start executing the RECORD routine. To
do so, the tSA:
1. STEP 1: Populates a variable with the LIST records it has to process. This variable contains
the LIST record IDs separated by FM (Field Markers). For e.g.: this tSA has to process
LIST record 1.
2. STEP 2: Updates the JOB.PROGRESS field in F.TSA.STATUS to 3. This means that the tSA
has started processing the contracts one by one.
3. STEP 3: Extracts the first LIST record ID from the variable.
4. STEP 4: Reads and locks the corresponding record in the LIST FILE. Remember that each LIST
record contains contract IDs.
5. STEP 5: Starts a transaction block.
6. STEP 6: Extracts the first contract ID and executes the RECORD routine for this particular
contract ID.
7. STEP 7: Checks if all the contract IDs within the particular LIST record have been
processed. If not, it deletes the processed contract ID from the LIST FILE, ends the
transaction block and goes on to pick up the next contract ID within the same LIST record.
However, if the contract just processed was the last contract in the LIST
record, it deletes the LIST record from the LIST FILE directly and then ends the
transaction block. It proceeds to pick up the next LIST record ID to process.
The important thing to note here is that the transaction block is around each contract ID
inside the LIST record separated by FM (field markers). Both the deletion of a contract ID
within the LIST record and the deletion of the LIST record itself happen within the same transaction
block. This implies that even if the tSA dies after it has executed the RECORD routine and
removed the LIST record from the LIST FILE, the changes are not committed until the
transaction block ends.
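
The effect of wrapping each contract in its own transaction block can be shown with a small Python model. This is a sketch, not the real mechanism: the LIST file is a dictionary, the "transaction" is modelled by restoring a snapshot, and `AgentDied` is a hypothetical exception standing in for an agent crash. Only the LIST file changes are modelled here, not the database updates made by the RECORD routine.

```python
import copy

class AgentDied(Exception):
    """Hypothetical signal that the agent failed mid-transaction."""

def process_list_record(list_file, record_id, record_routine):
    """Process every contract ID in one LIST record, one transaction each.

    list_file : dict modelling the LIST file (record ID -> list of contract IDs)
    """
    while record_id in list_file:
        snapshot = copy.deepcopy(list_file)      # start of transaction block
        try:
            contract_id = list_file[record_id][0]
            record_routine(contract_id)          # run the RECORD routine
            remaining = list_file[record_id][1:]
            if remaining:
                list_file[record_id] = remaining  # drop the processed ID
            else:
                del list_file[record_id]          # last ID: drop the record
        except AgentDied:
            list_file.clear()
            list_file.update(snapshot)           # uncommitted changes are lost
            raise
        # falling out of the try block models the end of the transaction
```

If the routine fails on the second of three contracts, the first contract stays removed (its transaction was committed) while the second and third remain in the LIST record for another agent to pick up.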



This slide explains the working of another tSA (tSA 3).
The tSA now has to extract its portion of the work and start executing the RECORD routine. In
this case, the LIST record that the tSA has to process is 2. To do so, the tSA:
1. STEP 1: Populates a variable with the LIST records it has to process. This variable contains
the LIST record IDs separated by FM (Field Markers).
2. STEP 2: Updates the JOB.PROGRESS field in F.TSA.STATUS to 3. This means that the tSA
has started processing the contracts one by one.
3. STEP 3: Extracts the first LIST record ID from the variable.
4. STEP 4: Reads and locks the corresponding record in the LIST FILE. Remember that each LIST
record contains contract IDs.
5. STEP 5: Starts a transaction block.
6. STEP 6: Extracts the first contract ID and executes the RECORD routine for this particular
contract ID.
7. STEP 7: Checks if all the contract IDs within the particular LIST record have been
processed. If not, it deletes the processed contract ID from the LIST FILE, ends the
transaction block and goes on to pick up the next contract ID within the same LIST record.
However, if the contract just processed was the last contract in the LIST
record, it deletes the LIST record from the LIST FILE directly and then ends the
transaction block. It proceeds to pick up the next LIST record ID from ID.LIST.
The important thing to note here is that the transaction block is around each contract ID
inside the LIST record separated by FM (field markers). Both the deletion of a contract ID
within the LIST record and the deletion of the LIST record itself happen within the same transaction
block. This implies that even if the tSA dies after it has executed the RECORD routine and
removed the LIST record from the LIST FILE, the changes are not committed until the
transaction block ends.



In the case of bulking, the concept we saw earlier, each LIST record contains a group
of contract IDs in every row rather than just one. The contract IDs in a group are
separated by VM (Value Marker). In the above screenshot, the bulk count has
been set to 9.

1. When a tSA extracts contract IDs in this case, it extracts all IDs up to the first FM
(Field Marker). In the above example, tSA 3 extracts contract1, contract2 and
contract3 together, i.e. until it encounters the first FM. After starting the transaction
block, it processes all 3 contracts within the same transaction block. The record
routine is executed three times (once for each contract ID) within a loop. Once the loop
is over, the tSA removes the group of contract IDs from the LIST FILE in one go.

Q. How does this improve performance?
A. As you already know, once a contract is processed, it is removed from the LIST
FILE. Every removal of a contract is 1 I/O (input/output) on the disk. So now,
instead of 3 write statements on the disk (1 for each contract), only 1 takes place.
This speeds up the entire processing of COB. Thus bulking helps improve
performance. However, it should be used only when the select
statement is known to retrieve a large number of contract IDs.



Once a tSA has processed all the IDs in its ID.LIST, it returns to read
F.BATCH.STATUS. The following steps take place.
1. Step 1: Start a transaction block.
2. Step 2: Both the tSAs try to read and lock the record in F.BATCH.STATUS with
@ID equal to FLAG.ID. One tSA succeeds. A tSA that finds the
record locked ends its transaction block and exits.
3. Step 3: The tSA that succeeded in reading the record and obtaining the lock
checks if the record contains any value.
4. If yes, it deletes the first line of the record and writes the record back to F.BATCH.STATUS.
This is because the .SELECT routine has to be called again for the next value in
CONTROL.LIST.
5. If no, the tSA updates the content of the F.BATCH.STATUS record with the
string PROCESSED. This tells all the tSAs that all the contract IDs
have been processed and that this Batch Job is complete.
6. End the transaction block.
The following steps clean up F.LOCKING and return control to
S.JOB.RUN, so that the next Batch Job can be picked up for execution.
7. Step 4: Start a transaction block.
8. Step 5: Read and lock the record in F.LOCKING with @ID equal to FLAG.ID. Only 1 tSA
succeeds in doing so.
9. Step 6: Delete the record in F.LOCKING with @ID equal to FLAG.ID.
10. Step 7: Read and lock the record in F.LOCKING with @ID equal to the LIST FILE name. Only
1 tSA succeeds in doing so.
11. Step 8: Delete the record in F.LOCKING with @ID equal to the LIST FILE name.

12. Step 9 : End the transaction block.
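
The end-of-pass logic can be sketched in Python as follows. This is one possible reading of the steps above, stated as an assumption: after the current line is removed, any remaining line means the .SELECT must run again for the next CONTROL.LIST value, and no remaining line means the job is complete. Dictionaries stand in for F.BATCH.STATUS and F.LOCKING, the function name is hypothetical, and locks and transactions are omitted.

```python
FM = chr(254)   # field marker
VM = chr(253)   # value marker

def finish_pass(batch_status, locking, flag_id):
    """Advance to the next CONTROL.LIST value or mark the job complete."""
    record = batch_status.get(flag_id, "")
    lines = record.split(FM) if record else []
    remaining = lines[1:]                    # drop the line just processed
    if remaining:
        batch_status[flag_id] = FM.join(remaining)
        return "SELECT_AGAIN"                # .SELECT runs for the next value

    batch_status[flag_id] = "PROCESSED"      # tells all agents the job is done
    # Clean up F.LOCKING: the FLAG.ID record and the LIST file name record
    list_file = locking.pop(flag_id, None)
    if list_file is not None:
        locking.pop(list_file, None)
    return "PROCESSED"
```

With a two-value control list, the first call returns `"SELECT_AGAIN"` and leaves the second value in the record; the second call marks the job as processed and clears both F.LOCKING entries.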



1. Multiple batch records with the same batch stage
2. Randomize batch records with the same batch stage
3. Never randomize jobs
4. Ideal when multiple single threaded routines need to be executed simultaneously



1. Take a look at this table, which contains a set of records from the BATCH
application. A batch stage has been specified for each of the records. Note that
they are all the same. In this case, when EB.SORT.BATCH is internally
called to sort the records, it picks up all the
BATCH records with the same BATCH.STAGE and randomizes them, while
ensuring that the sequence of jobs within each BATCH record does not
change.
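
The randomization rule (shuffle BATCH records that share a stage, never the jobs inside a record) can be sketched in Python. The function name and the tuple layout are hypothetical, and the sketch assumes the input is already grouped in stage order; it is not the EB.SORT.BATCH implementation.

```python
import random

def order_batches(batch_records, rng=None):
    """Shuffle BATCH records within each stage; job order is untouched.

    batch_records : list of (batch_id, batch_stage, [job names]) tuples,
                    assumed to arrive already sorted by stage
    """
    rng = rng or random.Random()
    by_stage = {}
    for record in batch_records:
        by_stage.setdefault(record[1], []).append(record)

    ordered = []
    for group in by_stage.values():      # stages keep their incoming order
        rng.shuffle(group)               # records within a stage are randomized
        ordered.extend(group)            # each record's job list is left as is
    return ordered
```

Because only whole records are shuffled, the jobs inside each BATCH record always run in their defined sequence, whatever position the record ends up in.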



1. When single threaded routines are executed, as you are aware, they have neither
a .LOAD nor a .SELECT component. They have only one routine, which
contains the entire logic. The following steps happen internally
when a single threaded routine is executed.

   1. BATCH.JOB.CONTROL (BJC) recognizes that it is a single threaded
   routine and hence does not search for a .LOAD routine, so no LOAD
   routine is executed.

   2. BJC running on the agent that holds the lock on F.BATCH.STATUS
   writes the value SingleThreaded to the list file.

   3. The agent that gets the lock on the ID in the LIST file is the one that
   executes the single threaded routine.

   4. The entire single threaded routine is within a transaction block.



1. Assume that tSA 2 is executing a single threaded routine
2. An agent recognizes that it is executing a single threaded routine using the field BATCH.JOB in
the corresponding PGM.FILE record. This field will have the name of the routine to be executed
prefixed with @. Once it does, it displays that it is executing a single threaded job along with the
job name



1. tSA will update JOB.PROGRESS field to 3 in F.TSA.STATUS to denote that it is processing
the job
2. Next, it will read and lock a record in F.BATCH.STATUS with @ID as Process.Name-
Job.Name-Job.Position. This ID is called FLAG.ID
3. It proceeds with list file allocation, following the same logic as for a multi threaded
routine.
4. Then, it checks if the content of the record is Processed, Processing or NULL. NULL
denotes that the job is yet to be executed. Processing denotes that the LIST file is ready for
processing, and Processed denotes that the LIST file has been processed (meaning the job
is done).
5. Since, it is only now that the job is being executed, tSA2 updates JOB.PROGRESS field to
2 in F.TSA.STATUS. This denotes that contracts are to be selected.



1. Then, it starts a transaction block.
2. Since it is a single threaded routine, there is no SELECT routine to be executed either.
Unless there is a LIST file with IDs, the BATCH.JOB.CONTROL framework cannot
execute the job. Hence, BATCH.JOB.CONTROL creates a record with
ID 1 in the LIST file and writes the value SingleThreaded in it.
3. Now, the number of keys to be processed is 1.
4. The next step is to update F.BATCH.STATUS with the keyword PROCESSING to denote that
the LIST file is ready for processing, along with the maximum list of keys to process (which
is 1) and the number of keys to process (which is also 1 in this case).
5. It ends the transaction block.



Now that the LIST file is ready with the string SingleThreaded, it is time to
execute the actual single threaded routine. This is what happens internally.
1. Read and lock the F.BATCH.STATUS record. Whichever agent gets the lock is
the one that updates the JOB.PROGRESS field to 1 in F.TSA.STATUS to denote that
it is processing the contents of the LIST file. Note that the LIST file does not have
any contracts in it; it just has the value SingleThreaded written in it.
2. Update the JOB.PROGRESS field to 7 in F.TSA.STATUS to denote that it is executing
a single threaded routine. The agent also locks the one record in the LIST file, with
ID 1 and content SingleThreaded.
3. Start a transaction block.
4. Execute the routine specified in the field BATCH.JOB in the appropriate PGM.FILE
record.
5. End the transaction block. Once the transaction block ends, it deletes the record
from the LIST file. Now there are no more records to be processed.
6. Update the F.BATCH.STATUS record to PROCESSED to denote that the LIST file
has been processed.
7. Start a transaction block.
8. Delete the records in F.LOCKING with ID equal to FLAG.ID and ID equal to
F.JOB.LIST.<seq.no>.
9. End the transaction block.
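
The single threaded flow reuses the list-driven framework by writing one placeholder record. The Python sketch below illustrates that idea only: dictionaries stand in for the LIST file and F.BATCH.STATUS, the function name is hypothetical, and locks, transactions and JOB.PROGRESS updates are left out.

```python
def run_single_threaded(job_routine, batch_status, list_file, flag_id):
    """Run a single threaded job through the list-driven framework."""
    # The framework needs something in the LIST file, so one marker
    # record with ID 1 is written instead of real contract IDs.
    list_file["1"] = "SingleThreaded"
    batch_status[flag_id] = "PROCESSING"   # one key to process

    # The agent holding the lock on record 1 runs the whole routine once.
    job_routine()

    # After the routine's transaction ends, the marker record is deleted
    # and the job is flagged as complete.
    del list_file["1"]
    batch_status[flag_id] = "PROCESSED"
```

The routine is called exactly once, and the LIST file is empty again when the job is marked as processed.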



It is vital to understand that the tSM is itself a background process like the tSA. The
only difference is that it does not execute COB jobs but monitors tSAs. Therefore,
initiating the tSM is as good as initiating the first tSA. This tSA is like a
master agent and controls all the other agents. At any point in time, there can
be only one tSM running on one T24 server, but there can be as many
tSAs as required on a T24 server. Therefore, the
TSA.WORKLOAD.PROFILE record used by the TSM service should never
contain more than one agent.
1. Execute the command START.TSM with or without the DEBUG option
depending on how you wish to start tSM (Interactive or phantom mode)
2. Once tSM is started, it will check to see which agent is available to perform the
role of a manager. For this, it scans through a file named F.TSA.STATUS to see if
there are any records with STATUS set to STOPPED or DEAD. If you are running
COB for the first time, then, this file will be empty.
3. Depending on which agent is available, it sets the field NEXT.SERVICE in the
F.TSA.STATUS file to TSM for that particular agent.
4. Then, it internally starts the agent allocated to be the tSM (1 in this case)
5. As soon as an agent starts, the first thing that it checks for is to see if it is a tSM or
a normal agent. This, it checks based on the value in NEXT.SERVICE field.



This explains the internal working of an agent that is designated to perform a role other
than that of a manager (tSM)
1. Execute the command tSA followed by the agent number followed by DEBUG
(optional) to start an agent
2. Next, it checks if any service has been assigned to the agent (2 in this case) that
has been started by referring to the F.TSA.STATUS file.
3. This internally executes the tSA program followed by the agent number
4. Checks if it has to perform the role of a tSM. This is the first check that all tSAs
perform when they are started
5. Since the field NEXT.SERVICE for agent 2 is set to COB, it realizes that it has to
execute COB
6. Then, it updates the F.TSA.STATUS file with details of agent 2 that has been
started
7. Next, it internally calls a routine named S.JOB.RUN to trigger off COB processing
8. Next, it invokes a routine named EB.SORT.BATCH which is the one that will sort
all jobs in the BATCH application based on BATCH.STAGE and FREQUENCY.
9. Once done, jobs start getting processed.
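
The first decision every agent makes on start-up can be sketched as a lookup on NEXT.SERVICE. This Python fragment is illustrative only: the dictionary stands in for F.TSA.STATUS, and the function name and return values are hypothetical.

```python
def agent_role(tsa_status, agent_no):
    """Decide what a newly started agent should do.

    tsa_status : dict modelling F.TSA.STATUS (agent number -> record dict)
    """
    record = tsa_status.get(agent_no, {})
    service = record.get("NEXT.SERVICE", "")
    if service == "TSM":
        return "manager"              # this agent acts as the tSM
    if service:
        return "agent:" + service     # e.g. run COB via S.JOB.RUN
    return "idle"                     # no service assigned yet
```

With agent 1 set to TSM and agent 2 set to COB, agent 1 takes the manager role and agent 2 goes on to process COB jobs.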



This illustration will help you to understand how multiple agents help perform a job and
how failure of a single agent does not stop COB
1. Each server can have only 1 tSM. tSM, when started will launch the required
number of agents for that server
2. Note that this tSM and the agents have been launched in Server A. On Server B,
another tSM has been launched, which internally launches the required number of
agents on that server
3. Let us assume that all agents are helping to execute a particular COB job.
Assume that one of the tSAs on Server A has executed the .SELECT routine and
has hence populated the contents of the LIST file
4. As you can see, all agents pick up IDs to be processed from the same LIST file,
which is how multithreading is achieved
5. If you wish, you could add one or more servers and start agents on them;
these agents can also participate and process IDs from the LIST file
6. Even if one of the agents crashes, all other agents are still active and will
continue to pick up data from the LIST file.
7. This process will continue until the LIST file becomes empty.
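
The idea of several agents draining one shared LIST file, with one agent crashing along the way, can be sketched as below. This is a simplified Python simulation, not the T24 implementation: the in-memory list, the lock (standing in for record locking and the transaction boundary) and the crash_agent parameter are all assumptions made for illustration.

```python
import threading


def run_agents(ids, n_agents, crash_agent=None):
    """Drain a shared list of IDs with several worker threads.

    crash_agent is a hypothetical agent number that dies before it commits
    any work, so the ID it was about to take stays in the list for others.
    """
    list_file = list(ids)      # stand-in for the LIST file
    processed = {}             # contract ID -> agent number that processed it
    lock = threading.Lock()    # stand-in for record locking / transaction block

    def agent(number):
        while True:
            with lock:
                if not list_file:
                    return     # LIST file is empty: nothing left to do
                if number == crash_agent:
                    return     # crash before commit: the ID is not removed
                contract_id = list_file.pop(0)
                processed[contract_id] = number

    threads = [threading.Thread(target=agent, args=(n,))
               for n in range(1, n_agents + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


contract_ids = [f"FT{n:03d}" for n in range(10)]
result = run_agents(contract_ids, n_agents=3, crash_agent=2)
```

Even though agent 2 never completes an ID, the remaining agents empty the list, so every ID ends up processed.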



1. How does the tSA tell the tSM that it is alive?

2. From time to time within the flow of BATCH.JOB.CONTROL, we call a routine
named SERVICE.HEARTBEAT

3. This routine takes care of updating the following fields in F.TSA.STATUS every
60 sec
3.1 LAST.CONTACT.TIME
3.2 JOB PROGRESS
3.3 LAST MESSAGE



1. How do we ensure that TSM is alive?

2. On every call to SERVICE.HEARTBEAT, the tSAs (any agent other than the one
allocated for TSM) check if the TSM is alive.

3. In Phantom mode, agents re-launch the manager if TSM has not reported for a
specified period of time. The last contact time of TSM is maintained in
F.LOCKING in a record with key TSM:<Server Name>. If the difference between
the current time and the last contact time recorded for TSM is greater than 120
seconds, the tSA will re-launch the TSM.
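
The liveness check on the manager can be sketched as a simple time comparison. This is an illustrative Python sketch only: the function names are hypothetical, times are plain seconds, and returning False outside phantom mode is an assumption, since the text only describes the phantom-mode behaviour.

```python
TSM_TIMEOUT_SECONDS = 120  # threshold quoted in the text


def tsm_locking_key(server_name):
    """Key of the F.LOCKING record that holds the manager's last contact time."""
    return f"TSM:{server_name}"


def tsm_needs_relaunch(last_contact, now, phantom_mode=True):
    """Return True when an agent should re-launch the manager.

    last_contact and now are times in seconds. The behaviour outside
    phantom mode is not described in the text; False is an assumption.
    """
    if not phantom_mode:
        return False
    return (now - last_contact) > TSM_TIMEOUT_SECONDS
```

For example, a manager last seen 121 seconds ago would be re-launched, while one seen exactly 120 seconds ago would not, because the comparison is "greater than".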



1. Log files get stored under a directory &COMO&
2. LIST &COMO& to get a list of log files
3. JED &COMO& <Log file Name> to view the contents of the log file
4. Log file ID : tSA_<Agent Number>_<Date>_<time>
5. Remember
5.1 If an agent is restarted, there will be 2 log files for that agent, as the log file
ID is based on date and time
5.2 Log files do not get cleared automatically

From time to time within COB processing, mainly within BATCH.JOB.CONTROL, a file
called &COMO& is updated. This file resides in the T24 Home directory
(bnk.run) and contains the log of what every tSA has done. There is one record
per tSA per day in this file. The @ID of the record in this file is tSA_<Agent
Number>_<Date>_<Time>. The date format is YYYYMMDD and the time format
is HH-MM-SS
E.g.: tSA_2_20100324_17-32-10.

We can view the log stored in &COMO& through the JED editor. At the jshell prompt,
type
JED &COMO& tSA_<Agent Number>_<Date>_<Time> to view the log of a
particular tSA.



E.g.: jsh ---> JED &COMO& tSA_2_20100324_17-32-10
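
When many log records accumulate in &COMO&, it can help to split a record ID into its parts. The following Python sketch parses the ID format given above (tSA_<Agent Number>_<Date>_<Time>, with the date as YYYYMMDD and the time as HH-MM-SS); the function name is hypothetical and the sketch does not read the &COMO& file itself.

```python
from datetime import datetime


def parse_como_id(record_id):
    """Split a tSA log record ID into (agent number, start datetime)."""
    prefix, agent, date_part, time_part = record_id.split("_")
    if prefix != "tSA":
        raise ValueError(f"not a tSA log ID: {record_id!r}")
    started = datetime.strptime(f"{date_part} {time_part}", "%Y%m%d %H-%M-%S")
    return int(agent), started


agent_number, started_at = parse_como_id("tSA_2_20100324_17-32-10")
```

For the example ID used in the text, this yields agent 2 and a start time of 24 March 2010, 17:32:10.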



1.What are the different types of errors that could be encountered during execution of
COB?
1.1 Errors that are caused due to jBASE level problems, like error reading a file,
error opening a file etc., are called jBASE level errors.
1.2 Errors that are caused due to O/S level problems, like inappropriate
permissions on a file, memory inadequate, etc., are called O/S level errors.
1.3 Within the individual COB jobs we can raise three types of errors, viz. Non Fatal
Errors, Fatal Errors and Critical Errors. You will look at these 3 types of errors in
the next slides.

2. What happens when a jBASE/OS level error occurs?

The agent executing COB will crash out, either to the jBASE debugger
prompt or to the jsh (jshell) prompt. The error has to be corrected before
restarting COB. The PROCESS.STATUS and JOB.STATUS fields in the BATCH
record will be 1 (meaning running). No record will be created in the applications
EB.EOD.ERROR and EB.EOD.ERROR.DETAIL, because the error was not
raised by the COB job. An error is logged in these applications only when the
error raised can be corrected before starting COB on the next working day, i.e.
the error is not critical.



1. What is a Non Fatal Error?
If an error is encountered during the execution of a job, the individual
COB job/routine calls a core T24 API named FATAL.ERROR. To
this routine we can pass a parameter telling it not to fatal out. This means that the
agent executing COB will not crash out; instead, the agent writes the
error to the applications EB.EOD.ERROR and EB.EOD.ERROR.DETAIL and
then continues executing COB. A field in EB.EOD.ERROR called FIX.REQUIRED
is set to YES, indicating that the error(s) must be fixed before COB is started the
next day. The PROCESS.STATUS and JOB.STATUS fields in the BATCH
record will be 1, signifying that the current job, and therefore the process, is
running.

2. What is a Fatal Error?

If an error is encountered during the execution of a job, the individual
COB job/routine calls a core T24 API named FATAL.ERROR. To
this routine we can pass a parameter telling it to fatal out. This means that the
agent executing COB WILL crash out, writing the error to the applications
EB.EOD.ERROR and EB.EOD.ERROR.DETAIL. However, before crashing out, the
agent removes the contract ID from the LIST FILE. This means
that the agent can be restarted and will start processing from where it last ended,
minus the contract that caused it to fatal out. The PROCESS.STATUS and
JOB.STATUS fields in the corresponding BATCH record remain 1 (which stands
for Running)

3. When a Fatal Error is encountered, the problem needs to be reported to the
Temenos Help Desk to get it fixed



1. When a FATAL.ERROR occurs and you are running TSM in Interactive/Debug
mode, the error is displayed on the screen and written to the COMO
before the agent crashes out and returns to the jshell prompt. The contract that
caused the error is also removed from the LIST FILE. If you are running TSM
in Phantom mode, the agent still crashes out, but the error is not displayed on
the screen. The AGENT.STATUS enquiry will show that the agent has been
stopped, and a new agent will automatically be launched by TSM once the
REVIEW.TIME has been reached

2. The error is written to the EB.EOD.ERROR and EB.EOD.ERROR.DETAIL files.
If there are multiple agents running, they will continue COB from where this agent
left off. In Interactive Mode, you will have to start the agent again manually.



Q. How do you tell T24 that a job is a CRITICAL job?
A. Set the ADDITIONAL.INFO field in PGM.FILE entry for that job to .CRITICAL

1. What happens when a critical job crashes?

The error is treated as a Critical Error.

If an error is encountered during the execution of a critical job, the agent
executing COB crashes out and, in Interactive Mode, returns to the jBASE prompt.
In phantom mode you will be able to see that the agent has stopped
in the AGENT.STATUS enquiry output. Before crashing, the agent DOES NOT write
the error to the applications EB.EOD.ERROR and EB.EOD.ERROR.DETAIL and
DOES NOT remove the contract ID from the LIST FILE. The PROCESS.STATUS
and JOB.STATUS fields in the corresponding BATCH record are updated to 3
(which stands for Error/Hold). This means that if a critical job fails, the cause of
the error needs to be corrected immediately, before COB is restarted.
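
The behaviour of the error types covered so far can be summarised in a small lookup table. The Python sketch below only restates what the notes say; the class and key names are hypothetical, and None marks a point the notes do not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorBehaviour:
    agent_crashes: bool
    logged_in_eb_eod_error: bool
    id_removed_from_list: Optional[bool]  # None = not stated in the notes
    batch_status: int                     # 1 = running, 3 = error/hold


ERROR_BEHAVIOUR = {
    "jbase_or_os": ErrorBehaviour(True, False, None, 1),
    "non_fatal": ErrorBehaviour(False, True, None, 1),
    "fatal": ErrorBehaviour(True, True, True, 1),
    "critical": ErrorBehaviour(True, False, False, 3),
}


def kinds_logged():
    """Error kinds that leave a record in EB.EOD.ERROR and EB.EOD.ERROR.DETAIL."""
    return sorted(kind for kind, behaviour in ERROR_BEHAVIOUR.items()
                  if behaviour.logged_in_eb_eod_error)
```

Reading the table row by row shows why a critical error is the only case that blocks a restart: it is the only one that leaves the BATCH status at 3 and keeps the failing contract ID in the LIST FILE.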



1. Repeated crashes at system level, i.e., crashes in the LOAD and SELECT routines,
now stop the service. Previously, crashes in the LOAD and SELECT routines did not
stop the service: tSM did not detect repeated system-level crashes, and other
agents kept trying to complete the task.

2. Repeated crashes on the same job within a time interval stop the service.
The duration and the number of crashes can be configured in the TSA.PARAMETER
table through the fields STOPPAGE.TIME and STOP.COUNT (the number of times a
crash may happen for a service, i.e. for its agents, within the specified time).
The first time an agent is marked as DEAD, TSM writes a record with key
<Service Name>-STOP in F.LOCKING, holding the stop time and a stop count of one.
On a subsequent crash, the difference between the first stop time and the
current time is checked; if it has exceeded the stoppage time specified in
TSA.PARAMETER, the current time is recorded as the new first stop time.
Otherwise, if the stop count has exceeded the limit specified, the service is
stopped.



1. Date specifies the bank date on which the service was started.
2. Started specifies the date and time at which the service was started, in the
format DD/MM/YYYY HH:MM:SS
3. Stopped specifies the date and time at which the service was stopped, in the
format DD/MM/YYYY HH:MM:SS
4. Elapsed specifies the elapsed time of the service, i.e., the difference between the
start and stop time for the service in seconds.
5. Transactions specifies the number of transactions processed by the service
6. Stoppage Time specifies the time interval over which a service is monitored for
the number of crashes allowed, as specified in the field STOP.COUNT. If no value is
specified here, the value is taken from TSA.PARAMETER. The time given in this
field must be greater than TIME.OUT/DEATH.WATCH.
7. Stop Count specifies the number of crashes allowed for a service in a time period
as specified in the field STOPPAGE.TIME. If no value is given for this field, the
value is taken from TSA.PARAMETER
e.g., STOPPAGE.TIME = 100 STOP.COUNT = 3
Here the Service can only crash 3 times in a time period of 100 seconds. At the
fourth crash within 100 seconds, the service will be stopped.
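
The crash-counting rule behind STOPPAGE.TIME and STOP.COUNT can be sketched as a small counter. This is an illustrative Python sketch, not the tSM code: the class name is hypothetical, times are plain seconds, and resetting the count to one when the window has expired is an assumption (the notes only say that the first stop time is updated).

```python
class CrashMonitor:
    """Tracks crashes of one service inside a sliding STOPPAGE.TIME window."""

    def __init__(self, stoppage_time, stop_count):
        self.stoppage_time = stoppage_time  # window length in seconds
        self.stop_count = stop_count        # crashes allowed inside the window
        self.first_stop_time = None
        self.count = 0

    def record_crash(self, now):
        """Register a crash at time `now`; return True if the service must stop."""
        window_expired = (
            self.first_stop_time is not None
            and now - self.first_stop_time > self.stoppage_time
        )
        if self.first_stop_time is None or window_expired:
            # First crash, or the old window is over: start a new window.
            # Resetting the count to 1 here is an assumption of this sketch.
            self.first_stop_time = now
            self.count = 1
            return False
        self.count += 1
        return self.count > self.stop_count


monitor = CrashMonitor(stoppage_time=100, stop_count=3)
decisions = [monitor.record_crash(t) for t in (0, 10, 20, 30)]
```

With STOPPAGE.TIME = 100 and STOP.COUNT = 3, the first three crashes are tolerated and the fourth crash inside the window stops the service, which matches the example above.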



1. Fatal errors(both Online and during COB) provide more useful information.

2. Setting the field REPORT.CORE.DUMP in SPF to Y will invoke the
JBASECOREDUMP function on fatal error. Call stack & core dump are written
onto a centralized TAFC log with a unique ID.

3. The above changes will be packaged as a part of TEC framework.

4. Key to be transaction reference & unique number.

5. All variables and their values at the time of crash are also stored.

6. The core dump uses jBASE system functions to obtain the required information.



1. What happens when a tSA crashes while processing the LOAD routine?
Ans. Since there are multiple tSAs executing the LOAD routine, if this tSA fails,
another tSA will execute it anyway
2. What happens when a tSA crashes just after executing and populating the LIST
file?
Ans. The LIST FILE to be used is written onto F.LOCKING for a particular job,
therefore even if this tSA crashes, other tSAs will read F.LOCKING and pick up
the LIST FILE to be used
3. What happens if a tSA crashes while processing the SELECT routine?
Ans. The SELECT routine is executed by only one tSA and is wrapped in
jBASE transaction management. So, if the tSA crashes while doing the select, any
partial updates will be rolled back. The other tSA waiting for the lock on
F.BATCH.STATUS to be released will perform the select
4. What happens if a tSA crashes while executing a contract ID in the LIST record?
Ans. The processing of each contract ID in the LIST file is wrapped in a
transaction block, therefore any failure in the tSA will result in a rollback. The
contract ID will not be removed from the LIST record. Any other agent will pick up
that particular LIST record and execute the contract ID(s).



So, since the TSM is still running, it will prompt the user to start the 3 agents required
to run COB.
The user starts the first tSA. It (tSA) calls EB.SORT.BATCH. This in turn reads
F.BATCH and picks up all the processes with PROCESS.STATUS ready (0) or
running(1). So, it leaves out the process BNK/PL.CLOSE.EOD as its
PROCESS.STATUS is already 2 (completed).
It reads the JOB.STATUS of Job7 and finds that it is set to 2. This means that Job7
has been completed successfully.
The tSA then executes the Job8.LOAD routine and tries to read a record in
F.LOCKING with @ID equal to FLAG.ID (Process.Name-Job.Name-Multi value
position of that job within the process, e.g.: BNK/FT.EOD-Job8-2). It finds that
such a record exists, as the previous tSA (that crashed) already wrote this info into
the F.LOCKING file.
The tSA now has the name of the LIST FILE into which the LIST records should
be populated, e.g.: F.JOB.LIST.2
It now tries to read a record in F.BATCH.STATUS with @ID equal to FLAG.ID.
However such a record does not exist as the previous tSA crashed before writing
the information into F.BATCH.STATUS.
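
The lookups in this restart walkthrough can be sketched as follows. This is a Python illustration only: the dictionaries stand in for F.BATCH and F.LOCKING, the function names are hypothetical, and storing the LIST FILE name directly as the F.LOCKING record is an assumption about the record layout.

```python
def flag_id(process_name, job_name, position):
    """Build FLAG.ID: process name, job name and multi-value position."""
    return f"{process_name}-{job_name}-{position}"


def pending_processes(batch):
    """Processes whose PROCESS.STATUS is ready (0) or running (1)."""
    return [name for name, record in batch.items()
            if record["PROCESS.STATUS"] in (0, 1)]


def list_file_for(locking, process_name, job_name, position):
    """LIST FILE name left in F.LOCKING by an earlier agent, or None."""
    return locking.get(flag_id(process_name, job_name, position))


# Hypothetical stand-ins for F.BATCH and F.LOCKING after the crash.
batch = {
    "BNK/PL.CLOSE.EOD": {"PROCESS.STATUS": 2},
    "BNK/FT.EOD": {"PROCESS.STATUS": 1},
}
locking = {"BNK/FT.EOD-Job8-2": "F.JOB.LIST.2"}
```

In this state, only BNK/FT.EOD is picked up again, and the restarted agent finds F.JOB.LIST.2 under the FLAG.ID BNK/FT.EOD-Job8-2.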



1. False
2. False
3. True
4. False
5. True



In this learning unit, you learnt about the internal working of COB

You will now be able to:

1. Understand the internal working of COB
2. Understand the working of tSM & tSA
3. Visualize failure scenarios
4. Understand the types of errors generated during COB
