Checkpoint Restart V2.1

EMEA DEVELOPMENT

STANDARD

CHECKPOINT RESTART

Prepared by: Development Centre

Date: 15th January 1999
Table of Contents
1.0 An overview to checkpoint restart
1.1 Why it is necessary
1.2 How it works
1.3 Database Backout
1.4 ARC
1.5 Standard for Checkpointing and Restart
2.0 Checkpoint/Restart and Program Structure
2.1 Overview
2.2 How to identify the best L.U.W
2.3 When to issue checkpoints
2.4 Where to code checkpoint/restart logic
2.5 How to use the checkpoint area itself
2.6 Initialising items which are stored in the checkpoint area
2.7 Database Repositioning after XRST
3.0 Testing a Checkpoint/Restart Program
3.1 Testing Procedure
3.2 ARC Pacing Class
3.3 Use of ARC Online Panels
4.0 GSAM Files
4.1 Accessing GSAM files
4.2 GSAM variable length record files.
5.0 System Design Considerations
5.1 Drivers
5.2 Reports
5.3 GSAM to GSAM
5.4 VSAM files
5.5 Sorts
6.0 Checkpoint/Restart with Other Products
6.1 Director
6.2 Other Products
7.0 Defunct Checkpoint/Restart Components
7.1 BMP Restart
7.2 CBLTBMC
7.3 CBLTDLX
7.4 SPLTDLI
Appendix A - DB2 Restart Logic
A.1 Simple Key
A.2 Compound Key - Wrong!!
A.3 Compound Key - Right
A.4 Compound Key - Best

1.0 An overview to checkpoint restart

1.1 Why it is necessary

There are two main reasons why all batch programs using databases at Brighton should take checkpoints and why some
need to be 'checkpoint/restartable'.
Firstly, if a job runs for a long time, then it can be a severe waste of expensive machine resources to have to re-run it from
the beginning if, for example, there is a hardware problem. Some batch programs run for many hours: if these abend it
would be extremely inefficient to have to re-run them from scratch; what is needed is the ability to make them 'start where
they left off'.
Secondly, we operate a block-level data sharing environment where the databases can be accessed in parallel both for read
and update access across online and batch systems. The system software which controls access to databases has to ensure
that databases which are being used simultaneously do not get corrupted. Further, even if this is achieved, it also has to
make certain that programs do not get inconsistent views of the databases because, say, some segments have been updated
by a program which is still in the process of updating others.
IMS makes this happen through the use of locks which are managed by the IMS/VS Resource Lock Manager (IRLM). The
IRLM actually supports eight levels of locking, but for simplicity’s sake we can think just in terms of update locks and read
locks.
As an application program updates a DL/I database, all of the segments in that database record (i.e. the root segment and
all of its dependants) are locked up so that no other program can look at them. (The lock may be effected at the higher level
of physical block: not only the database record in question, but any others sharing the same block in the database, would
then be locked up). As the program proceeds, updating other segments in other database records, more and more of the
database is being locked up so that other programs cannot read it. The updates to the database will be happening physically,
but the other programs are barred from looking at the results of those updates.
In the same way, a program which is reading a DL/I database will acquire read locks. These locks will allow other
programs to read the same data but will prevent parallel update access until the lock is released.
The thing that stops this process continuing until the entire database is unavailable to any other program is that the updating
program reaches a 'synchronization point'. This happens when a CHKP call is issued in a batch program.
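The effect of synchronization points on the number of locks held can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: `LockTable` and `peak_locks` are hypothetical names, one lock is taken per database record updated, and every checkpoint request is treated as a synchronization point. It does not model the IRLM itself.

```python
class LockTable:
    """Hypothetical stand-in for a lock manager: tracks locks held by one program."""

    def __init__(self):
        self.held = set()   # keys of database records currently locked
        self.peak = 0       # largest number of locks held at any one time

    def update(self, record_key):
        # Updating a database record acquires a lock that is kept until the next sync point.
        self.held.add(record_key)
        self.peak = max(self.peak, len(self.held))

    def sync_point(self):
        # A synchronization point (CHKP) releases every lock held so far.
        self.held.clear()


def peak_locks(keys, checkpoint_every):
    """Update each key in turn, taking a sync point every `checkpoint_every` updates."""
    table = LockTable()
    for n, key in enumerate(keys, start=1):
        table.update(key)
        if n % checkpoint_every == 0:
            table.sync_point()
    return table.peak


# One sync point at the very end: every record stays locked until the job finishes.
no_checkpoints = peak_locks(range(1000), checkpoint_every=1000)
# A sync point every 10 updates: at most 10 records are locked at once.
frequent_checkpoints = peak_locks(range(1000), checkpoint_every=10)
```

In this model the peak number of locks is bounded by the checkpoint interval rather than by the size of the job, which is the property the rest of this section relies on.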
Similarly, DB2 establishes locks in order to serialise access to data. These locks will be released when the application
issues a COMMIT call. Both IMS and DB2 data may be accessed and updated by the same program. When the program
issues a CHKP call then a DB2 COMMIT will also be invoked. The two-phase commit process ensures that the data base
updates are synchronised between IMS and DB2. This ensures that the integrity of the data is maintained and that the
DB2 updates and the IMS updates in a unit of work are either all committed or all backed out. Should it be necessary to
recover the IMS data bases to a point in time, the DB2 tables will need to be recovered to the same point in time for the
data bases to be consistent.
There are a number of problems associated with a program which accesses large areas of a database between synch points:
• It will slow down or lock out other programs which want to access the same data. This is particularly undesirable if the
other programs are online transactions.
• The more locks a program holds, the more likely it is to encounter ‘deadly embrace’ problems in which two programs
mutually lock each other out, each of them holding locks that the other is waiting for.
• The more locks that the lock managers are having to manage at any one time, the greater will be the system overhead.
This problem can be very significant in the case of DB2 in a parallel sysplex environment, where the presence of large
numbers of unnecessary read locks can cause serious performance degradation to the whole DB2 subsystem.
• There is a particular problem in the situation where one of the subsystems fails. As DB2 restarts it will back out
updates for all tasks which were in-flight at the time of failure, back to their last commit point. If a program has run
for two hours without taking a commit then it will typically take two hours to back out the updates, thus delaying DB2
availability across the entire system. A similar problem will affect IMS as it backs out in-flight updates for BMPs and
online transactions.
For all of these reasons it is necessary to have batch programs which a) issue DL/I CHKP calls at 'frequent' intervals and b)
are capable of restarting processing from the last checkpoint taken. The precise rules for this are given in section 1.5
below.

1.2 How it works

There are two types of checkpointing, basic and symbolic, which cannot be mixed in the same program. A basic
checkpoint call has only one function: it is a synch point, and will cause locks to be released so that other programs can use
the affected segments.
A symbolic checkpoint call also does this. However, it has another function in that the application program can store away
information which will allow it to restart again if it fails. This information will vary between various different types of
programs, but it tends to be things like: the root key of the database record currently being processed, the page number of
the report which is being output, accumulators giving financial totals and item counts of whatever has been processed, etc.
Symbolic checkpointing is the only one which is of relevance in our environment: basic checkpointing is not (and should
not be) used in application programs.
As a side effect of both types of call, the current position on all databases is destroyed and must be re-established, by the
application, before any further processing can be done. The exception to this for DB2 is the use of SQL DECLARE
CURSOR WITH HOLD which maintains position across a COMMIT call.
There are two other IMS facilities which need to be explained before it is possible to show the logic of a typical
checkpoint/restartable program.
Firstly, there is a type of file called a GSAM (Generalised Sequential Access Method) database, which is not really a
database at all! It is basically an ordinary sequential file (BSAM, or VSAM ESDS) which is accessed through DL/I ISRT
(for writing) and GN (for reading) calls.
It might seem rather perverse to give the ability to access sequential files via DL/I calls, but there is a very good reason.
This is that on a restart call (explained below!), GSAM files are positioned to the record being processed at the last
checkpoint. Also, after a checkpoint call, positioning is not lost, as it is on true databases.
If this facility did not exist, it would be hard to restart a program which used sequential files as it would have to position
itself to the right place. This would involve doing things like reading through the input file until the relevant record had
been read, which is all rather messy. The situation is even worse for output files, where the program would have to read
records from the old output file and write them to the new one until correctly positioned.
Secondly, there is a DL/I call XRST (for restart), which must be the very first call which the program issues. This can do
one of two things. Either, it will tell the application that it is not a restart and that processing can continue normally. Or,
alternatively, it will tell the program that it is a restart and that special action is necessary. In this case DL/I will do some
extra things. The information which the program saved at the last checkpoint call before it failed will be returned to the
program. The position is restored for GSAM files, so no application work is needed for them. However, for other databases
position is not restored and it is the application’s responsibility to restore position as necessary.
The general scheme of things for a batch program which is checkpoint/restartable is as follows:

It issues an XRST call to IMS, which tells it whether a restart is taking place or not.
If it is a restart, it then re-opens any DB2 cursors needed and re-positions on the various databases. It then resumes
processing using the data which was stored away in the checkpoint area.
Otherwise, it will open any GSAM files which are to be used and, if any of them are for output, it will issue an initial
CHKP call to IMS.
It then performs the main logic of the program:
    Do the Logical Unit of Work (L.U.W.) - the lowest USEFUL function which it performs in a repetitive way,
    e.g. process an input record.
    Issue a CHKP call.
    Re-position on the various databases (but not on DB2 cursors declared WITH HOLD)
until there is no more work to be done.
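The general scheme above can be sketched as a small simulation. This is an illustration only, not IMS code: `FakeIms`, `run`, and the field names in the checkpoint area are hypothetical, every checkpoint is assumed to be honoured, and database back-out is modelled simply by discarding updates that were never committed by a CHKP.

```python
class FakeIms:
    """Hypothetical in-memory stand-in for IMS checkpoint/restart services."""

    def __init__(self):
        self.saved_area = None   # checkpoint area stored by the last CHKP
        self.database = []       # committed updates only

    def xrst(self):
        # Returns None on a normal start, or a copy of the saved area on a restart.
        return None if self.saved_area is None else dict(self.saved_area)

    def chkp(self, area, pending):
        # Synchronization point: commit pending updates and store the checkpoint area.
        self.database.extend(pending)
        pending.clear()
        self.saved_area = dict(area)


def run(ims, records, fail_at=None):
    """Process one record per L.U.W., issuing a CHKP after each one."""
    area = ims.xrst()
    pending = []                                # updates since the last checkpoint
    if area is None:                            # normal start
        area = {"next_index": 0, "total": 0}
        ims.chkp(area, pending)                 # initial checkpoint
    while area["next_index"] < len(records):
        i = area["next_index"]
        pending.append(records[i] * 2)          # the L.U.W.: one "database update"
        area["total"] += records[i]
        area["next_index"] = i + 1
        if i == fail_at:
            # Abend before the CHKP: the pending update is never committed.
            raise RuntimeError("simulated abend")
        ims.chkp(area, pending)
    return area["total"]


ims = FakeIms()
try:
    run(ims, [1, 2, 3, 4], fail_at=2)           # abends while processing the third record
except RuntimeError:
    pass
total = run(ims, [1, 2, 3, 4])                  # restart resumes from the last checkpoint
```

After the restart, `total` and `ims.database` are the same as they would have been for an uninterrupted run, because the restarted program picks up both its position and its accumulator from the checkpoint area.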

1.3 Database Backout

When a database update program abends it is likely to have performed updates since the last checkpoint was taken. These
updates must be backed out in order for data integrity to be maintained as the program restarts. In the production
environment this occurs as follows:

• DB2 automatically backs out table updates to the last COMMIT point.
• IMS BMP jobs automatically back out DL/I updates to the last checkpoint.
• IMS stand-alone DL/I jobs do not back out automatically. However, production jobs include a job step which performs
this function, using the IMS feature known as DBRC. In fact, DBRC will prevent access to a database which needs
backing-out until this step is performed.

When testing your checkpoint/restart programs, however, you may need to perform a manual batch backout.

1.4 ARC
Another piece of software which is involved in the process of checkpoint restart is ARC (APPLICATION RESTART
CONTROL) from BMC Software. ARC provides a wide range of features but at the time of writing its use is limited to
replacing the functionality of its predecessor product, BMP Restart. Broadly, these functions are:

• To provide a central repository of the checkpoint/restart characteristics of each group of jobs.
• To intercept CHKP calls to ensure that these are passed to IMS at an appropriate rate. Typically checkpoints will only
be allowed to be honoured once a certain time period has elapsed. This function is known as pacing.
• To maintain the restart information for each job, such that jobs can be restarted without manual intervention.
ARC is administered by Information Management and it is their responsibility to ensure that checkpoint pacing is set
correctly. However there are certain features of ARC that can assist you in testing your checkpoint/restartable programs
and these are noted below. For more information please refer to the following manuals on BookManager bookshelf BMC
Software Products for DB2, IMS, CICS, & MVS:
APPLICATION RESTART CONTROL User Guide V 2.1
APPLICATION RESTART CONTROL Reference Manual V 2.1

1.5 Standard for Checkpointing and Restart

The following applies to all programs which are to run in the production environment, whether as a regular scheduled job
or a ‘one-off’:

• Any program which accesses a DB2 or IMS database, whether read or update, must issue regular checkpoint calls.
• Any program which updates a DB2 or IMS database must also be written to be checkpoint restartable.
• Any scheduled program whose longest execution is likely to exceed 20 minutes elapsed time must also be written to
be checkpoint restartable.

Thus short-running read-only IMS or DB2 programs will issue an XRST call and regular CHKP calls, but can use
READ/WRITE instructions for sequential files. They should not use GSAM. Such programs will always run 'from the top',
whether starting normally or restarting after an abend. The XRST call is necessary to keep IMS happy, but the program
should perform the same 'from the top' processing whether the XRST call indicates normal start or restart.

Other IMS or DB2 programs will be either long running, or do updates. These will issue an XRST call and regular CHKP
calls, but in addition will be truly restartable. This means they must use GSAM for sequential files, and will be written
and tested so that they use a checkpoint save area, if necessary, and do any required repositioning following an XRST call
which indicates a restart. Following an abend, they must rerun from the restart point to completion.

Programs which do not access IMS or DB2 databases need not take checkpoints.

An alternative for one-off read-only DL/I programs which do not require absolute integrity of extracted data is to use a
PSB with PROCOPT=GO. The access for these programs will ignore locks and therefore they run the risk of reading
uncommitted data, which will potentially lack referential integrity with other data read during the same run.

2.0 Checkpoint/Restart and Program Structure

2.1 Overview
The following are a set of guidelines for coding checkpoint/restartable programs. They are by no means hard and fast rules
but they have proved themselves in existing programs. Generally, checkpoint/restart CAN be installed in programs almost
invisibly, if a little care is taken. It can also be fairly easy to retrofit it to an existing program, if that was reasonably well
written initially. It can however be a blight if misused, and in programs which have been troublesome, one or more of the
following guidelines has invariably been broken.

2.2 How to identify the best L.U.W

There are two distinct, and counteracting, forces at work here. For conceptual reasons, it is much clearer if the L.U.W. is
quite high (in the hierarchy of control breaks of the program). This tends to reflect the fact that as one moves up the
structure of a program, the units of work are more generalised, less detailed, and are nearer to its overall function. On the
other hand, both for program isolation and checkpoint pacing reasons, there is pressure to make the L.U.W. as low as
possible.
Usually, working out which is the Logical Unit of Work to use for checkpoint/restart purposes is not too hard.
Occasionally, you may stumble across a program which has got several possible candidates. This tends to point towards the
fact that the program has got too much function and that it should be reduced into two, or more, smaller programs. If it is a
new program then this path should be pursued. If it is an older one, into which checkpoint/restart is being installed, you
have my sympathies! - it is likely to cause a good deal of trouble.
In most cases however, there does not tend to be much choice. (Indeed in many instances, there may only be one possible
L.U.W., in which case it doesn't need to be thought about much more!). This tends to boil down to transaction driven
update programs, which can have many input records to apply to a single database record. An update program which
applies unbilled financial transactions to C/M Accounts would be a typical example.
The rule is that where there is a choice of L.U.W. then go for the lowest. The pacing function of ARC applied externally to
the program will then ensure that honoured checkpoints occur at an appropriate rate. Choosing the lower level L.U.W. may
well incur more re-positioning overhead after successful checkpoints. However, transaction driven programs usually have
to find the particular database record they are interested in, so the re-positioning tends to be 'lost' into this.
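To make the relationship between a low-level L.U.W. and external pacing concrete, here is a minimal sketch of a time-based pacing filter. It is an assumption-laden illustration, not ARC's actual logic: `Pacer`, `interval` and the injected `clock` are hypothetical names, and the only rules modelled are that the first request is always honoured and that later requests are honoured once the interval has elapsed.

```python
class Pacer:
    """Hypothetical pacing filter placed between the program and IMS."""

    def __init__(self, interval, clock):
        self.interval = interval      # minimum seconds between honoured checkpoints
        self.clock = clock            # callable returning the current time in seconds
        self.last_honoured = None     # time of the last honoured checkpoint

    def request(self):
        """Return True if this checkpoint request is honoured (passed on to IMS)."""
        now = self.clock()
        if self.last_honoured is None or now - self.last_honoured >= self.interval:
            self.last_honoured = now
            return True
        return False                  # suppressed: the program simply carries on


# The program requests a checkpoint after every L.U.W.; the pacer decides which count.
times = iter([0, 1, 4, 5, 6, 11])     # simulated clock readings, one per request
pacer = Pacer(interval=5, clock=lambda: next(times))
honoured = [pacer.request() for _ in range(6)]
```

Because suppressed requests cost the program almost nothing in this model, choosing the lowest L.U.W. does not by itself increase the number of honoured checkpoints; the pacing interval controls that.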

2.3 When to issue checkpoints

Whether to issue checkpoint calls before or after the L.U.W is really just a matter of style. It can be important under the
wrong circumstances, however. The issue can best be examined by looking at two program fragments:
program 1:
    Initialise.
    XRST call.
    If restart
        reposition on database(s)
    Otherwise
        open files,
        CHKP call.
    read input file.
    While input file non-empty
        main logic,
        CHKP call,
        if checkpoint honoured
            reposition on database(s) (if necessary),
        read next input record.
    Write trailers to o/p files.
    Close files.
    Terminate.

program 2:
    Initialise.
    XRST call.
    If restart
        reposition on database(s)
    Otherwise
        open files.
    While input file non-empty
        read next input record,
        if not End Of File
            CHKP call,
            if checkpoint honoured
                reposition on database(s) (if necessary),
            main logic.
    Write trailers to o/p files.
    Close files.
    Terminate.
Generally speaking, most people intuitively prefer the first solution, probably without being able to explain exactly why.
There are however some who vociferously advocate the second. This causes confusion amongst those trying to design their
program's structure, so the air needs clearing a little. The first method is, in almost every instance, preferable to the second.
There are two basic reasons for this.
Firstly, the first method uses the well-known structured technique: 'read ahead'. It is a very useful tool, which should not be
discarded lightly, unless there are clear advantages in doing so.
Secondly, the first method is much clearer in its approach to how the checkpoint relates to the program logic. That is to say,
the contents of the checkpoint area reflect the L.U.W. which has just been completed. The philosophy is that a checkpoint
acknowledges total completion of a particular L.U.W. On the other hand, in the second method a checkpoint indicates that
the program is about to start processing a particular L.U.W. This does not seem, to most people, a particularly clear way of
expressing what checkpoint/restart is all about.
The above arguments are not meant to show that there is never a case for using the second method. They should
however have convinced you that it shouldn't be used very often. Only if you think that there is a strong case should you
design a program this way and you should clear it with a senior developer in your area first.

2.4 Where to code checkpoint/restart logic
This might sound, on the face of it, somewhat obvious. You have a section, which is performed in the initialisation phase,
which does the necessary work for handling restarts or opening GSAM files. There would also be a section, performed from
the main control loop, which does the necessary work for checkpointing (including re-positioning after honoured
checkpoint calls, if necessary).
It feels like such a natural way to do it, that it is difficult to conceive how else to approach it. However, if
checkpoint/restart is being retrofitted to an existing program, then it is quite possible that this way won't easily fit with the
current structure. If this is the case then you are strongly recommended to change that, rather than use a different way of
implementing checkpoint/restart. In general, where problems have been found in the past with such exercises, it has been
with those programs which are not structured very well. Signs of trouble are things like references to restart flags in
sections which don't have anything to do with it. For example, in a section which reads an input file for the rest of the
program, it shouldn't need to look at a flag to see whether it should read the file or take it from the checkpoint area (after a
restart). Traces of pathological connections between functionally distinct areas of programs are a sure warning that there
could be severe problems ahead.

2.5 How to use the checkpoint area itself

1. One does occasionally stumble across programs in which the entire working storage section of the program has been
turned into a huge checkpoint area. In some respects it does rather neatly solve the problem of what to actually put into
the checkpoint area - it can simply be forgotten about. However, there will be performance degradation if the whole of
working storage is saved with every checkpoint call - so this method must not be used. The ARC feature ‘Automatic
Checkpoints’ also takes this approach and hence is not to be used for the same reason.
2. Do not include the current input record in the checkpoint area. This can cause a problem in the situation where
Production Support need to strip an input record prior to a restart. If the input record has been included in the
checkpoint area then it will be restored on restart and exactly the same problem which caused the first run to fail will
recur, despite the record having been stripped.
3. Do include some information in the checkpoint area that indicates how far processing has got, even if this is not needed
for your restart logic. The availability of such information can occasionally be critical to Production Support resolution
of problems where a restarting program is failing. Examples would be an input record count or the current key being
processed.
4. With only a part of working storage defined as a checkpoint area, there is some disagreement over how the checkpoint
areas are used. There is one school which insists that the checkpoint area should contain copies of the various items
which need to be kept: these are moved in from their 'proper' locations before a checkpoint call and moved out after a
successful restart call. Whilst this does distinguish between the two uses of the items, it does lead to a couple of
problems. The minor problem is that there is a slight overhead caused by moving the items into the checkpoint area:
this isn't at all significant in performance terms, but it is worth considering. The major problem is that it is extremely
easy for the descriptions (PICtures) of the copies in the checkpoint area to get out of step with the real ones (e.g.
careless maintenance). Or alternatively, the relevant item may not actually be stored or retrieved, even though it is
defined correctly. It is quite easy for one item to get forgotten when writing a large list of MOVE statements,
particularly if this is not in the same sequence as the checkpoint area, which quite often happens (unfortunately!). This
has proved to be a real headache with some programs, to the point where it can be very hard to show that they are
working correctly. The other school of thought is that items which need to be saved in the checkpoint area should be
grouped there, and used directly, regardless of their prefixes (WA-, WT-, etc) which normally dictate their ordering.
This does get around the problem rather neatly; I must admit to being of this school myself.
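The second approach in point 4 (grouping the saved items in the checkpoint area and using them directly) can be sketched as follows. The field names and the `save`/`restore` helpers are hypothetical, and a Python dataclass merely stands in for a group item in working storage; the point is that there is no separate list of MOVE statements that could drift out of step with the definition.

```python
from dataclasses import dataclass, asdict


@dataclass
class CheckpointArea:
    """Hypothetical checkpoint area: the program uses these fields directly."""
    records_read: int = 0      # progress indicator, also useful for problem diagnosis
    current_key: str = ""      # key of the database record being processed
    total_amount: int = 0      # running financial total


def save(area):
    # The whole group is saved in one operation, so no field can be forgotten.
    return asdict(area)


def restore(saved):
    # On restart the whole group is rebuilt from what was saved.
    return CheckpointArea(**saved)


area = CheckpointArea()
area.records_read += 3          # the main logic updates the area in place
area.current_key = "K0003"
area.total_amount += 250

saved = save(area)              # what would be passed on the CHKP call
restored = restore(saved)       # what would come back from XRST on a restart
```

Adding a field to `CheckpointArea` automatically includes it in both `save` and `restore`, which is the maintenance benefit argued for above. Note that the current input record is deliberately not part of the area, in line with point 2.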

2.6 Initialising items which are stored in the checkpoint area

If it is at all possible, it is worth trying to do this statically, using VALUE clauses, rather than dynamically, using MOVEs
in the initialisation code. Doing the initialisation dynamically quite often causes two problems.
Firstly, it complicates the restart logic and makes the load module size slightly larger. Also, as a general principle, it is
usually considered healthier to give values to variables statically rather than dynamically (although with large arrays and
tables this can be somewhat difficult).
Secondly, and more seriously, it can encourage people into making silly errors which cause the program to crash under
some circumstances. Take a program which does the following:
1) XRST call,
2) CHKP call,
3) If NOT restart initialise some stored items,
4) main logic, etc.
Because the initialisation is done after the first checkpoint call (which, remember, is always honoured), the items are stored
on the checkpoint database in uninitialised form. After subsequent honoured checkpoints, the items will then have sensible
values in them. If the program attempts a restart after that initial checkpoint it will either fail 0C7 (or some such abend), or
worse still simply not work properly - usually causing mayhem and pandemonium in other innocent and unsuspecting
programs.
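The ordering problem described above can be reproduced in miniature. This is a loose analogy rather than COBOL behaviour: `first_checkpoint_image` and `resume` are hypothetical names, `None` stands in for uninitialised storage, and the Python `TypeError` plays the role of the 0C7 abend.

```python
def first_checkpoint_image(static_init):
    """Return what the initial CHKP stores, for the two initialisation styles."""
    if static_init:
        # Static initialisation (VALUE clauses): items are valid before any call.
        area = {"count": 0, "total": 0}
    else:
        # Dynamic initialisation: items hold garbage until the MOVEs are executed.
        area = {"count": None, "total": None}

    # 1) XRST call (normal start), 2) initial CHKP call stores the area as it stands.
    stored = dict(area)

    if not static_init:
        # 3) Initialisation happens here, too late for the first checkpoint.
        area["count"] = 0
        area["total"] = 0
    return stored


def resume(stored):
    """Restart from the stored area and process one more record."""
    return stored["count"] + 1   # fails if the stored item was never initialised


good = resume(first_checkpoint_image(static_init=True))

try:
    resume(first_checkpoint_image(static_init=False))
    restart_failed = False
except TypeError:
    restart_failed = True
```

With static initialisation the image stored by the first checkpoint is already valid, so a restart from it behaves normally; with initialisation after the first CHKP, a restart from that checkpoint picks up the uninitialised items.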

2.7 Database Repositioning after XRST

Notwithstanding what has been said thus far, you should be aware that IMS will attempt to automatically re-establish
position on DL/I databases during an XRST call for a restart. The consensus of opinion is that it is best not to rely on this,
so your program should explicitly reposition databases based on information saved in the checkpoint area. The
programming and run-time overhead is small and you will know exactly where you stand.

For DB2, Appendix A contains some advice provided by Information Management on how to code DECLARE CURSOR
statements that will work correctly on a restart.

3.0 Testing a Checkpoint/Restart Program

3.1 Testing Procedure

The main body of testing should be exactly as normal - the checkpointing aspects of the program should be tested after the
main test plan has been completed satisfactorily.
The idea with checkpoint restart testing is to set up two series of runs, one of which is a control set. During the other set of
runs, the program is made to fall over at reasonably well-defined points and is then restarted; the output from the combined
runs should match that from the control run. It is necessary to perform this test several times, stopping the program at
various different points during the sequence. The minimum acceptable set of test points is: restarting after the 1st
checkpoint, restarting after the last, and restarting after a failure in the main process loop.
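
The control-run comparison can be sketched as a toy simulation. This is an illustration in Python under simplifying assumptions, not a test harness for real jobs: lists stand in for files, run() is a hypothetical helper, and work done since the last honoured checkpoint is assumed to be backed out completely on an abend.

```python
# Toy model only: lists stand in for files, and run() is a hypothetical helper.
def run(records, chkp_every, fail_at=None, checkpoint=None):
    """Process records; return (last committed state, completed flag)."""
    pos, out = checkpoint if checkpoint else (0, [])
    out = list(out)
    committed = (pos, list(out))
    while pos < len(records):
        if pos % chkp_every == 0:
            committed = (pos, list(out))   # honoured checkpoint
        if fail_at is not None and pos == fail_at:
            return committed, False        # abend: work since the checkpoint is lost
        out.append(records[pos] * 2)       # stands in for the main logic
        pos += 1
    return (pos, out), True

records = list(range(10))
control, _ = run(records, chkp_every=3)    # the control run

# Abend at every possible point, restart, and compare with the control run.
mismatches = []
for fail_at in range(len(records)):
    saved, completed = run(records, chkp_every=3, fail_at=fail_at)
    final, _ = run(records, chkp_every=3, checkpoint=saved)
    if completed or final != control:
        mismatches.append(fail_at)
print(mismatches)
```

An empty list of mismatches means every abend-plus-restart combination reproduced the control output, which is the property the real test runs are trying to establish.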
ARC can help with this process. For example, add the following JCL to the job step to force ARC to abend the job with a
U1771 abend after 100 honoured checkpoints have been taken:
//ARCSYSIN DD *
TRMAFTERCKP=100
Initially tests should be performed in such a way that checkpoint requests are always honoured. Add the following JCL to
the job step to make this happen:
//ARCSYSIN DD *
PACECHKP=N
You should then run the program using high numbers of records (not actual production volumes, but more than normal
program test volumes), with the checkpoint pacing set as per production. This should prove, once and for all, whether the program
has any problems with locking up too much of the enqueue pool.
Finally, two general points on testing. Firstly, if a change needs to be made to a program before it can be restarted then
mostly this can be done. It may sound like a rather obvious point but you can't change the definition of the checkpoint area!
Or at least, not with any expectation of the program actually running for very long after it has been restarted.
Secondly, if you do not abend immediately after a checkpoint then it may be necessary to manually back out DL/1 database
changes that occurred in the period between the last checkpoint call and the abend. The job must be logged (using the
IEFRDER ddname) to a file which is then input to the backout utility. For more details on batch backout JCL, see
BACKOUT on DEVL.STANDARD.JCL. Batch backout is needed if the program contains non-GSAM DLI updates and is
not running as a BMP. Programs using SPLTDLI or CBLTDLX always need to run a batch backout, although as noted below
we should be eliminating the use of these modules.
For programs running as a BMP, or programs that only update GSAM files, no manual backouts are necessary: for BMP
programs, IMS handles the backout, and repositioning occurs automatically on GSAM files.

3.2 ARC Pacing Class

The ARC Pacing class determines the pacing characteristics of the job. The great majority of production jobs are set up to
run with pacing class member PACESK. At the time of writing this is defined to honour a checkpoint every 10 seconds
elapsed and to return a status code of ‘SK’ where checkpoints are not honoured.
On Testbed the default pacing class is currently PACECLAS, although this is under review. For compatibility with
production add the following to your test JCL:
//ARCSYSIN DD *
PACENAME=PACESK

3.3 Use of ARC Online Panels
ARC provides a range of features to allow you to display and control your checkpoint/restart testing. On Testbed enter
‘TSO BMCAES’ on a command line:

Application Enhancement Series V2.1.01

Command ===>

Select an option. Then press Enter.

Application Enhancement Series (AES)

__ 1. AES records
2. BATCH CONTROL FACILITY (BCF)
3. APPLICATION RESTART CONTROL (AR/CTL)

AES Common Options

11. Display, Print, Jobcard, Allocation, and Profile Options
12. Messages
13. Security
14. Product authorization
15. Exit

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

PF 7=UP 8=DOWN 9=SWAP 10=LEFT 11=RIGHT 12=RETRIEVE

Enter option 3:

APPLICATION RESTART CONTROL

Command ===>

Type or verify the ID of the BMC Consolidated Subsystem (BCSS) to use.

Blank the field to request the public BCSS subsystem.
BMC Consolidated Subsystem ID . . . BCSS
Current BCSS subsystem type . . . : PUBLIC VER 1.1
Select an option. Then press Enter.

__ 1. Active jobsteps 7. Reattach options

2. Current shift identifier 8. Dynamic allocation options
3. Checkpoint pacing options 9. Program and data set options
4. Reports 10. Remote VSAM options
5. Processing options 11. AES common utilities
6. Cursor repositioning options 12. Exit

Copyright (C) 1994-1998 BMC Software, Inc. as an unpublished licensed work.

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE
PF 7=UP 8=DOWN 9=SWAP 10=LEFT 11=RIGHT 12=RETRIEVE

Enter option 1:

Select Active Record Type

Command ===>

BCSID : BCSS

Select an option. Then press Enter.

_ 1. Restart control records

2. VSAM recovery records

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

PF 7=UP 8=DOWN 9=SWAP 10=LEFT 11=RIGHT 12=RETRIEVE

Enter option 1:

Limit List of Records

Command ===>

BCSID : BCSS Record type : Restart control

Type information. Then press enter.

JOBNAME . . . . . . Y0HG****
JOBSTEP . . . . . . ********
PROCSTEP . . . . . . ********
PROGNAME . . . . . . ********
PSBNAME . . . . . . ********

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

PF 7=UP 8=DOWN 9=SWAP 10=LEFT 11=RIGHT 12=RETRIEVE

Enter your job details:

List Records
Command ===> Scroll ===> CSR

BCSID : BCSS Record type : Restart control

Note: Intervention required for highlighted jobs. Commands: REFresh

Type one or more action codes. Then press Enter.
D=Delete P=Print S=Display M=Modify C=Display/change restart data set
K=Restart data display O=Set manual restart J=Submit restart

Act JOBNAME JOBSTEP PROCSTEP PROGNAME PSBNAME Run Start Date/Time Status

_ Y0#0OFB LOADTOFF DV1BATCH PBTO00AU FRI 10/16/1998 09:53 ABENDED

_ Y0AATR34 STEP010 STEP040 GM03400 GM03400 WED 06/24/1998 09:30 ABENDED
_ Y0ABTR34 STEP010 STEP040 GM03400 GM03400 WED 06/24/1998 09:33 ABENDED
_ Y0AD47AD NN05000 PS040 NN05000 NNPP5047 THU 05/21/1998 16:38 ABENDED
_ Y0AD47V2 NN05000 PS040 NN05000 NNPP5047 THU 09/03/1998 17:30 ABENDED
_ Y0AD49A9 NN05000 PS040 NN05000 NNPP5049 TUE 07/21/1998 17:07 ABENDED
_ Y0AD53AD NN05000 PS040 NN05000 NNPP5053 WED 10/21/1998 20:12 ABENDED
_ Y0AD54AD NN05000 PS040 NN05000 NNPP5054 WED 08/26/1998 14:56 ABENDED
_ Y0AE1900 PROCSTEP STEP050 ISB1900 ISB1900 FRI 09/18/1998 16:48 ABENDED
_ Y0AKGM34 STEP040 GM03400 GM03400 MON 02/09/1998 12:00 ABENDED
PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE
PF 7=UP 8=DOWN 9=SWAP 10=LEFT 11=RIGHT 12=RETRIEVE

From here you can select options to delete a checkpoint to force a cold start on the next test run or to examine
the contents of the checkpoint data.

4.0 GSAM Files

4.1 Accessing GSAM files

GSAM files must be used for restartable sequential file processing programs.
To access a GSAM file in your program, do the following:
1. Code a DBD for each GSAM file.
DBD NAME=xxxx,ACCESS=(GSAM,BSAM),PASSWD=NO
DATASET DD1=Ixxxx,DD2=Oxxxx,RECFM=FB,RECORD=n,SIZE=0
DBDGEN
FINISH

END
where Ixxxx,Oxxxx are the JCL DD names (input and output)
RECORD=n refers to LRECL. For variable-length files IMS will add two bytes to this value for QSAM
compatibility as per the note below.
SIZE=0 refers to BLKSIZE. SIZE must be specified as 0 in the DBD to stop DLI calculating it at DBDGEN time.
2. Make an entry in the program's PSB for the GSAM dataset.
PCB TYPE=GSAM,DBDNAME=xxxx,PROCOPT=LS
Always use PROCOPT=LS or GS.
3. Code a PCB mask.
01 DLI-xxxx-PCB.
03 DLI-xxxx-DBD-NAME PIC X(8).
03 DLI-xxxx-SEG-LEVEL PIC XX.
03 DLI-xxxx-STATUS-CODE PIC XX.
03 DLI-xxxx-PROC-OPT-CODE PIC X(4).
03 DLI-xxxx-RESRVED PIC S9(9) COMP.
03 DLI-xxxx-SEG-NAME PIC X(8).
03 DLI-xxxx-KFB-LENGTH-CNT PIC S9(9) COMP.
03 DLI-xxxx-SENS-SEG-CNT PIC S9(9) COMP.
03 DLI-xxxx-KFB-AREA PIC X(8).
03 DLI-xxxx-UNDEF-LENGTH PIC S9(9) COMP.
4. Use the GN call to read GSAM records sequentially. Use the ISRT call to write new records to the end of the file.
There are several things to note with using GSAM files.
1. It is not necessary to explicitly open or close the file using OPEN and CLSE calls. GSAM implicitly performs these
functions. The first call to a GSAM file will open it and, if opened, the file will be closed at the end of the program.
Implicit closing also occurs if, on reading sequentially, end-of-file is reached (status GB). Be aware that the next GN
call following this will cause the file to be re-opened, and the first record to be retrieved again.
2. GSAM datasets must be pre-allocated. This avoids the need for 2 sets of JCL - one for normal processing which creates
the dataset and one for restart processing which adds to an existing dataset. Furthermore, they must be pre-allocated
using HUT21, rather than IEFBR14, to avoid creating empty files without end-of-file markers.
If the DDNAME of the file in the pre-allocation step starts 'DD', HUT21 will open and close the file after allocating it.
This ensures that the file is empty. So all files in the pre-allocation step must be named 'DD......' e.g. DD1, DD2, etc.
3. You do not need to re-establish position on GSAM files after returning from the checkpoint or after restarting a
program - unlike for all other IMS databases. After checkpointing or restarting, repositioning will occur automatically.
However, be sure to code a DISP of SHR on all your GSAM files - or else unpredictable results may occur on restart.

4.2 GSAM variable length record files.

Define your DBD with RECFM=VB:

DBD NAME=GSIJ,ACCESS=(GSAM,BSAM),PASSWD=NO
DATASET DD1=IGSIJ,DD2=OGSIJ,RECFM=VB,RECORD=352,SIZE=0
Set RECORD=max record/segment size, including 2 byte length field.
In your program (example taken from GS26000):
01 WP-GSAM-AREAS.
03 WP-GSIJ-REC.
05 WP-GSIJ-LEN PIC S9(4) COMP.
05 WP-GSIJ-DATA PIC X(400).

01 GSIJA-SE-FIN-DTL-HDR-REC.
03 GSIJA-FINE-REC-CD PIC XX.
.
01 GSIJB-SE-FIN-ACCT-TRANS-REC.
03 GSIJB-FINE-REC-CD PIC XX.
.
01 GSIJC-SE-FIN-DTL-SUBM-REC.
03 GSIJC-FINE-REC-CD PIC XX.
.
MOVE +352 TO WP-GSIJ-LEN.
MOVE GSIJB-SE-FIN-ACCT-TRANS-REC TO WP-GSIJ-DATA.
CALL 'CBLTDLI' USING DLI-ISRT,
WL-GSIJ-PCB,
WP-GSIJ-REC.
Some points on GSAM vs QSAM variable length record considerations.
Note that the GSIJA- record layout generated from Data Dictionary does not mention the length field. This is so that the
layout can be used whether the file is being accessed via GSAM or QSAMIORT. Explanation follows:-
GSAM uses a 2 byte segment length field. This follows standard DL/1 variable length segment conventions, i.e. 2 byte
S9(4) COMP. The length value includes the 2 bytes for the length field. For an example, see WP-GSIJ-REC above.

QSAM uses a 4 byte Record Descriptor Word. This is made up of a 2 byte S9(4) COMP record length and a 2 byte
FILLER. See QSAMIORT's description in the Utilities section of Standards for examples. The QSAM record length
includes 2 bytes for the length field and 2 bytes for the filler.
A common scenario is a VB dataset output as GSAM in a checkpoint restartable program, and read via QSAMIORT in
another program. Have we a compatibility problem between GSAM and QSAMIORT record prefixes? The answer is no -
because DL/1 outputs standard QSAM VB records; i.e. DL/1 inserts the FILLER and adds 2 to the segment length before
writing the record. Thus the GSAM output program may set record length to 352, but both physically, and to QSAMIORT,
the record length will be 354.
If a GSAM or QSAM program is coded like GS26000 (above), then data is moved from an unformatted WORKING
STORAGE IO-area to the formatted record layout. This means both the GSAM and the QSAM program can use the same
Data Dictionary-generated layout. The different length record prefixes are defined in the unformatted IO-areas.
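
The relationship between the two prefixes can be checked with a little arithmetic. The sketch below uses Python purely for illustration; the function names are hypothetical, and it models only the byte layout described above (a 2 byte length for GSAM, a 2 byte length plus 2 byte filler for QSAM), not anything DL/1 actually executes.

```python
import struct

def gsam_segment(data: bytes) -> bytes:
    """Build the GSAM IO-area: 2 byte length (which counts itself) + data."""
    return struct.pack(">h", len(data) + 2) + data

def qsam_record_from_gsam(segment: bytes) -> bytes:
    """Model what is physically written: length + 2, a 2 byte filler, then data."""
    (seg_len,) = struct.unpack(">h", segment[:2])
    data = segment[2:seg_len]
    return struct.pack(">hh", seg_len + 2, 0) + data

segment = gsam_segment(b"X" * 350)          # program sets the length to 352
record = qsam_record_from_gsam(segment)     # QSAMIORT sees a 354 byte record

print(struct.unpack(">h", segment[:2])[0])  # length as set by the GSAM program
print(struct.unpack(">hh", record[:4]))     # length and filler as seen by QSAM
print(len(record))
```

The data portion is identical in both forms; only the prefix differs, which is why the same Data Dictionary layout can serve both programs.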

5.0 System Design Considerations
All too often, Checkpoint/Restart capability is added as an afterthought, either during program design or, worse
still, after program build has been completed. Many of the problems that occur with checkpoint restart could be
avoided if sufficient thought was given to these matters at the outset. In other words, one of the key elements of
the Systems Design process should be to think through the restart implications of a potential failure at every single
point in a batch process flow. Here are some guidelines to help with this.

5.1 Drivers
All batch processes will be driven in some way. The driver can either be a transaction file (‘hit file’) or a
sequential scan of a database. Where you have a choice then use a hit file whenever possible rather than a
database scan. This means that you don't need special routines for re-positioning the true databases. There may
also be benefits in performance, testing and production support.

You should also avoid having more than one driver (whether the driver is a database or a hit file) in any one
program. The merge-update logic does not fit in at all well with checkpoint-restart. Instead, generate a single
sequential hit file using conventional merge logic and then use this to update the database in a subsequent step.

5.2 Reports
Avoid the production of reports in checkpoint/restartable programs as it can be difficult to work out how to
restart these. If at all possible, write the details required by the report to a GSAM file and then have a small
program to print the report in a separate step. The side-benefit here is that we may be able to find a better way
to get the data to the customer than a mainframe report, such as transmitting the data to a PC file. We could do
this without touching the database program.

5.3 GSAM to GSAM

Avoid the situation where an output GSAM file from one step feeds directly to an input GSAM file in another.
The reason for this is that as the first program takes checkpoints, the buffers for its GSAM files are flushed
resulting in the file containing a number of short blocks. Now consider the situation where the second program
has to be restarted. In the normal course of events this will not cause a problem and the second program will re-
position on its input GSAM files correctly. However if there has been intervening production support activity
(e.g. stripping a record) then the file will get re-blocked and the GSAM positioning for the second program will
now be incorrect. To avoid this problem change the job flow to insert an IEBGENER step between the two
programs to re-block the data.

5.4 VSAM files

The general advice is don't use VSAM files for checkpoint/restart programs unless you absolutely have to!
Certainly for newly designed processes DB2 tables should be used in preference to VSAM.
Unfortunately, the facilities provided for checkpointing updates to VSAM files do not mesh in with those
provided for IMS databases. For example, if a program fell over at a certain point, the updates to the database
would be backed out to a particular point (i.e. the last checkpoint). However, updates to a VSAM file would not
be consistent with this: they could be a few updates ahead of this point or (because unwritten buffers are purged) a few
behind it. Generally, the problem only exists with KSDSs. GSAM files can be used in place of ESDSs; whilst
RRDSs are so rare (at BROC) that they have not yet been used in a checkpoint/restartable program.
There are three ways out of this particular problem, each with its own advantages and disadvantages.
Firstly, the KSDS can be defined as a SHISAM database. This particular database organisation is just a KSDS
with IMS wrappings. It only allows root segments, which will simply be the records in the KSDS, as it is viewed
elsewhere. IMS looks after data integrity, so that the updates will always be synchronised with those to the
'proper' databases.
Secondly, by brute force it is possible to carve a way through the difficulty. The problem only exists if the file
has to be updated across checkpoints (or, by definition, across L.U.W.s). (Incidentally, if the job does not update the file then
you don't have a problem!) Synchronisation, of a sort, can be ensured by closing and re-opening
the file after every honoured checkpoint call. This would guarantee that no updates before the checkpoint were
lost (because doing this forces the unwritten VSAM buffers to be written back to the file). However, it has two obvious
disadvantages. It incurs large amounts of system overhead in the opening and closing: this would only be
acceptable in performance terms if it did not happen very often, say fewer than a hundred times in a particular
run. Also, it does not mean that updates after a successful checkpoint will be backed out if the program
subsequently abends. Accordingly, the application program will have to cater for this, e.g. by doing a
REWRITE if a WRITE fails because the record was already present.
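
The REWRITE-on-duplicate idea can be sketched as follows. This is an illustration in Python, not COBOL file handling: the KeyedFile class is a hypothetical stand-in for a KSDS, and DuplicateKey stands in for the duplicate-key condition a real WRITE would return.

```python
# Hypothetical stand-in for a KSDS: a keyed store whose write() rejects duplicates.
class DuplicateKey(Exception):
    pass

class KeyedFile:
    def __init__(self):
        self.records = {}

    def write(self, key, value):
        if key in self.records:
            raise DuplicateKey(key)      # record already present
        self.records[key] = value

    def rewrite(self, key, value):
        if key not in self.records:
            raise KeyError(key)          # nothing to replace
        self.records[key] = value

def write_or_rewrite(f, key, value):
    """WRITE the record; if it is already there, REWRITE it instead."""
    try:
        f.write(key, value)
    except DuplicateKey:
        f.rewrite(key, value)

ksds = KeyedFile()
write_or_rewrite(ksds, "A001", "first attempt")   # original run, then abend
write_or_rewrite(ksds, "A001", "first attempt")   # same update repeated on restart
print(ksds.records)
```

Because repeating the update leaves the file in the same state, a restart that replays updates made after the last checkpoint does no harm.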
Thirdly, if the access pattern of the program to the KSDS is amenable, the problem can simply evaporate. This
would happen if the program reads the KSDS at the beginning of a run and updates it again at the end. This is
quite frequently the case with control or totals files which are also quite often VSAM. In this case, because
updates to the file only happen at one time (in the program's termination phase), they do not cross L.U.W.s and
so cannot get out of synch with the database updates.

5.5 Sorts
It is an unfortunate fact that sorts and checkpoint/restart don't mix well. It is notionally possible to get around
the incompatibilities, but it is almost never worth it. It complicates the logic of the program to the point where it
is incredibly tortuous. It is still not possible to checkpoint the program during the sort input/output phases, and it
doesn't carry any advantage in saved I/O operations.
This point does sometimes escape the notice of systems designers. It is not unheard of to find a spec for a
checkpoint/restartable program, which is also supposed to do some kind of internal sort. The solution is always
to split the program into 2 (or occasionally more) smaller ones. Usually, one of them will be a stand-alone (JCL)
sort to do the sorting for the checkpoint/restartable one.

6.0 Checkpoint/Restart with Other Products

6.1 Director

6.1.1 Introduction.

DIRTOBMC is a user exit module that allows DIRECTOR programs to use the ARC checkpoint/restart
software. This means that DIRECTOR can be used instead of COBOL when there is a requirement to write a
'one-off' checkpoint/restartable program.
Using the DIRECTOR language, instead of COBOL, to code and test a checkpoint/restartable program can
considerably reduce the amount of programming effort required. However DIRECTOR should only be used for
production support and systems development tasks and not for regular scheduled production applications.
You will need to have a reasonable understanding of the DIRECTOR language and of checkpoint/restart programming
techniques in order to use DIRTOBMC successfully. Examples of checkpoint/restartable DIRECTOR programs
can be found in DEVL.STANDARD.JCL members DIRBMC1 and DIRBMC2.

6.1.2 How to use the checkpoint module.

To use DIRTOBMC the following fields should be defined at the beginning of your DIRECTOR program.
AREAFUNC LEN=4 4 byte function code
AREAPCB LEN=40 IO PCB mask return area
AREASAVE LEN=100 Checkpoint save area (any length)
AREAFUNC can be set to XRST or CHKP. The XRST call is used for detecting a normal start or a restart and
the CHKP call is used for issuing checkpoints. Positions 11-12 of AREAPCB contain the status-code returned
after making a call to DIRTOBMC.
The checkpoint save/return area AREASAVE is for storing data that is required after a restart situation.

6.1.3 The XRST call.

The XRST call should be made only once, at the beginning of the DIRECTOR program before any processing
has taken place, and should therefore be coded in the outermost DO/ENDDO process.
An example XRST follows:
AREAFUNC POS=1,C'XRST'
UEXIT DIRTOBMC(AREAFUNC,AREAPCB,AREASAVE)
| | |
| | ---> CHECKPOINT RETURN AREA
| ---> IO PCB MASK RETURN AREA
---> FUNCTION
IF (AREAPCB(11),EQ,C'RS')
perform restart processing
ENDIF
The status-code from the XRST call is passed back to your DIRECTOR program in bytes 11-12 of AREAPCB.
A status-code of spaces indicates a normal start and a status-code of 'RS' indicates a checkpoint restart.
If an 'RS' status-code is detected then any data saved by a previously issued CHKP call will be returned in
AREASAVE. It is then the responsibility of the DIRECTOR program to perform any database repositioning
that may be required to successfully restart the program.
Repositioning can be achieved by using the data previously saved in AREASAVE by the last real CHKP call. It
is up to you to save any data that may be required for database repositioning. Note that GSAM files are
automatically repositioned for you.

6.1.4 The CHKP call.
To make a checkpoint call set the function area AREAFUNC to CHKP and save any data you may require for a
restart in AREASAVE (a root segment key value for example). An example CHKP call follows:
AREAFUNC POS=1,C'CHKP' * set function
AREASAVE POS=1,KFBA(1,11) * save any restart data

UEXIT DIRTOBMC(AREAFUNC,AREAPCB,AREASAVE)
| | |
| | ---> CHECKPOINT SAVE AREA
| ---> IO PCB MASK
---> FUNCTION
Note that checkpoint ids do not need to be specified as they are generated by DIRTOBMC for you.
It is best to place your CHKP call in a position which will make restarting most easy. The following rules for
placing the CHKP call should serve most programs:
1. If your program is driven by an input GSAM file then place the CHKP call just before the GSAM read
statement (the GN call).
2. If your program is database driven (i.e. you are sequentially processing a database) then place the CHKP
call just before reading a root segment.
Actual checkpoint frequency will be controlled by ARC. The status-code from the CHKP call is passed back to
your DIRECTOR program in bytes 11-12 of AREAPCB. If using pacing class PACESK then a status-code of
'SK' indicates that checkpoint processing has been skipped and processing can continue as usual. A status code
of spaces means that a checkpoint has been taken and the DIRECTOR program must therefore reposition on any
databases it is sequentially processing.

6.1.5 GSAM files.

Sequential files that require automatic repositioning after a restart situation must be GSAM and therefore
declared in the PSB you are using. Files that do not require repositioning can be processed using the
DIRECTOR READ and WRITE statements.
A set of ten GSAM DBDs is available for general use, located in library DEVT.TESTBED.DBDLIB. They are
named DIR1 to DIR10 and can be used to access any sequential datasets required.
An example GSAM file read follows:
DO DBD=DIR1
GN
ENDDO

IF STATUS#
DOEXIT
ENDIF
A status-code of 'GB' is returned when end of file is detected.
The format of a GSAM file write is as follows:
DO DBD=DIR2
ISRT
ENDDO

IF STATUS#
DOEXIT
ENDIF

6.1.6 Testing.
The restart code of a checkpoint/restartable DIRECTOR program must be thoroughly tested before it is allowed
to run in the production environment. DEVL.STANDARD.JCL members DIRBMC1 and DIRBMC2 provide
examples of the JCL required. The following points may also be helpful.
• An old style checkpoint database is not required; however, your PSB must specify COMPAT=YES on the
PSBGEN statement.
• Any GSAM files must be pre-allocated prior to running the DIRECTOR step for a normal start.
• Load library DEVT.TESTBED.LOADLIB which contains DIRTOBMC must be placed on the STEPLIB
statement.
• The IEFRDER IMS logging DD statement should be present in your run JCL. This file is required for
running a database backout.
In order to test that restart is working correctly:
1. Place displays in your DIRECTOR program (using PRINT) just after the CHKP call to show which
database record has just been processed.
2. Cause an abend to occur in your DIRECTOR program when a record has been read. You can call module
ZCDB029 to cause an abend (see DEVL.STANDARD.JCL(DIRBMC1) for example code).
3. Run a database backout job after the abend. See member DEVL.STANDARD.JCL(DICHKPBO) as an
example.
4. Re-run your DIRECTOR program and display which key is returned in AREASAVE to prove that you are
restarting in the correct place.

6.2 Other Products

Any product which supports a call interface with standard IBM linkage can in theory be made
checkpoint/restartable using ARC Common Calls. Module ASMTARC can be called to request XRST and
CHKP functions which are entirely analogous to the standard IMS offerings. You will need to STEPLIB to
library IMSVS.BMC.AESLOAD to pick up this module. Details can be found in the ARC User Guide.
A simple test with SELCOPY has been run to establish that this works. Please feed back any further product
experience to the Development Centre.

7.0 Defunct Checkpoint/Restart Components

7.1 BMP Restart

BMP Restart is the predecessor product to ARC. At the time of writing production jobs are being converted to
run with ARC. By default all Testbed jobs will run with ARC but if ARC is suppressed (by including
//AES£EXCL DD DUMMY in the job step JCL) then BMP Restart will operate. The detailed standards for use
of BMP Restart are available from the Development Centre, should you happen to need them during this
transition period.

7.2 CBLTBMC
This module was developed in response to a bug with GSAM processing in the original version of BMP Restart.
It is called instead of the standard module CBLTDLI. Where calls to CBLTBMC are encountered in a program
which is undergoing maintenance then you should perform a straight replacement to equivalent calls to
CBLTDLI.

7.3 CBLTDLX
CBLTDLX was the in-house written checkpoint/restart method in use prior to BMP Restart. This module was
called instead of CBLTDLI and checkpoint information was maintained on DL/1 databases. Where CBLTDLX
is encountered in a program which is undergoing maintenance then it should be removed and replaced with
standard calls to CBLTDLI. Please refer to section Converting COBOL To DLX Modules in the Utilities
Replacement Guide for details of how to achieve this.

7.4 SPLTDLI
This is a version of the database split utility which calls CBLTDLX rather than CBLTDLI. Where encountered
you can make a straight replacement with the standard module SPLTDLZ. Better still, there is another
technique for accessing split databases that is now available which avoids the need for these specialised
modules. Please contact the Development Centre for details.

Appendix A - DB2 Restart Logic

A.1 Simple Key

DB2 DATA
Col_A (key) other
37421 … … …
37422 … … …
37423 … … …
37424 … … …
37425 … … …
37426 … … …

SQL CODE
DECLARE crs1 CURSOR WITH HOLD FOR
SELECT …
FROM tables
WHERE …
AND Col_A > :ws-a
ORDER BY Col_A

(ws-a = key value saved in checkpoint area)

When the key of a DB2 table consists of just one column the restart logic is very simple to code and is intuitive.

The only thing to bear in mind with any cursor that you want to stay open across logical units of work is that
it must have an ORDER BY clause. It should also be defined WITH HOLD so that the program doesn’t have to
close and re-open the cursor after every checkpoint, which is a significant overhead.

A.2 Compound Key - Wrong!!

DB2 DATA
Col_A Col_B
37421 00
37421 01
37421 02
37422 00
37422 01
37422 02

SQL CODE
SELECT …
FROM tables
WHERE …
AND Col_A > :ws-a
AND Col_B > :ws-b
ORDER BY Col_A, COL_B

A common mistake in re-positioning logic comes when we have to deal with multi-column keys. The most
common error is to code the cursor as above.

The problem comes to light if the program abends after processing, say, the second of the rows in our sample
table. In this case, Col_A = 37421 and Col_B = 01.

When the program restarts, it would appear to pick up from where it left off and complete with a return code of 0.
However, the third, fourth and fifth rows in our table would not have been processed by either the first run or the
restart: the third fails the Col_A test, and the fourth and fifth fail the Col_B test. DB2 can give no indication that
these rows have been missed as it has returned all the rows the program requested.

A.3 Compound Key - Right

DB2 DATA
Col_A Col_B
37421 00
37421 01
37421 02
37422 00
37422 01

SQL CODE
SELECT …
FROM tables
WHERE …
AND ((Col_A = :ws-a
AND Col_B > :ws-b)
OR Col_A > :ws-a)
ORDER BY Col_A, COL_B

This is the correct restart logic to use.

It says that we want to process all rows which have a Col_A value greater than the abend value and, in addition,
we want to process all rows that have the same Col_A value and a larger Col_B value.

However, even this sometimes causes problems from a performance perspective. DB2’s optimizer, which
decides what is the best access path to take, doesn’t realise that the predicates referring to Col_A on either side
of the OR are set to the same value. As a result, DB2 sometimes chooses an inefficient route to the data.
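
The difference between the wrong and the right predicate can be demonstrated on the six sample rows. The sketch below uses Python with an in-memory SQLite table purely as a stand-in for DB2; the table name t, the helper rows_after_restart and the use of integer Col_B values are assumptions made for the illustration.

```python
import sqlite3

# Sample rows (Col_A, Col_B); SQLite stands in for DB2 in this sketch.
ROWS = [(37421, 0), (37421, 1), (37421, 2),
        (37422, 0), (37422, 1), (37422, 2)]

WRONG = "Col_A > ? AND Col_B > ?"
RIGHT = "((Col_A = ? AND Col_B > ?) OR Col_A > ?)"

def rows_after_restart(predicate, params):
    """Return the rows a restarted cursor would fetch with this predicate."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (Col_A INTEGER, Col_B INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    sql = f"SELECT Col_A, Col_B FROM t WHERE {predicate} ORDER BY Col_A, Col_B"
    result = con.execute(sql, params).fetchall()
    con.close()
    return result

# Key saved at the last checkpoint: the second row had just been processed.
ws_a, ws_b = 37421, 1
wrong_rows = rows_after_restart(WRONG, (ws_a, ws_b))
right_rows = rows_after_restart(RIGHT, (ws_a, ws_b, ws_a))
print(wrong_rows)   # [(37422, 2)]
print(right_rows)   # [(37421, 2), (37422, 0), (37422, 1), (37422, 2)]
```

The wrong predicate silently returns only the last row, whereas the right one returns every row after the saved key.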

A.4 Compound Key - Best

DB2 DATA
Col_A Col_B
37421 00
37421 01
37421 02
37422 00
37422 01

SQL CODE
SELECT …
FROM tables
WHERE …
AND Col_A >= :ws-a
AND ((Col_A = :ws-a
AND Col_B > :ws-b)
OR Col_A > :ws-a)
ORDER BY Col_A, COL_B

In most instances this is the best logic to use.

By adding an additional predicate into the restart logic we are explicitly telling DB2 that every row on the
restart will be greater than, or equal to, a particular value. This is usually enough to have the optimizer choose
the best access path.
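
The additional predicate changes only the access path, not the result. A quick way to convince yourself is to evaluate both forms for every possible restart position; the sketch below does this in plain Python, with the function names right and best chosen for the illustration.

```python
# Sample rows (Col_A, Col_B) in key order.
ROWS = [(37421, 0), (37421, 1), (37421, 2),
        (37422, 0), (37422, 1), (37422, 2)]

def right(row, ws_a, ws_b):
    """Restart predicate without the extra range check."""
    col_a, col_b = row
    return (col_a == ws_a and col_b > ws_b) or col_a > ws_a

def best(row, ws_a, ws_b):
    """Same predicate with the redundant Col_A >= :ws-a added in front."""
    col_a, col_b = row
    return col_a >= ws_a and ((col_a == ws_a and col_b > ws_b) or col_a > ws_a)

# Try every row as the last one processed before the abend.
results = []
for i, (ws_a, ws_b) in enumerate(ROWS):
    selected_right = [r for r in ROWS if right(r, ws_a, ws_b)]
    selected_best = [r for r in ROWS if best(r, ws_a, ws_b)]
    results.append(selected_right == selected_best == ROWS[i + 1:])
print(all(results))
```

Both forms select exactly the rows that follow the saved key, so the extra predicate is logically redundant and exists only as a hint to the optimizer.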

NB: There are some circumstances when the best access path for a restart of a program is different from the
best for a normal run. This is because when a program restarts it will typically process fewer rows, and DB2
will always estimate the number of rows that will be returned. It then uses that value as a factor to determine the
access path to take. As a result, if the query is a join DB2 may choose a less than ideal method of joining the
tables (it has three to choose from), or even for simple queries on a single table DB2 may choose to use an index
when a table scan is best.

Therefore, when coding restart logic for long running programs on the critical path, or if you have concerns
about performance then contact your project team’s DBA. If your project team has not been allocated one then
contact Richard Livett, X4211.

No ratings yet
JS Strings, Validation + Functions, RegExp, Modal, Lists
25 pages
Dell Embedded Box PC 3000 and 5000 Series
No ratings yet
Dell Embedded Box PC 3000 and 5000 Series
3 pages
On Online Shopping
No ratings yet
On Online Shopping
29 pages
Com - Bat.loader Logcat
No ratings yet
Com - Bat.loader Logcat
14 pages
Red Hat Consulting: Strategic Migration Planning Guide
No ratings yet
Red Hat Consulting: Strategic Migration Planning Guide
48 pages
Text Streams and Filters
No ratings yet
Text Streams and Filters
16 pages
Department of Computer Science and Engineering
No ratings yet
Department of Computer Science and Engineering
11 pages
AZ 204 Demo
No ratings yet
AZ 204 Demo
19 pages
SQL Quiz-Y
No ratings yet
SQL Quiz-Y
75 pages
DSRC User Guide
No ratings yet
DSRC User Guide
21 pages
3 - Moats - Switching Cost
No ratings yet
3 - Moats - Switching Cost
10 pages
Connecting Python Application With Mysql
No ratings yet
Connecting Python Application With Mysql
4 pages
File Management
No ratings yet
File Management
5 pages
Registration Form: Adult Intensive Program Young Adult Intensive Program (Ages 14-18)
No ratings yet
Registration Form: Adult Intensive Program Young Adult Intensive Program (Ages 14-18)
1 page
IBM InfoSphere Replication Server and Data Event Publisher
From Everand
IBM InfoSphere Replication Server and Data Event Publisher
Pav Kumar-Chatterjee
No ratings yet
PLI Basic Training Using VSAM, IMS and DB2
From Everand
PLI Basic Training Using VSAM, IMS and DB2
Robert Wingate
1/5 (1)
Mvs Jcl in Plain English
From Everand
Mvs Jcl in Plain English
Donna Kelly
5/5 (1)
DB2 9.7 for Linux, UNIX, and Windows Database Administration: Certification Study Notes
From Everand
DB2 9.7 for Linux, UNIX, and Windows Database Administration: Certification Study Notes
Roger E. Sanders
5/5 (1)
A Guide to Db2 Performance for Application Developers: Code for Performance from the Beginning
From Everand
A Guide to Db2 Performance for Application Developers: Code for Performance from the Beginning
Craig S. Mullins
No ratings yet
COBOL for the Approved Workman
From Everand
COBOL for the Approved Workman
Wesley Sweetser, Jr
No ratings yet
Interview Questions for IBM Mainframe Developers
From Everand
Interview Questions for IBM Mainframe Developers
Robert Wingate
1/5 (1)
IMS-DB Basic Training For Application Developers
From Everand
IMS-DB Basic Training For Application Developers
Robert Wingate
No ratings yet
DB2 10 for z/OS: The Smarter, Faster Way to Upgrade
From Everand
DB2 10 for z/OS: The Smarter, Faster Way to Upgrade
John Campbell
No ratings yet
COBOL Programming Interview Questions: COBOL Job Interview Preparation
From Everand
COBOL Programming Interview Questions: COBOL Job Interview Preparation
equitypress
4.5/5 (2)
MVS JCL Utilities Quick Reference, Third Edition
From Everand
MVS JCL Utilities Quick Reference, Third Edition
Robert Wingate
5/5 (1)
Mainframe Interview Cases
From Everand
Mainframe Interview Cases
Krishna Rath
No ratings yet
DB2 Interview Questions, Answers, and Explanations: DB2 Database Certification Review
From Everand
DB2 Interview Questions, Answers, and Explanations: DB2 Database Certification Review
equitypress
No ratings yet

1.1 Why it is necessary

1.2 How it works

1.3 Database Backout

1.5 Standard for Checkpointing and Restart

2.0 Checkpoint/Restart and Program Structure

2.2 How to identify the best L.U.W

2.3 When to issue checkpoints

2.5 How to use the checkpoint area itself

2.6 Initialising items which are stored in the checkpoint area

2.7 Database Repositioning after XRST

3.0 Testing a Checkpoint/Restart Program

3.1 Testing Procedure

3.2 ARC Pacing Class

Application Enhancement Series V2.1.01

Select an option. Then press Enter.

Application Enhancement Series (AES)

AES Common Options

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

APPLICATION RESTART CONTROL

Type or verify the ID of the BMC Consolidated Subsystem (BCSS) to use.

__ 1. Active jobsteps 7. Reattach options

Copyright (C) 1994-1998 BMC Software, Inc. as an unpublished licensed work.

Select Active Record Type

Select an option. Then press Enter.

_ 1. Restart control records

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

Limit List of Records

BCSID : BCSS Record type : Restart control

Type information. Then press enter.

PF 1=HELP 2=SPLIT 3=END 4=RETURN 5=RFIND 6=RCHANGE

BCSID : BCSS Record type : Restart control

Note: Intervention required for highlighted jobs. Commands: REFresh

_ Y0#0OFB LOADTOFF DV1BATCH PBTO00AU FRI 10/16/1998 09:53 ABENDED

4.0 GSAM Files

4.1 Accessing GSAM files

4.2 GSAM variable length record files.

5.3 GSAM to GSAM

5.4 VSAM files

6.1.2 How to use the checkpoint module.

6.1.3 The XRST call.

6.1.5 GSAM files.

6.2 Other Products

7.1 BMP Restart

A.1 Simple Key

(ws-a = key value saved in checkpoint area)

A.2 Compound Key - Wrong!!

This is the correct restart logic to use.

A.4 Compound Key - Best

In most instances this is the best logic to use:
