CICS Transaction Server for z/OS
CICS Problem Determination Guide
SC34-6826-03
Note!
Before using this information and the product it supports, be sure to read the general information under “Notices” on page
365.
This edition applies to Version 3 Release 2 of CICS Transaction Server for z/OS, program number 5655-M15, and
to all subsequent versions, releases, and modifications until otherwise indicated in new editions.
© Copyright IBM Corporation 1997, 2011.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
What this book is about . . . . . . . . . . . . . . . . . . . . . . xi
Who this book is for . . . . . . . . . . . . . . . . . . . . . . . xi
What you need to know to understand this book . . . . . . . . . . . . . xi
How to use this book . . . . . . . . . . . . . . . . . . . . . . . xi
Notes about terms used in this book . . . . . . . . . . . . . . . . . xii
Resource type KC_ENQ . . . . . . . . . . . . . . . . . . . . 130
VTAM terminal control waits . . . . . . . . . . . . . . . . . . . 131
Interregion and intersystem communication waits . . . . . . . . . . . . 133
IIOP waits . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Transient data waits . . . . . . . . . . . . . . . . . . . . . . 133
Resource type TD_INIT—waits during initialization processing . . . . . . 133
Resource type TDEPLOCK–waits for transient data extrapartition requests 134
Resource types TDIPLOCK, ENQUEUE, TD_READ, Any_MBCB,
Any_MRCB, MBCB_xxx, and MRCB_xxx . . . . . . . . . . . . . 134
XRF alternate system waits . . . . . . . . . . . . . . . . . . . . 137
CICS system task waits . . . . . . . . . . . . . . . . . . . . . 139
FEPI waits . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Recovery manager waits . . . . . . . . . . . . . . . . . . . . . 140
CICS Web waits . . . . . . . . . . . . . . . . . . . . . . . . 140
Chapter 12. Dealing with external CICS interface (EXCI) problems . . . . 201
Getting dumps of the MVS logger and coupling facility address spaces 219
Finding the control blocks from the keywords . . . . . . . . . . . . . 315
Finding the keywords from the control blocks . . . . . . . . . . . . . 324
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . 339
The CICS Transaction Server for z/OS library . . . . . . . . . . . . . 339
The entitlement set . . . . . . . . . . . . . . . . . . . . . . 339
PDF-only books . . . . . . . . . . . . . . . . . . . . . . . 339
Other CICS books . . . . . . . . . . . . . . . . . . . . . . . 341
Determining if a publication is current . . . . . . . . . . . . . . . . 341
Accessibility . . . . . . . . . . . . . . . . . . . . . . . . . 343
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . 367
Note: For problem determination of the ONC/RPC feature, see the CICS External
Interfaces Guide.
Throughout this book, the term APPC is used to mean LUTYPE6.2. For example,
APPC session is used instead of LUTYPE6.2 session.
Structural changes
There are changes to the way that information is organized in each section.
Sometimes, you cannot solve the problem yourself if, for example, it is caused by
limitations in the hardware or software you are using. If the cause of the problem is
CICS code, you need to contact IBM, as described in Chapter 20, “IBM program
support,” on page 305.
As you go through the questions, make a note of anything that might be relevant to
the problem. Even if the observations you record do not at first suggest a cause,
they could be useful to you later if you need to carry out systematic problem
determination.
1. Has the CICS system run successfully before?
If the CICS system has not run successfully before, it is possible that you have
not yet set it up correctly. You can check that CICS installed correctly by
running batch or online verification procedures. See the CICS Transaction
Server for z/OS Installation Guide for more information. If you have verified
that CICS installed successfully, check the appropriate migration guide for any
possible impacts to your system.
If you are currently migrating to CICS Transaction Server for z/OS, Version 3
Release 2, ensure that you are aware of all the changes that have been made
for this release. For details of these, see the appropriate CICS migration guide.
2. Are there any messages explaining the failure?
If a transaction abends, and the task terminates abnormally, CICS sends a
message reporting the fact to the CSMT log (or your site replacement). If you
find a message there, it might immediately suggest a reason for the failure.
Were there any unusual messages associated with CICS start up, or while the
system was running before the error occurred? These might indicate some
system problem that prevented your transaction from running successfully.
If you see any messages that you do not understand, use the CICS messages
transaction, CMAC, for online message information. If you do not have access
to a CICS system to run the CMAC transaction, look in CICS Messages and
Codes for an explanation. A suggested course of action that you can take to
resolve the problem might also be included with the explanation.
3. Can you reproduce the error?
a. Can you identify any application that is always in the system when the
problem occurs?
v Check for application coding errors.
What to do next
Perhaps the preliminary checks have enabled you to find the cause of the problem.
If so, you should now be able to resolve it, possibly with the help of information in
the rest of the CICS information set.
If you have not yet found the cause, you must start to look at the problem in greater
detail. Begin by finding the best category for the problem, using the approach
described in Chapter 2, “Classifying the problem,” on page 7.
If you have the IBM INFORMATION/ACCESS licensed program, 5665-266, you can
look on the RETAIN database yourself. Each of the problems there has a
classification type.
Classify your problem using one of the following software categories from RETAIN.
Use the appropriate reference to get further information on how to diagnose each
category of problem.
v ABEND (for transaction abends, see Chapter 4, “Dealing with transaction
abend codes,” on page 25; for system abends, see Chapter 5, “Dealing with
CICS system abends,” on page 37)
v WAIT (see Chapter 6, “Dealing with waits,” on page 49)
v LOOP (see Chapter 8, “Dealing with loops,” on page 141)
v POOR PERFORMANCE, or PERFM (see Chapter 9, “Dealing with performance
problems,” on page 159)
v INCORRECT OUTPUT, or INCORROUT (see Chapter 10, “Dealing with incorrect
output,” on page 167)
v MESSAGE
All but the last of these, MESSAGE, are considered in the information on problem
determination. If you receive a CICS error message, you can use the CICS
message transaction, CMAC, for online message information. If you do not have
access to a running CICS system, look in CICS Messages and Codes for an
explanation. If you get a message from another IBM program, or from the operating
system, you need to look in the messages and codes book from the appropriate
library for an explanation of what that message means.
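For example, from a CICS terminal you can display message or abend code
information by entering the identifier with the transaction. The forms shown here
are a sketch; see CICS Supplied Transactions for the exact syntax that CMAC
accepts:
   CMAC DFHSM0102
   CMAC ASRA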
CICS Messages and Codes might give you enough information to solve the
problem quickly, or it might redirect you to other information sources for further
guidance. If you are unable to deal with the message, you may eventually need to
contact the IBM Support Center for help.
One type of problem that might give rise to a number of symptoms, usually
ill-defined, is poor application design; an example of this is given later in this
chapter.
Look for the section heading that most nearly describes the symptoms you have,
and then follow the advice given there.
Consider, too, the possibility that CICS might still be running, but only slowly. Be
certain that there is no activity at all before carrying out the checks in this section. If
CICS is running slowly, you probably have a performance problem. If so, read
“CICS is running slowly” on page 9 to confirm this before going on to Chapter 9,
“Dealing with performance problems,” on page 159 for advice about what to do
next.
If CICS has stopped running, look for any message that might explain the situation.
The message might appear in either of the following places:
v The MVS™ console. Look for any message saying that the CICS job has
abnormally terminated. If you find one, it means that a CICS system abend has
occurred and that CICS is no longer running. In such a case, you need to
examine the CSMT log (see below) to see which abend message has been
written there.
If you do not find any explanatory message on the MVS console, check in the
CSMT log to see if anything has been written there.
v The CSMT log. CSMT is the transient data destination to which abend
messages are written. If you find a message there, use the CMAC transaction or
look in CICS Messages and Codes to make sure there has been a CICS system
abend.
If you see only a transaction abend message in the CSMT log, that will not
account for CICS itself not running, and you should not classify the problem as
an abend. A faulty transaction could hold CICS up, perhaps indefinitely, but CICS
would resume work again if the transaction abended.
Here are two examples of messages that might accompany CICS system abends,
and which you would find on the CSMT log:
If you can find no message saying that CICS has terminated, it is likely that the
CICS system is in a wait state, or that some program is in a tight loop and not
returning control to CICS. These two possibilities are dealt with in Chapter 6,
“Dealing with waits,” on page 49 and Chapter 8, “Dealing with loops,” on page 141,
respectively.
You will probably notice that the problem is worst at peak system load times,
typically at mid-morning and mid-afternoon. If your network extends across more
than one time zone, peak system load might seem to you to occur at some other
time.
If you find that performance degradation is not dependent on system loading, but
happens sometimes when the system is lightly loaded, a poorly designed
transaction could be the cause. You might classify the problem initially as “poor
performance”, but be prepared to reconsider your classification later.
The following are some individual symptoms that could contribute to your perception
that CICS is running slowly:
v Tasks take a long time to start running.
v Some low priority tasks will not run at all.
v Tasks start running, but take a long time to complete.
v Some tasks start running, but do not complete.
v No output is obtained.
v Terminal activity is reduced, or has ceased.
Some of these symptoms do not, in isolation, necessarily mean that you have a
performance problem. They could indicate that some task is in a loop, or is waiting
on a resource that is not available. Only you can judge whether what you see
should be classified as “poor performance”, in the light of all the evidence you have.
You might be able to gather more detailed evidence by using the tools and
techniques that CICS provides for collecting performance data. The following is a
summary of what is available:
v CICS statistics. You can use these to gather information about the CICS system
as a whole, without regard to tasks.
v CICS monitoring. You can use this facility to collect information about CICS
tasks.
v CICS tracing. This is not a specific tool for collecting performance data, but you
can use it to gather detailed information about performance problems.
For guidance about using these tools and techniques, and advice about
performance and system tuning in general, see What to investigate when analyzing
performance in the CICS Performance Guide.
Note: Do not overlook the possibility that the task might simply be doing
unnecessary work that does not change the final result—for example,
starting a skip sequential browse with large gaps between the keys, or failing
to finish one because it is holding on to resources.
First, make sure that the task is still in the system. Use CEMT INQ TASK to check its
status, and make sure that it has not simply ended without writing back to the
terminal.
If the terminal has a display unit, check to see whether a special symbol has been
displayed in the operator information area that could explain the fault. If the
operator information area is clear, next check to see that no message has been
sent to any of the transient data destinations used for error messages, for example:
v CDBC, the destination for DBCTL related messages
v CSMT, the destination for terminal error and abend messages
v CSTL, the destination for terminal I/O error messages
v CSNE, the destination for error messages written by DFHZNAC and DFHZNEP
For details of the destinations used by CICS, see the CICS System Definition
Guide. If you can find no explanation for the problem, the fault is probably
associated with the task running at the terminal. These are the possibilities:
v The task is in a wait state.
v The task is in a loop.
v There is a performance problem.
Use the CMAC transaction or look in CICS Messages and Codes for an explanation
of the message, and, perhaps, advice about what you should do to solve the
problem. If the code is not there, or the explanation or advice given is not sufficient
for you to solve the problem, turn to Chapter 4, “Dealing with transaction abend
codes,” on page 25.
Also, CICS responds to many errors that it detects by sending messages. You
might regard the messages as “incorrect output”, but they are only symptoms of
another type of problem.
If you have received an unexpected message, and its meaning is not at first clear,
use the CMAC transaction or look in CICS Messages and Codes for an
explanation. It might suggest a simple response that you can make to the message,
or it might direct you to other sources of information for further guidance.
These are the types of incorrect output that are dealt with in this information:
v Incorrect trace or dump data:
– Wrong destination
– Wrong type of data captured
– Correct type of data captured, but the data values were unexpected
v Wrong data displayed on the terminal.
You can find advice about investigating the cause of any of these types of incorrect
output in Chapter 10, “Dealing with incorrect output,” on page 167.
If you see this message, or you know (through other means) that a storage violation
has occurred, turn to Chapter 11, “Dealing with storage violations,” on page 191 for
advice about dealing with the problem.
In many cases, storage violations go undetected by CICS, and you only find out
that they have occurred when something else goes wrong as a result of the overlay.
You can avoid many storage violations by enabling transaction isolation, storage
protection, and command protection.
Any of the following symptoms could be caused by a wait, a loop, or a badly
tuned or overloaded system:
v One or more user tasks in your CICS system fail to start.
v One or more tasks stay suspended.
v One or more tasks fail to complete.
v No output is obtained.
v Terminal activity is reduced, or has ceased.
v The performance of your system is poor.
This section gives you guidance about choosing the best classification. However,
note that in some cases your initial classification could be wrong, and you will then
need to reappraise the problem.
Waits
For the purpose of problem determination, a wait state is regarded as a state in
which the execution of a task has been suspended. That is, the task has started to
run, but it has been suspended without completing and has subsequently failed to
resume.
The task might typically be waiting for a resource that is unavailable, or it might be
waiting for an ECB to be posted. A wait might affect just a single task, or a group of
tasks that may be related in some way. If none of the tasks in a CICS region is
running, CICS is in a wait state. The way to handle that situation is dealt with in
“What to do if CICS has stalled” on page 104.
If you are authorized to use the CEMT transaction, you can find out which user
tasks or CICS-supplied transactions are currently suspended in a running CICS
system using CEMT INQ TASK. Use the transaction several times, perhaps
repeating the sequence after a few minutes, to see if any task stays suspended. If
you do find such a task, look at the resource type that it is waiting on (the value
shown for the HTYPE option). Is it unreasonable that there should be an extended
wait on the resource? Does the resource type suggest possible causes of the
problem?
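For example, CEMT INQ TASK might display an entry for a suspended task in a
form similar to the following sketch. The exact layout depends on your release,
and the task number, transaction, terminal, and unit of work values here are
invented for illustration:
   Tas(0000052) Tra(ORD1) Fac(T234) Sus Ter Pri( 001 )
      Sta(TO) Use(CICSUSER) Uow(BA8E98BF8A4F4F05)
      Hty(ENQUEUE ) Hva(TDNQ    )
In this sketch, the Hty (HTYPE) and Hva (HVALUE) fields show that the task is
suspended on an enqueue in the TDNQ pool.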
Use INQUIRE TASK LIST to find the task numbers of all SUSPENDED, READY,
and RUNNING user tasks. If you use this command repeatedly, you can see which
tasks stay suspended. You may also be able to find some relationship between
several suspended tasks, perhaps indicating the cause of the wait.
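You can issue the same inquiry from a program. The following minimal COBOL
sketch (data names are illustrative, and the program must be translated with the
SP option because INQUIRE TASK LIST is a system programming command)
retrieves the numbers of all currently suspended user tasks:
   WORKING-STORAGE SECTION.
   77  TASK-COUNT  PIC S9(8) COMP.
   77  TASK-PTR    USAGE POINTER.
   ...
       EXEC CICS INQUIRE TASK LIST
            SUSPENDED
            LISTSIZE(TASK-COUNT)
            SET(TASK-PTR)
       END-EXEC
After the command, TASK-PTR addresses a list of TASK-COUNT fullword task
numbers, which you can record and compare with the results of a later inquiry to
see which tasks stay suspended.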
If it seems fairly certain that your problem is correctly classified as a wait, and the
cause is not yet apparent, turn to Chapter 6, “Dealing with waits,” on page 49 for
guidance about solving the problem.
However, you should allow for the possibility that a task may stay suspended
because of an underlying performance problem, or because some other task may
be looping.
If you can find no evidence that a task is waiting for a specific resource, you should
not regard this as a wait problem. Consider instead whether it is a loop or a
performance problem.
Loops
A loop is the repeated execution of some code. If you have not planned the loop, or
if you have designed it into your application but for some reason it fails to terminate,
you get a set of symptoms that vary depending on what the code is doing. In some
cases, a loop may at first be diagnosed as a wait or a performance problem,
because the looping task competes for system resources with other tasks that are
not involved in the loop.
Some loops can be made to give some sort of repetitive output. Waits and
performance problems never give repetitive output. If the loop produces no output,
a repeating pattern can sometimes be obtained by using trace. A procedure for
doing this is described in Chapter 8, “Dealing with loops,” on page 141.
If you are able to use the CEMT transaction, try issuing CEMT INQ TASK
repeatedly. If the same transaction is shown to be running each time, this is a
further indication that the task is looping. However, note that the CEMT transaction
is always running when you use it to inquire on tasks.
If different transactions are seen to be running, this could still indicate a loop, but
one that involves more than just a single transaction.
If you are unable to use the CEMT transaction, it may be because a task is looping
and not allowing CICS to regain control. A procedure for investigating this type of
situation is described in “What to do if CICS has stalled” on page 104.
Consider the evidence you have so far. Does it indicate a loop? If so, turn to
Chapter 8, “Dealing with loops,” on page 141, where there are procedures for
defining the limits of the loop.
Poor performance
A performance problem is considered to be one in which system performance is
perceptibly degraded, either because tasks fail to start running at all, or because
they take a long time to complete once they have started.
In extreme cases, some low priority tasks may be attached but then fail to be
dispatched, or some tasks may be suspended and fail to resume. The problem
might then initially be regarded as a wait.
If you get many messages telling you that CICS is under stress, this can indicate
that either the system is operating near its maximum capacity, or a task in error has
used up a large amount of storage—possibly because it is looping.
You see one of the following messages when CICS is under stress in one of the
DSAs:
An example of poor application design is given here, to show how this can give rise
to symptoms which were at first thought to indicate a loop.
Environment:
CICS and DL/I using secondary indexes. The programmer had made
changes to the application to provide better function.
Symptoms:
The transaction ran and completed successfully, but response was erratic
and seemed to deteriorate as the month passed. Towards the end of the
month, the transaction was suspected of looping and was canceled. No
other evidence of looping could be found, except that statistics showed a
high number of I/Os.
Explanation:
The programmer had modified the program to allow the user to compare on
the last name of a record instead of the personnel number, which it had
done in the past. The database was the type that grew through the month
as activity was processed against it.
It was discovered that in making the change, the program was no longer
comparing on a field that was part of the key for the secondary index. This
meant that instead of searching the index for the key and then going
directly for the record, every record in the file had to be read and the field
compared. The structure of the source program had not changed
significantly; the number of database calls from the program was the same,
but the number of I/Os grew from a few to many thousands at the end of
the month.
Note that these symptoms might equally well have pointed to a performance
problem, although performance problems are usually due to poorly tuned or
overloaded systems, and affect more than just one transaction. Performance
problems tend to have system wide effects.
Whereas XRF, EXCI, and MRO errors can easily be classified in a straightforward
way, confirming that you have a storage violation can be difficult. Unless you get a
CICS message stating explicitly that you have a storage violation, you could get
almost any symptom, depending on what has been overlaid. You might, therefore,
classify it initially as one of the RETAIN symptom types described in “Using
symptom keywords as a basis for classifying problems” on page 7.
What to do next
If you have already decided that you should refer the problem to the IBM Support
Center, you can find advice about dealing with the Center in Chapter 20, “IBM
program support,” on page 305.
How much of this kind of information you need depends on how familiar you are
with the system or application, and could include:
v Program descriptions or functional specifications
v Record layouts and file descriptions
v Flowcharts or other descriptions of the flow of activity in a system
v Statement of inputs and outputs
v Change history of a program
v Change history of your installation
v Auxiliary trace profile for your transaction
v Statistical and monitoring profile showing average inputs, outputs, and response
times.
Product information
Product information can refer to the CICS Information Center, or libraries for any
other products you use with your application.
Make sure that the level of any documentation you refer to matches the level of the
system you are using. Problems often arise through using either obsolete
information or information about a level of the product that is not yet installed.
For a list of the destinations used by CICS, see the CICS System Definition Guide.
Use a copy of the appropriate messages and codes documentation to look up any
messages whose meaning you do not know. All CICS messages and codes are
documented in CICS Messages and Codes. Make sure that you also have some
documentation of application messages and codes for programs that were written at
your installation.
Symptom strings
CICS produces symptom strings in CICS system and transaction dumps and in
message DFHME0116.
The symptom string provides a number of keywords that can be directly typed in
and used to search the RETAIN database. If your installation has access to the IBM
INFORMATION/ACCESS licensed program, 5665-266, you can search the RETAIN
database yourself. If you report a problem to the IBM Support Center, you are often
asked to quote the symptom string.
Although the symptom string is designed to provide keywords for searching the
RETAIN database, it can also give you significant information about what was
happening at the time the error occurred, and it might suggest an obvious cause or
a likely area in which to start your investigation.
Change log
The information in the change log can tell you of changes made in the data
processing environment that may have caused problems with your application
program. To make your change log most useful, include the data concerning
hardware changes, system software (such as MVS and CICS) changes, application
changes, and any modifications made to operating procedures.
Dumps
Dumps are an important source of detailed information about problems. Whether
they are the result of an abend or a user request, they allow you to see a snapshot
of what was happening in CICS at the moment the dump was taken.
Statistics
Statistics are often overlooked as a source of debugging information, but those that
relate to an application program can help solve problems.
Statistics are most often used in system tuning and diagnosis, but they also contain
information that can indicate problems with the way your application handles
resources. For example, you may notice from these statistics that tables are being
loaded, or programs linked, for which there is no known requirement.
You can also use statistics to check terminals, files, queues, and so on for
irregularities in their activity. For example, if a terminal has a number of errors
recorded for a particular transaction that equal the number of times that transaction
was run, this may indicate that an incorrect data stream is being sent to that
terminal. See CICS statistics in the CICS Performance Guide for more information
about using statistics.
Monitoring
You can use CICS monitoring to provide information for debugging applications. In
addition to the system-defined event monitoring points (EMPs) that already exist
within CICS code itself, you can define user event monitoring points in your own
application programs by using the EXEC CICS MONITOR POINT command.
At a user EMP, you can add your own data (up to 256 counters, up to 256 clocks,
and a single character string of up to 8192 bytes) to fields reserved for you in
performance class monitoring data records. You could use these extra EMPs to
count how many times a certain event happens, or to time the interval between two
events. Your definitions in the Monitoring Control Table (MCT) specify the type and
number of fields that are available for your use within each task’s performance
record. For further information on the MCT see the CICS Resource Definition
Guide. See the CICS Application Programming Reference for programming
information on syntax and options of the MONITOR POINT command.
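For example, the following minimal COBOL sketch invokes a user EMP. It
assumes that an EMP with identification number 10 and entry name ORDRAPP
has been defined in your MCT; all names and values are illustrative:
   77  WS-DATA1  PIC S9(8) COMP.
   77  WS-DATA2  PIC S9(8) COMP.
   ...
       EXEC CICS MONITOR POINT(10)
            ENTRYNAME('ORDRAPP')
            DATA1(WS-DATA1)
            DATA2(WS-DATA2)
       END-EXEC
Whether the data is used to update a user count, a clock, or a byte string is
determined by the EMP definition in the MCT, not by the program.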
When your monitoring data has been collected, you can read it into a database
using, for example, the Service Level Reporter Version 2 (SLR II).
See the CICS Performance Guide for guidance about choosing performance tools.
See CICS Supplied Transactions for information about the transactions needed to
invoke them.
Terminal data
Terminal data is very important in solving problems, because it can help you
determine what data was entered just before the transaction failed, and if there is
any output.
The more you know about the information that was input at the terminal on which
the transaction failed, the better your chance of duplicating the problem in a test
environment. However, this information may not be precise, especially if there are
many fields on the input screen. It is good practice to provide a quick and easy
way for terminal operators to report problems, so that they can report the error
while they can still see the data on the screen (or at least remember more clearly
what it was).
The output from a transaction is sometimes easier to capture. If you have a locally
attached printer, you can make a copy. (The problem may be that the printer output
is incorrect.)
Even if the program does not use queues, look at the system queues for CSMT (or
your site replacement) and CSTL (and CDBC if you use DBCTL) to see if there are
any relevant messages.
The things you might want to look for in the queues are:
1. Are the required entries there?
2. Are the entries in the correct order?
Passed information
Be particularly careful when you are using the common work area (CWA) because
you only have one area for the entire system. A transaction may depend on a
certain sequence of transactions and some other program may change that
sequence.
If you are using the CWA, you must also know if your CICS is split into multiple
MRO regions because there is an independent CWA for each MRO region.
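For example, a program gains access to the CWA of the region it is running in by
using the ADDRESS CWA command. Here is a minimal COBOL sketch, assuming
an illustrative 512-byte CWA layout (the actual size and layout are defined by
your installation):
   WORKING-STORAGE SECTION.
   77  CWA-PTR     USAGE POINTER.
   LINKAGE SECTION.
   01  CWA-LAYOUT  PIC X(512).
   ...
       EXEC CICS ADDRESS CWA(CWA-PTR) END-EXEC
       SET ADDRESS OF CWA-LAYOUT TO CWA-PTR
In an MRO environment, this command can only ever return the CWA of the
region in which the program is running.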
Terminal user areas can have problems because the area is associated with a
terminal and not a particular transaction.
If you are using tables in the CWA, remember that there is no recovery; if a
transaction updates the table and then abends, the transaction is backed out but
the change is not.
To do this, you need to use the appropriate utilities and diagnostic tools for the data
access methods that you have at your installation.
Check the various indexes in files and databases. If you have more than one
method of accessing information, one path may be working well but another path
may be causing problems.
When looking through the data in files, pay particular attention to the record layout.
The program may be using an out-of-date record description.
Traces
CICS provides a tracing facility that enables you to trace transactions through the
CICS components as well as through your own programs. CICS auxiliary trace
enables you to write trace records on a sequential device for later analysis.
For information about the tracing facilities provided by CICS, read Chapter 15,
“Using traces in problem determination,” on page 223.
The transaction abend can originate from several places, and the method you use
for problem determination depends on the source of the abend. The procedures are
described in the sections that follow. As you go through them, you might like to use
the worksheet that is included at the end of this section to record your findings
(“Worksheet for transaction abends” on page 35).
For detailed information and a full list of the transaction abend codes used by CICS
and by other IBM products, see CICS Messages and Codes.
If you have received a user abend code, it can still be difficult to find out which
program is responsible for it unless you have adequate documentation. For this
reason, it is good practice for all programmers who issue abends from within their
programs to document the codes in a central location at your installation.
If your abend code is something other than these, use the procedures in “Last
statement identification” on page 294, to find the last command that was executed,
and then turn to “Analyzing the problem further” on page 35. The best source of
information on CICS abends can be found in CICS Messages and Codes. It
contains a section that lists all transaction abend codes issued by CICS. There is
an explanation of why the code was issued, followed by details of system and user
actions. The same information is available online, using the CICS-supplied
messages and codes transaction, CMAC.
If, after reviewing the material in CICS Messages and Codes you cannot find the
cause of the problem, continue with the procedures outlined in Chapter 4, “Dealing
with transaction abend codes,” on page 25.
AICA abends
If your transaction terminated with abend code AICA, the transaction is likely to
have been in a loop. You can find detailed guidance about dealing with loops in
Chapter 8, “Dealing with loops,” on page 141.
ASRA abends
CICS issues an ASRA abend code when it detects that a program check has
occurred within a transaction. Program checks can occur for a wide variety of
reasons, but you can find the nature of the error from the program interrupt code in
the program status word (PSW). The PSW is used by the machine hardware to
record the address of the current instruction being executed, the addressing mode,
and other control information. The PSW gives you the address at which the
program check occurred, and so it represents a record of the circumstances of the
failure.
ASRB abends
A transaction can abend with an abend code of ASRB when a program issues the
MVS ABEND macro. For example, BDAM issues this ABEND macro when it detects
errors, rather than sending a return code to the calling program. CICS is notified
when an MVS abend occurs, and in turn issues an ASRB abend code for the
transaction.
Use the procedures outlined in “Locating the last command or statement” on page
293 to find the origin of the abend in your program. That information, together with
the description and procedures for ASRB abends given in CICS Messages and
Codes, should be sufficient for you to solve the problem.
ASRD abends
AEYD abends
At the time of the abend, register 2 points to the parameter area containing the
invalid address. The trace should include an exception trace entry created by
DFHEISR. This entry should identify the parameter in error. If the abend is handled,
EXEC CICS ASSIGN ASRASTG, ASRAKEY, ASRASPC, and ASRAREGS can give
additional information.
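For example, an abend-handling program might retrieve this additional
information with the following minimal COBOL sketch. The data names are
illustrative; ASRAKEY, ASRASPC, and ASRASTG return CVDA values, and
ASRAREGS returns the contents of general registers 0 through 15 at the time of
the abend:
   77  WS-ASRAKEY   PIC S9(8) COMP.
   77  WS-ASRASPC   PIC S9(8) COMP.
   77  WS-ASRASTG   PIC S9(8) COMP.
   77  WS-ASRAREGS  PIC X(64).
   ...
       EXEC CICS ASSIGN
            ASRAKEY(WS-ASRAKEY)
            ASRASPC(WS-ASRASPC)
            ASRASTG(WS-ASRASTG)
            ASRAREGS(WS-ASRAREGS)
       END-EXEC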
A record of the program in error and the offset of the program check within the
program load module are contained in the following places:
You can find information about the PSW in ESA/370 from the IBM Enterprise
Systems Architecture/370 Principles of Operation.
PIC  Explanation
1 Operation exception—incorrect operation attempted.
Some possible causes
v Overlaid program
v Overlaid register save area, causing incorrect branch
v Resource unavailable, but program logic assumed valid address returned
and took inappropriate action
v Incorrect branch to data that contains no instruction known to the
machine
v In an assembler-language program, a base register was inadvertently
changed
2 Privileged operation—this program is not authorized to execute this
instruction.
Some possible causes
v Incorrect branch to this code; may be due to:
– Overlaid register save area
– Program overlaid by data that contains the privileged operation code
3 Execution exception—you are not allowed to EXECUTE an EXECUTE
instruction.
Some possible causes
v Incorrect branch to this code
v Incorrect register contents; may be due to:
– Overlaid register save area
– Program overlaid by data that contains the incorrect instruction
– Incorrect program logic
4 Protection exception—read or write access violation has occurred.
Some possible causes
v Resource unavailable, and return code not checked. Program logic
assumed valid address returned and took inappropriate action.
v Incorrect interface parameters to some other program or subsystem (for
example, VSAM or DL/I).
v Overlaid register save area, causing incorrect reference to data.
v In an assembler-language program, incorrect initialization or modification
of a register used to address data.
v Attempt to access internal control blocks illegally or use a CICS system
or application programming macro call.
With the storage protection facility, there are further situations in which a protection
exception (interrupt code 4) may occur:
v An attempt is made to write to the CDSA, ECDSA, or ERDSA, when storage
protection is active and the application is running in user key
v An attempt is made to write to the ERDSA or RDSA when PROTECT is specified
for the RENTPGM system initialization parameter.
If any of these events occurs, CICS abnormally terminates the transaction with
abend code ASRA and issues message DFHSR0622 which identifies the DSA over
which the program attempted to write. This information is in the TACB and is traced
by exception trace point ID AP 0781. It is also useful to know the execution key of
the program at the time of the protection exception and whether the program was
executing in a subspace.
It is still possible for CICS to abend when the problem is in the application. For
example, command protection only checks output parameters and does not prevent
the passing of fetch-protected storage as an input parameter to CICS. When CICS
attempts to read such storage, an ASRA abend occurs.
Transaction isolation
Transaction isolation protects the data associated with a user transaction from being
overwritten by EXECKEY(USER) programs invoked by other user transactions.
Command protection
Command protection prevents CICS from updating storage if the storage address is
passed as a command output parameter by a transaction that is not authorized to
update that storage.
The transaction terminates with abend code AEYD. The exception trace entry AP
0779 supplies details of the failing program and command. When migrating to a
system with command protection enabled, EXEC commands that pass unauthorized
storage are identified and can be corrected.
Note: If you are using CSP/AD, CSP/AE, or CSP/RS, you must ensure that the
definitions for programs DCBINIT, DCBMODS, DCBRINIT and DCBNCOP
specify EXECKEY(CICS). These are all examples of programs that modify
global work areas set up by global user exit programs.
v If you are using DB2® and you use the DB2 message formatting routine
DSNTIAR, which is link-edited with your application programs, you should apply
the PTF for DB2 APAR PN12516, and relink-edit the applications using DSNTIAR
so that they may run in user key. If the applications are not re-link-edited after
For example:
v Using static variables or constants for fields which are set by CICS requests. For
example, in assembler coding, if the LENGTH parameter for a retrieval operation
such as EXEC CICS READQ TS is specified as a DC elsewhere in the program,
a constant is set up in static storage. When CICS attempts to set the actual
length into the data area, it causes a protection exception if the program is in the
ERDSA or RDSA.
In some cases, for example EXEC CICS READ DATASET INTO () LENGTH() ...,
the LENGTH value specifies the maximum length that the application can accept,
and is set by CICS to contain the actual length read on completion of the
operation. Even if the program does not have RENT specified, using a variable in
the program itself for this length could cause problems if the program is being
executed concurrently for multiple users. The first transaction may execute
correctly, resulting in the actual record length being set in the LENGTH
parameter, which is then used as the maximum length for the second transaction.
v Defining a table with the RENT attribute and then attempting to initialize or
update the table during CICS execution. Such a table should not be defined as
RENT.
v Defining BMS mapsets as RENT can cause a protection exception, because in
some cases CICS needs to modify BMS mapsets during execution. BMS mapsets
should be loaded into CICS key storage (they should not be modified by
application programs), which means they must not be link-edited with the
RENT attribute. (Partition sets are not modified by CICS and can be link-edited
with the RENT attribute.)
You can determine whether the failure occurred in the CICS or the DBCTL part of
the transaction by examining the time stamps in the CICS and DBCTL traces. For
guidance about this, see the CICS IMS Database Control Guide.
If tracing was off at the time of the failure, you can find an indicator in the task local
work area for DFHDBAT. The indicator is set when CICS passes control to DBCTL,
and reset when DBCTL returns control to CICS.
To find the indicator, locate the eye-catcher for the TIE in the dump and then locate
the LOCLAREA eye-catcher that follows it. The indicator is at offset X’14’ from the
start of the LOCLAREA eye-catcher. If the indicator byte is set to X’08’, CICS has
passed control to DBCTL, and you should examine the IMS part of the transaction.
If the byte is set to X’00’, DBCTL has returned control to CICS, and you should
investigate the CICS part of the transaction.
FEPI abends
For information about FEPI-associated abends in CICS or MVS, see the CICS Front
End Programming Interface User's Guide.
If you have not yet done so, use the CMAC transaction or look in CICS Messages
and Codes for an explanation of any message you may have received, because it
could offer a straightforward solution to your problem.
If the abend was clearly caused by a storage violation, turn directly to Chapter 11,
“Dealing with storage violations,” on page 191. You know when CICS has detected
a storage violation, because it issues this message:
DFHSM0102 applid A storage violation (code X’code’) has been detected by module modname.
On reading this section, you may find that the abend was due to an application
error. In this case, you need to look at the application to find out why it caused the
abend. However, if you find that a CICS module seems to be in error, you need to
contact the IBM Support Center. Before doing so, you must gather this information:
v The name of the failing module, and the module level
v The offset within the module at which the failure occurred
v The instruction at that offset
v The abend type.
This section tells you how to find out all of these things, and contains the following
topics:
v “The documentation you need”
v “Interpreting the evidence” on page 38
v “Looking at the kernel domain storage areas” on page 39
v “Using the linkage stack to identify the failing module” on page 44
If system dumping is permitted for the dump code, and if system dumping has not
otherwise been disabled, a system dump will have been taken when the error was
detected. You can find out which dump relates to which message, because the time
stamps and the dump IDs are the same.
If a system dump was not taken when the abend occurred, you need to find out
why. Use the procedure described in “You do not get a dump when an abend
occurs” on page 171, and follow the advice given there. When you are sure that
dumping is enabled for the appropriate system dump code, you need to recreate
the system abend.
You can use the interactive problem control system (IPCS) to process dumps and
view them online. See “Formatting system dumps” on page 279 for guidance about
processing dumps using IPCS VERBEXIT parameters. The kernel domain storage
areas (formatting keyword KE) and the internal trace table (formatting keyword TR)
are likely to be the most useful at the start of your investigation.
Later, you might find that storage summaries for the application, transaction
manager, program manager, dispatcher, and loader domains (formatting keywords
AP, XM, PG, DS, and LD, respectively) are also useful. In each case, level-1
formatting is sufficient in the first instance.
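For example, assuming that the dump formatting verb exit for this release is
DFHPD650 (the three-character suffix matches the CICS release level), you might
start with:
   VERBEXIT DFHPD650 'KE=1,TR=1'
and later widen the formatting to the other domains:
   VERBEXIT DFHPD650 'KE=1,TR=1,AP=1,XM=1,PG=1,DS=1,LD=1'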
You can format and print the dump offline. Details of how to do this are given in the
CICS Operations and Utilities Guide.
You may need to copy the dump so that you can leave the system dump data set
free for use, or so that you have a more permanent copy for problem reporting.
Whether you look at the dump online or offline, do not purge it from the dump data
set until you have either copied it or finished with it—you might need to format other
areas later, or the same areas in more detail.
Although the symptom string is designed to provide keywords for searching the
RETAIN database, it can also give you significant information about what was
happening at the time the error occurred, and it might suggest an obvious cause
or a likely area in which to start your investigation. Amongst other things, it
might contain the abend code. If you have not already done so, look in CICS
Messages and Codes to see what action it suggests for this abend code.
If the system is unable to gather much information about the error, the symptom
string is less specific. In such cases, it might not help you much with problem
determination, and you need to look at other parts of the dump. The kernel
domain storage summary is a good place to start.
The task summary is in the form of a table, each line in the table representing a
different task. The left-hand column of the task summary shows the kernel task
number, which is the number used by the kernel domain to identify the task. This is
not the same as the normal CICS task number taken from field TCAKCTTA of the
TCA.
1. When you have located the task summary table in the formatted dump, look in
the ERROR column. If you find a value of *YES* for a particular task, that task
was in error at the time the dump was taken.
Note: If the recovery routine that is invoked when the error occurs does not
request a system dump, you will not see any tasks flagged in error. In
such a case, the system dump is likely to have been requested by a
program that is being executed lower down the linkage stack and that
received an abnormal response following recovery. The program that
received the error has gone from the stack, and so cannot be flagged.
However, error data for the failing task was captured in the kernel
domain error table (see “Finding more information about the error” on
page 41). Error data is also captured in the error table even when no
system dump is taken at all.
In Figure 1, you can see that kernel task number 0008 is shown to be in error.
2. Look next at the STATUS column. For each task you can see one of the
following values:
v ***Running***, meaning that the task was running when the system dump
was taken. Most of the time, only one task is shown to be running. If more
than one task is shown to be running, the different tasks are attached to
separate TCBs.
v Not Running, meaning that the task is in the system but is currently not
running. It might, for example, be suspended because it is waiting for some
resource, or it could be ready to run but waiting for a TCB to become
available.
v KTCB, referring to CICS control blocks corresponding to the CICS TCBs.
These are treated as tasks in the kernel task summary.
v Unused, meaning either that the task was in the system but it has now
terminated, or that there has not yet been a task in the system with the
corresponding task number. Earlier unused tasks are likely to have run and
terminated.
The PSW is the program status word that is used by the machine hardware to
record the address of the current instruction being executed, the addressing mode,
and other control information. An example of such a storage report is shown in
Figure 2 on page 43, in this case for a program check.
1. Look first in the dump for this header, which introduces the error report for the
task:
==KE: KE DOMAIN ERROR TABLE
2. Next, you will see the kernel error number for the task. Error numbers are
assigned consecutively by the kernel, starting from 00000001.
=KE: ERROR NUMBER: 00000001
The error number tells you the number of program checks and system abends
that have occurred for this run of CICS. Not all of them have necessarily
resulted in a system dump.
3. Optional: Some kernel error data follows. If you want to find the format of this
data (and, in most cases, you will not need to), see the DFHKERRD section of
the CICS Data Areas.
4. The next thing of interest is the kernel’s interpretation of what went wrong. This
includes the error code, the error type, the name of the program that was
running, and the offset within the program.
Figure 2. Storage report for a task that has experienced a program check
Note that only the values of the registers and PSW, not the storage they address,
are guaranteed to be as they were at the time of the error. The storage that is
shown is a snapshot taken at the time the internal system dump request was
issued. Data might have changed because, for example, a program check has been
caused by an incorrect address in a register, or short lifetime storage is addressed
by a register.
The registers might point to data in the CICS region. If the values they hold can
represent 24-bit addresses, you see the data around those addresses. Similarly, if
their values can represent 31-bit addresses, you get the data around those
addresses.
It could be that the contents of a register might represent both a 24-bit address and
a 31-bit address. In that case, you get both sets of addressed data. (Note that a
register might contain a 24-bit address with a higher order bit set, making it appear
like a 31-bit address; or it could contain a genuine 31-bit address.)
If, for any reason, the register does not address any data, you see either of these
messages:
24-bit data cannot be accessed
31-bit data cannot be accessed
This means that the addresses cannot be found in the system dump of the CICS
region. Note that MVS keeps a record of how CICS uses storage, and any areas
not used by CICS are considered to lie outside the CICS address space. Such
areas are not dumped in an MVS SDUMP of the region.
It is also possible that the addresses were within the CICS region, but they were
not included in the SDUMP. This is because MVS enables you to take SDUMPs
selectively, for example “without LPA”. If this were to happen without your
knowledge, you might think you had an addressing error when, in fact, the address
was a valid one.
The format of the PSW is described in the IBM Enterprise Systems Architecture/370
Principles of Operation. The information in the PSW can help you to find the details
needed by the IBM Support Center. You can find the address of the failing
instruction, and hence its offset within the module, and also the abend type. You
find the identity of the failing module itself by examining the kernel linkage stack, as
described in “Using the linkage stack to identify the failing module.”
Having found which task was in error from the kernel’s task summary (see “Finding
which tasks are associated with the error” on page 39), you need to find out which
module was in error. The module name is one of the things you need to give the
IBM Support Center when you report the problem to them.
1. Find the task number of the task in error from the KE_NUM column, and use
this as an index into the linkage stack entries. These are shown in the dump
after the task summary.
2. Look at the TYPE column when you have found the task number. The TYPE
column, as shown in the example, can contain any of the following entries:
Bot This marks the first entry in the stack.
KE_NUM @STACK LEN TYPE ADDRESS LINK REG OFFS ERROR NAME
0031 0520A020 0120 Bot 84C00408 84C006D8 02D0 DFHKETA
0031 0520A140 01F0 Dom 84C0F078 84C0F18E 0116 DFHDSKE
0031 0520A330 0370 Dom 84CAA5A8 84CAACC2 071A DFHXMTA
0031 0520A6A0 0330 Dom 84F25430 84F25CF6 08C6 DFHPGPG
Int +00CC 84F254B6 0086 INITIAL_LINK
0031 0520A9D0 03C0 Dom 84F6C230 84E5DC40 0000 DFHAPLI1
Int +0EEA 84F6C66E 043E CICS_INTERFACE
0031 0520AD90 0108 Sub 0230B400 8230B8CA 04CA DFHEIQSP
0031 0520AE98 0290 Sub 82136D90 82137178 03E8 *YES* DFHLDLD
0520B128 Int +08FC 82136F26 0196 LDLD_INQUIRE
0520B128 Int +128E 821376CE 093E CURRENT_GET_NO_WAIT
0031 0520B128 0F70 Dom 84C6F8E0 84C72EA6 35C6 DFHMEME
Int +2CB6 84C6FA4E 016E SEND
Int +1486 84C72684 2DA4 CONTINUE_SEND
Int +350E 84C70DE4 1504 TAKE_A_DUMP_FOR_CALLER
0031 0520C098 03D0 Dom 84C52458 84C52F52 0AFA DFHDUDU
Int +08F4 84C5254A 00F2 SYSTEM_DUMP
Int +1412 84C53212 0DBA TAKE_SYSTEM_DUMP
You can sometimes use the technique described in this section to gather the
information that the IBM Support Center needs to resolve a CICS system abend.
However, you should normally use the summary information presented in the
formatted output for the kernel domain storage areas. This method is only valid if
the abend has occurred in a module or subroutine that has a kernel linkage stack
entry. This is the case only where the module or subroutine has been invoked by
one of these mechanisms:
v A kernel domain call
v A kernel subroutine call
v A call to an internal procedure identified to the kernel
Routines that have been invoked by assembler language BALR instructions do not
have kernel linkage stack entries.
If you are not sure of the format of the PSW, or how to calculate the offset, see the
z/Architecture Principles of Operation manual.
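As a worked example, take the DFHPGPG entry in the linkage stack shown
earlier. The link register address is 84F25CF6 and the module is loaded at address
84F25430, so the offset is:
   84F25CF6 - 84F25430 = 8C6
which matches the value 08C6 shown in the OFFS column. You calculate the offset
of a failing instruction from the address in the PSW in the same way, by
subtracting the load address of the module.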
The Support Center also needs to know the instruction at the offset.
1. Locate the address of the failing instruction in the dump, and find out what
instruction is there. It is sufficient to give the hex code for the instruction, but
make sure you quote as many bytes as you found from the PSW instruction
length field.
2. Identify the abend type from the program interruption code, so that you can
report that too. It might, for example, be ‘protection exception’ (interruption code
0004), or ‘data exception’ (interruption code 0007).
Figure 4 shows some entries from a typical program storage map summary.
Note: Entries made in the R/A MODE OVERRIDE columns are the value of the
RMODE and AMODE supplied on the DEFINE_PROGRAM call for that
program. If a REQUIRED_RMODE or REQUIRED_AMODE is not specified,
If CICS has stalled, turn directly to “What to do if CICS has stalled” on page 104.
If you have one or more tasks in a wait state, you should have already carried out
preliminary checks to make sure that the problem is best classified as a wait, rather
than as a loop or as poor performance. If you have not, you can find guidance
about how to do this in Chapter 2, “Classifying the problem,” on page 7.
You are unlikely to have direct evidence that a CICS system task is in a wait state,
except from a detailed examination of trace. You are more likely to have noticed
that one of your user tasks, or possibly a CICS user task (that is, an instance of a
CICS-supplied transaction), is waiting. In such a case, it is possible that a waiting
CICS system task could be the cause of the user task having to wait.
For the purpose of this section a task is considered to be in a wait state if it has
been suspended after first starting to run. The task is not in a wait state if it has
been attached to the transaction manager but has not yet started to run, or if it has
been resumed after waiting but cannot, for some reason, start running. These are
best regarded as performance problems. Tasks that are ready to run but cannot be
dispatched might, for example, have too low a priority, or the CICS system might be
at the MXT limit, or the CICS system might be under stress (short on storage). If
you think you might have such a problem, read Chapter 9, “Dealing with
performance problems,” on page 159.
Most tasks are suspended at least once during their execution, for example while
they wait for file I/O to take place. This is part of the regular flow of control, and it
gives other tasks a chance to run in the meantime. It is only when they stay
suspended longer than they should that a problem arises.
There are two stages in resolving most wait problems involving user tasks. The first
stage involves finding out what resource the suspended task is waiting for, and the
second stage involves finding out why that resource is not available. This section
focuses principally on the first of these objectives. However, in some cases there
are suggestions of ways in which the constraints on resource availability can be
relieved.
Online inquiry is the least powerful technique, and it can only tell you what resource
a suspended user task is waiting for. This is enough information to locate the failing
area, but you often need to do more investigation before you can solve the
problem. The advantage of online inquiry is that you can find out about the waiting
task as soon as you detect the problem, and so you capture the data early.
Tracing can give you much more detail than online inquiry, but it involves significant
processing overhead. It must also be running with the appropriate options selected
when the task first enters a wait state, so this usually means you need to reproduce
the problem. However, the information it gives you about system activity in the
period leading up to the wait is likely to provide much of the information you need to
solve the problem.
A CICS system dump can give you a picture of the state of the CICS system at an
instant during the wait. You can request the dump as soon as you notice that a task
has entered a wait state, so it gives you early data capture for the problem.
However, the dump is unlikely to tell you anything about system activity in the
period leading up to the wait, even if you had internal tracing running with the
correct selectivity when the task entered the wait. This is because the trace table
has probably wrapped before you have had a chance to respond. However, the
formatted dump might contain much of the information you need to solve the
problem.
If you are able to reproduce the problem, consider using auxiliary tracing and
dumping in combination.
If the task is suspended, the information that is returned to you includes the
resource type or the resource name identifying the unavailable resource. CEMT INQ
TASK displays:
v the resource type of the unavailable resource in the HTYPE field.
v the resource name of the unavailable resource in the HVALUE field.
EXEC CICS INQUIRE TASK returns values in the SUSPENDTYPE and
SUSPENDVALUE fields which correspond to the resource type and resource name
of the unavailable resource.
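For example, a program (or an interactive CECI session) could retrieve the same
information with a sequence like the following sketch, in which the task number and
the receiving fields are illustrative. SUSPENDTYPE and SUSPENDVALUE are
returned as blanks if the task is not suspended.
EXEC CICS INQUIRE TASK(task-number)
     SUSPENDTYPE(stype)
     SUSPENDVALUE(svalue)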
Table 9 on page 110 gives a list of all the resource types and resource names that
user tasks might be suspended on, and references showing where to look next for
guidance about solving the wait.
You probably need a system dump of the appropriate CICS region to investigate the
wait. If you do not yet have one, you can get one using CEMT PERFORM SNAP or CEMT
PERFORM DUMP, but make sure the task is still in a wait state when you take the
dump. You subsequently need to format the dump using keywords for the given
resource type. Advice on which keywords to use is given, where appropriate, in the
individual sections.
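For example, if you format dumps under IPCS, a command like the following
formats just the kernel and dispatcher summaries. The verb exit name DFHPD650
assumes the CICS TS 3.2 level of the dump formatting program; substitute the
name that matches your release.
VERBEXIT DFHPD650 'KE=1,DS=1'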
When you look at the trace table, you can find trace entries relating to a particular
task from the task numbers that the entries contain. Each task number is unique, so
you can be sure that, for any run of CICS, trace entries having the same task number
belong to the same task.
For general guidance about setting tracing options and interpreting trace entries,
see Chapter 15, “Using traces in problem determination,” on page 223.
The values of the parameters can provide valuable information about task waits, so
pay particular attention to them when you study the trace table.
You need to use the dump formatting keyword DS to format the dispatcher task
summary. You probably need to look at other areas of the dump as well, so keep
the dump on the dump data set.
The dispatcher task summary gives you information like that shown in Figure 5.
Some of the fields relate to all tasks known to the dispatcher, and some (identified
in the table) relate only to suspended tasks. Values are not provided in fields of the
latter type for tasks that are not suspended.
For details of how you can use trace to investigate waits, see “Investigating waits
using trace” on page 51.
Table 4 shows the parameters that set task summary fields, the functions that use
those parameters, and the domain gates that provide the functions. Task summary
fields that are not set by parameters are also identified (by "none" in the "Related
parameter" column).
Table 4. Parameters and functions that set fields shown in the dispatcher task summary
Field     Related parameter  Function             Input or output  Gate
AD        DOMAIN_INDEX       INQUIRE_TASK         IN               DSBR
                             GET_NEXT             OUT
DTA       ATTACH_TOKEN       CREATE_TASK          IN               KEDS
DS_TOKEN  TASK_TOKEN         ATTACH               OUT              DSAT
                             CANCEL_TASK          IN
                             PURGE_INHIBIT_QUERY  IN
                             SET_PRIORITY         IN
                             TASK_REPLY           IN
                             GET_NEXT                              DSBR
                             INQUIRE_TASK
ST        none
If you have one or more unresponsive terminals, that is, terminals that are showing
no new output and accepting no input, this does not necessarily indicate a terminal
wait.
1. If you have one or more unresponsive terminals:
a. Use CEMT INQ TERMINAL to find the transaction running at the terminal.
b. Use CEMT INQ TASK to find out what resource that task is waiting on.
c. When you know that, look at Table 9 on page 110 to find where you can get
further guidance.
2. If all the terminals in the network are affected, and CICS has stalled, read “What
to do if CICS has stalled” on page 104 for advice about how to investigate the
problem.
If you have a genuine terminal wait, remember when you carry out your
investigation that terminals in the CICS environment can have a wide range of
characteristics. A terminal is, in fact, anything that can be at the end of a
communications line. It could, for example, be a physical device such as a 3270
terminal or a printer, or a batch region, or it could be another CICS region
connected by an interregion communication link, or it could be a system that is
connected by an LUTYPE6.1 or APPC (LUTYPE6.2) protocol. If LUTYPE6.1 is in
use, the other system might be another CICS region or an IMS region. With APPC
(LUTYPE6.2), the range of possibilities is much wider. It could include any of the
systems and devices that support this communications protocol. For example, apart
from another CICS region, there might be a PC or a DISOSS system at the other
end of the link.
If you eventually find that the fault lies with a terminal, or a resource such as
DISOSS, the way in which you approach the problem depends on what type it is. In
some cases, you probably need to look in appropriate books from other libraries for
guidance about problem determination.
Your strategy must then be to find where in the communication process the fault
lies. These are the basic questions that must be answered:
1. Is the problem associated with the access method?
2. If the access method has returned, or has not been involved, is terminal control
at fault?
3. If terminal control is working correctly, why is the terminal not responding?
To answer most of these questions, you will need to do some offline dump analysis.
Use CEMT PERFORM SNAP to get the dump, and then format it using the formatting
keyword TCP. Do not cancel your task before taking the dump. If you do, the
values in the terminal control data areas will not relate to the error.
Online method
Use the transaction CECI to execute the system programming command EXEC CICS
INQUIRE TERMINAL DEVICE. This returns one of the terminal types identified in the
CICS Resource Definition Guide.
Offline method
Look at the formatted dump output you have obtained for keyword TCP. First,
identify the TCTTE relating to the terminal, from the four-character terminal ID
shown in field TCTTETI. Now look in field TCTTETT, and read the 1-byte character
that identifies the type of terminal. You can find what terminal type is represented by
the value in the field from the description given in the CICS Data Areas.
Online method
Use the CECI transaction to execute the system programming command EXEC
CICS INQUIRE TERMINAL ACCESSMETHOD. This returns the access method in
use by the terminal.
Offline method
You can find the access method for the terminal from the TCTTE. Look in field
TCTEAMIB, which is the byte name definition for the access method. The CICS
Data Areas relates values to access methods.
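As a sketch, the two online inquiries described above can be combined in one
command, either under CECI or in a system programming application. The
receiving fields are illustrative fullword binary areas for the returned CVDA values:
EXEC CICS INQUIRE TERMINAL(termid)
     DEVICE(dev-cvda)
     ACCESSMETHOD(acc-cvda)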
If you have any other access method, for example BSAM, you need to adapt the
guidance given here accordingly.
The following are the values you might find there, and their interpretations:
TCTECIP command request in progress
TCTEDIP data request in progress
Either of these status values indicates that a VTAM request is in progress, and that
the VTAM RPL is active. A response is expected either from VTAM, or from the
terminal. You can find the address of the RPL from field TCTERPLA, unless the
request was for a RECEIVE on an APPC session, in which case you can find the
RPL address from field TCTERPLB.
If a VTAM request is not in progress, the next field of interest is in the VTAM
system area of the TCTTE. Find four bytes of VTAM exit IDs, starting at field
TCTEEIDA. If any are nonzero, the VTAM request has completed. Nonzero values
suggest that VTAM is not involved in the wait. You can find the meanings of the
values from the VTAM module ID codes list in the table below.
If you suspect that the problem is associated with VTAM, consider using either
CICS VTAM exit tracing or VTAM buffer tracing. Both of these techniques can give
you detailed information about the execution of VTAM requests. For guidance about
using the techniques, read the appropriate sections in Chapter 15, “Using traces in
problem determination,” on page 223.
If field TCTVAA1 points to a TCTTE on the active chain, check that the TCTTE of
the terminal your task is waiting for is included in the chain. You can find this out by
following the chain using the “next” pointer, field TCTEHACP of the TCTTE. If it
does not contain the address of the next TCTTE on the chain, it contains either of
these values:
X’FFFFFFFF’ this is the last TCTTE on the chain
X’00000000’ this TCTTE is not on the active chain
If you find a value of X’00000000’, report the problem to the IBM Support Center.
CICS system dumps contain an index to the VTAM terminal entries. It appears in
the terminal control (TCP) summary section of the dump.
Information about the status and attributes of the VTAM terminals appears in an
interpreted form at the end of the control block for each terminal entry. The
information shown depends on the attributes of the terminal.
The example in Figure 6 on page 64 shows the index followed by a terminal entry
with its interpreted status and attribute information.
The values that are given below for fields in the TCTTE are not the only
possibilities, but they show important things about the terminal status. If you find
any other values for these fields, look in the CICS Data Areas to find out what they
mean.
The following are the questions that need to be asked, and some values that could
provide the answers.
1. Is the terminal in service? Look at field TCTTETS of the TCTTE, to find the
terminal status. The values that indicate why a terminal was failing to respond
include:
TCTTESPO = 1 and TCTTESOS = 1 terminal out of service
TCTTESOS = 1 only terminal in error recovery
Look also at field TCTESEST, to find the session status with respect to
automatic transaction initiation (ATI) for the terminal. Some of the values you
might see are:
TCTESLGI = 0 CREATESESS(NO) in TYPETERM definition
TCTESLGI = 1 CREATESESS(YES) in TYPETERM definition
TCTESLGT = 1 recovering CREATESESS
If all three bits are set, so that the value of the byte is TCTENIS, the node is in
session.
You next need to see if the terminal is logging off, or if it has already been
logged off. The fields of interest are TCTEINND, TCTEINBD, and TCTEIPSA.
The values to look for are:
TCTENND = 1 the terminal is to be logged off
TCTENBD = 1 the terminal is logging off because of an error
TCTEPSA = 1 the session with the terminal ended abnormally
—look for any explanatory message on CSMT
If any of these bits are set, the terminal might not be able to respond to the
waiting task.
5. Should the terminal respond to the task? Field TCTEIPRA tells you this.
If the values you have found in all these fields suggest that the terminal status is
normal, the terminal is probably waiting for some activity to complete before it
responds. The type of investigation you need to do next depends on the type of
terminal involved in the wait. You should already have determined this, for example
by using the system programming command EXEC CICS INQUIRE TERMINAL
DEVICE.
Tools you can use for debugging terminal waits when VTAM is in use
Among your debugging tools, two are likely to be of particular use for investigating
terminal waits in a VTAM environment. They are:
v VTAM buffer trace. This is a feature of VTAM itself, and you need to see the
appropriate manual in the VTAM library for details of how to use it.
v CICS VTAM exit trace. This is a feature of CICS, and you can control it from the
CETR panel.
For a description of the use of these two types of tracing in CICS problem
determination, see Chapter 15, “Using traces in problem determination,” on page
223.
If a session has been acquired and it has not failed, your task is likely to be waiting
for some response from a task in the other region. This can apply to any of the
interregion or intersystem communication activities—function shipping,
asynchronous processing, transaction routing, distributed transaction processing, or
distributed program link. No matter which of these applies, it is most likely that the
other region is not responding because the related task there has been suspended.
You need to identify the related task in the remote region, and find out the resource
it is waiting on. When you have done that, see Table 9 on page 110 to find out
which part of this section to turn to next.
Waits on these resources occur when tasks make unconditional storage requests
(SUSPEND=YES) that cannot be satisfied. The type is CDSA, UDSA, SDSA, or
RDSA for storage requests below the 16MB line, and ECDSA, EUDSA, ESDSA, or
ERDSA for storage requests above the line. If conditional requests are made
(SUSPEND=NO), tasks are not suspended on these resources. Instead, an
exception response is returned if the request cannot be satisfied.
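For example, an application that prefers an exception response to a possible wait
can make its request conditional with the NOSUSPEND option, which corresponds
to SUSPEND=NO. This is a minimal sketch; the receiving fields are illustrative:
EXEC CICS GETMAIN
     SET(ptr)
     FLENGTH(len)
     NOSUSPEND
     RESP(resp)
If the storage is not available, the command returns immediately with resp set to
DFHRESP(NOSTG) instead of suspending the task.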
CICS automatically takes steps to relieve storage when it is under stress, for
example by releasing storage occupied by programs whose current use count is 0.
In addition, your task may be automatically purged if it has waited for storage longer
than the deadlock timeout parameter specified in the installed transaction definition.
Certain conditions prevent purging of a task, for example, a deadlock timeout value
of 0, or a specification of SPURGE(NO).
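Both controls are attributes of the installed transaction definition. A sketch of the
relevant RDO attributes follows; the transaction, group, and program names and
the values are illustrative:
CEDA DEFINE TRANSACTION(TRN1) GROUP(MYGROUP)
     PROGRAM(MYPROG)
     DTIMOUT(10)
     SPURGE(YES)
With these settings, a task that has waited on a storage request for longer than 10
seconds becomes eligible for automatic purge; DTIMOUT(NO) or SPURGE(NO)
prevents the purge.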
The two most likely explanations for extended waits on storage requests are:
If the suspended task has made a reasonable GETMAIN request, you next
need to see if the system is approaching SOS.
b. Is the storage close to being exhausted? To see if this could be the cause of
the wait, look at the DSA summary in the formatted dump.
This tells you the current free space in each DSA, both in bytes and as a
percentage of the total storage. It also tells you the size of the largest free
area, that is, the biggest piece of contiguous storage. (“Contiguous storage”
in this context means storage not fragmented by other records. It is
accepted that records too large to fit in a single CI can be split across two or
more CIs that are not necessarily contiguous.)
If the largest free area is smaller than the requested storage, this is likely to
be the reason why the task cannot obtain its requested storage.
If the amount of free space is unexpectedly small, look at the task subpool
summary. If a task has made an unusually large number of GETMAIN
requests, this could indicate that it is looping. A looping task might be
issuing GETMAIN requests repetitively, each for a reasonable amount of
storage, but collectively for a very large amount. If you find evidence for a
looping task, turn to Chapter 8, “Dealing with loops,” on page 141. If your
task has made a reasonable request and the system seems to have
sufficient free storage, you next need to see if fragmentation of free storage
is causing the GETMAIN request to fail.
c. Is fragmentation of free storage causing the GETMAIN request to fail? If the
DSA summary shows that the current free space is significantly greater than
the largest free area, it is likely that the DSA has become fragmented.
To see if this could be the cause of the wait, look at the temporary storage
summary in the formatted dump. If the current free space is very small, this is likely
to be the reason why the task cannot obtain its requested temporary storage. In
such a case, consider defining secondary extents for the data set.
Look also at the trace. If a task has made an unusually large number of WRITEQ
TS requests, it could be looping. A looping task might be issuing WRITEQ TS
requests repetitively, each for a reasonable amount of storage, but collectively for a
very large amount. If you find evidence for a looping task, turn to Chapter 8,
“Dealing with loops,” on page 141.
If your task has made a reasonable request and the system seems to have
sufficient unallocated temporary storage, you next need to see if fragmentation of
unallocated storage is causing the WRITEQ TS request to fail.
The following fields in the summary are of interest should your task be suspended
on resource type TSAUX:
Number of control intervals in data set:
Number of control intervals currently in use:
Available bytes per CI:
For control intervals of 4K, the available bytes per CI figure is 4032.
If your task is waiting on resource type TSAUX after attempting to write a record
that is smaller than or equal to the available bytes per CI figure (including its
record header, which is 28 bytes long), no control interval has the required amount
of contiguous space to satisfy the request.
If your task is attempting to write a record that is longer than the available bytes
per CI figure, CICS splits the record into sections of a length equal to this figure.
CICS then attempts to store each section in a completely empty control interval,
and any remaining part of the record in a control interval with the contiguous space
to accommodate it. If your task is waiting on resource type TSAUX after having
attempted to write a record longer than the available bytes per CI figure, either of
the following has occurred:
v There are not enough available completely empty control intervals to
accommodate all the sections
(CIs in data set - CIs in use) < (record length / available bytes per CI)
v No control interval has enough contiguous space to accommodate the remainder.
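As a worked example using the 4032-byte figure quoted above for 4K control
intervals: a task writing a 10000-byte record needs 10028 bytes once the 28-byte
record header is included. CICS splits this into two full 4032-byte sections plus a
remainder of 1964 bytes, so the write can complete only if at least two completely
empty control intervals are available and some control interval has at least 1964
bytes of contiguous free space.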
There are two ways in which you can discover the owner of the enqueue that the
task is waiting on:
v Use the CEMT INQUIRE UOWENQ command. For an example of how to use this
command to discover the owner of an enqueue, see “Resource type ENQUEUE”
on page 135. For definitive information about CEMT INQUIRE UOWENQ, see CICS
Supplied Transactions.
Do not use the EXEC CICS ENQ command for recoverable resources.
v Use the NQ section of a system dump if you have an enqueue wait for one of
the following resource names:
– JOURNALS
– KCADDR
– KCSTRING
– LOGSTRMS
To investigate enqueue waits on these resources, you can use the NQ section of a
system dump. (You can use a system dump to investigate enqueue waits on other
types of resource, but you might find the INQUIRE UOWENQ command more
convenient.)
CICS maintains a separate enqueue pool for each type of resource that can be
enqueued upon. To produce a summary of each enqueue pool, specify 1 on the NQ
dump formatting keyword (dump formatting keywords are described in “A summary
of system dump formatting keywords and levels” on page 281). Figure 7 on page 73
shows an example summary for the transient data enqueue (TDNQ) pool.
*NOTE: These values were reset at 15.44.39 (the last statistics interval collection)
OWNER / WAITER
NQEA Tran Tran Lifetime Hash
Enqueue Name Len Sta Address Id Num Local Uowid Uow Tsk Indx
------------------------------ --- --- -------- ---- ----- ---------------- --- --- ----
Q007TOQ 9 Act 052C4580 TDWR 00356 A8EBC70A53A4BC82 1 0 13
Q002FROMQ 9 Act 053D0880 TDRD 00435 A8EBD91A57D9B7D2 2 0 24
Waiter : 0540BBC0 TDRD 00467 A8EBDAC692BB7C10 0 1 24
Waiter : 0537CE70 TDDL 00512 A8EBDAE6FF0B56F2 1 0 24
Q007FROMQ 9 Act 0540CC80 ENQY 00217 A8EBB7FE23067C44 0 1 51
Waiter : 0538F320 ENQY 00265 A8EBBF0846C00FC0 0 1 51
Waiter : 0518C5C0 ENQY 00322 A8EBC393B90C66D8 0 1 51
Q002TOQ 9 Ret 0520B260 ---- ----- A8EBD82AFDA4CD82 1 0 53
Q009FROMQ 9 Act 0540A140 TDRD 00366 A8EBC84D3FF80250 1 0 62
Figure 7. Example system dump, showing summary information for the TDNQ enqueue pool
In the table at the bottom of Figure 7, each enqueue in the pool appears on a new
line. If the enqueue has waiters, they are displayed in order on subsequent lines.
Waiters are identified by the string Waiter. The meanings of the table headings are:
Enqueue Name
The string that has been enqueued upon. Normally, up to 30 characters of
the name are displayed; however, the summary reports for file control and
address enqueue pools format the enqueue name differently:
v File control uses six enqueue pools for its various types of lock. Each
enqueue contains the address of a control block (for example, DSNB,
FCTE) in its first four bytes. If the enqueue is a record lock, this is
followed by the record identifier.
Depending upon the type of the data set or file, the remainder of the
enqueue name could, for example, be an RRN in an RRDS, or a record
key in a KSDS data set. In the summary, the remainder of the enqueue
name is displayed in both hex and character formats. This takes up two
summary lines instead of one.
v The summary reports for the EXECADDR and KCADDR enqueue pools
display the enqueue name in hexadecimal format. This is because the
enqueue request was made on an address.
Len The length of the enqueue name.
Sta The state that the enqueue is held in. This field contains either:
Act The enqueue is held in active state—that is, other transactions are
allowed to wait on the enqueue.
Ret The enqueue is held in retained state—that is, other transactions
are not allowed to wait on the enqueue. Typically, this is because
the enqueue is owned by a shunted unit of work.
You can use the CEMT INQUIRE UOWENQ command to discover the owner of the
enqueue that the suspended task is waiting on, provided the owner is in the same
region. The command cannot detect owners in other regions. Note that, for
EXECADDR type waits, to display the address of the resource specified on the
EXEC CICS ENQ command you need to use the hexadecimal display option of CEMT.
For detailed information about the EXEC CICS ENQ command, see the CICS
Application Programming Reference.
The following is a list of possible causes, and suggestions to consider before you
carry out a detailed investigation. If these do not give you enough information to
solve the problem, read the more detailed information that follows.
Note: The task waiting on resource ICGTWAIT might not be the one that you
first set out to investigate. Any AID task scheduled to start at the same
terminal cannot do so until the current task has terminated.
v You have found that the task is waiting on resource type ICWAIT. This means
that the task issued an EXEC CICS DELAY command that has not yet completed.
1. Check that the interval or time specified on the request was what you
intended. If you believe that the expiry time of the request has passed, that
suggests a possible CICS error.
2. Consider the possibility that the task was the subject of a long DELAY that
was due to be canceled by some other task. If the second task failed before
it could cancel the delay, the first would not continue until the full interval
specified on DELAY had expired (see the sketch after this list).
v A task that issued EXEC CICS POST did not have its ECB posted when you
expected it to. Check to make sure the interval or time you specified was what
you intended.
v A task that issued EXEC CICS WAIT EVENT was not resumed when you thought
it should have been. Assuming the WAIT was issued sometime after a POST:
1. Check to make sure that the interval or time specified on the POST was what
you intended.
2. If it is, check to see whether the ECB being waited on was posted. If it has
been posted, that indicates a possible CICS error.
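The cross-task DELAY and CANCEL protocol mentioned in item 2 of the list above
looks like this in outline; the REQID and the five-minute interval are illustrative:
Task A:
EXEC CICS DELAY
     INTERVAL(000500)
     REQID('DLYREQ01')
Task B:
EXEC CICS CANCEL
     REQID('DLYREQ01')
If task B fails before it issues the CANCEL, task A remains suspended on resource
type ICWAIT until the full five minutes have elapsed.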
If none of the simple checks outlined here help you to solve the problem, read the
following information.
If the value of ICEXTOD is greater than CSATODTU, the ICE has not yet reached
the expiry time. The possible explanations are:
v Your task either did not make the DELAY request you expected, or the interval
specified was longer than intended. This could indicate a user error. Check the
code of the transaction issuing the request to make sure it is correct.
v Your task’s delay request was not executed correctly. This might indicate an error
within CICS code, or a corrupted control block.
If the value of ICEXTOD is equal to CSATODTU (very unlikely), you probably took
the system dump just as the interval was about to expire. In such a case, attempt to
recreate the problem, take another system dump, and compare the values again.
If the value of ICEXTOD is less than CSATODTU, the ICE has already expired. The
associated task should have resumed. This indicates that some area of storage
might have been corrupted, or there is an error within CICS code.
Using trace to find out why tasks are waiting on interval control
Before using trace to find out why your task is waiting on interval control, you need
to select an appropriate trace destination and set up the right tracing options.
By their nature, interval control waits can be long, so select auxiliary trace as the
destination, because you can specify large trace data sets for auxiliary trace.
However, the data sets do not have to be large enough to record tracing for the
whole interval specified when you first detected the problem. That is because the
error is likely to be reproducible when you specify a shorter interval, if it is
reproducible at all. For example, if the error was detected when an interval of 20
seconds was specified, try to reproduce it specifying an interval of 1 second.
As far as tracing selectivity is concerned, you need to capture level 2 trace entries
made by dispatcher domain, timer domain, and interval control program. The sort of
trace entries that you can expect in normal operation are shown in the examples
below. They show the flow of data and control following execution of the command
EXEC CICS DELAY INTERVAL(000003). A similar set of trace entries would be
obtained if TIME had been specified instead of INTERVAL, because TIME values
are converted to corresponding INTERVAL values before timer domain is called.
1. Use the CETR transaction to set up the following tracing options:
AP 00E1 EIP ENTRY DELAY REQ(0004) FIELD-A( 0034BD70 ....) FIELD-B(08001004 ....)
TASK-00163 KE_NUM-0007 TCB-009F3338 RET-8413F43E TIME-16:31:58.0431533750 INTERVAL-00.0000166250 =000602=
AP 00F3 ICP ENTRY WAIT REQ(2003) FIELD-A(0000003C ....) FIELD-B(00000000 ....)
TASK-00163 KE_NUM-0007 TCB-009F3338 RET-84760B88 TIME-16:31:58.0432681250 INTERVAL-00.0000370000 =000605=
1) Trace point AP F322 is used to report that system task APTIX has been
resumed. APTIX has the job of “waking up” your task on expiration of
the specified interval.
The task number for APTIX is, in this case, X’00006’, and this value is
shown on the trace entry.
2) Trace point DS 0004 is on entry to the dispatcher SUSPEND/RESUME
interface. This function is stated explicitly in the header. TASK-00006
indicates that the trace entry is for system task APTIX.
SUSPEND_TOKEN(01040034) shows that APTIX is requesting
dispatcher domain to resume the task that was suspended for the
specified interval. You will recall that a suspend token of X’01040034’
was given to your task when it was first suspended.
3) Trace point DS 0005 is on exit from the dispatcher SUSPEND/RESUME
interface.
The trace entry shows RESPONSE(OK), indicating that the task whose
suspend token was X’01040034’ has successfully been resumed.
However, note that this does not necessarily mean that the task has
started to run—it has only been made “dispatchable”. For example, it still
needs to wait for a TCB to become available.
e. Now look forward in the trace, and locate a trace entry made from trace
point AP 00F3 and showing your task number. This and the next entry
conclude the DELAY request for your task. They are shown in Figure 11 on
page 80.
Figure 11. Trace entries showing satisfactory conclusion of the DELAY request
When you look at your own trace table, be concerned principally with finding the
point at which the processing went wrong. Also, watch for bad parameters. If you do
find one, it could mean that an application has a coding error, or some field holding
a parameter has been overlaid, or an error has occurred in CICS code.
Checking your application code is the easiest option you have. If you find that it is
correct and you suspect a storage violation, see Chapter 11, “Dealing with storage
violations,” on page 191. If you think the error is in CICS code, contact the IBM
Support Center.
Table 6 lists the identifiable resource types associated with file control waits, with all
the possible reasons for waits, and whether they occur for files accessed in RLS
mode, non-RLS mode, or both.
Table 6. Resource types for file control waits
Resource   Description                                   RLS or non-RLS access mode
CFDTWAIT   The task is waiting for a request to the      N/A. The wait is caused by access
           CFDT server to complete.                      to a coupling facility data table.
CFDTPOOL   The task is waiting for a CFDT “maximum       N/A. The wait is caused by access
           requests” slot to become available.           to a coupling facility data table.
CFDTPOOL   The task is waiting for a CFDT “locking       N/A. The wait is caused by access
           request” slot to become available.            to a coupling facility data table.
ENQUEUE    The task is waiting for a lock on a file or   Non-RLS
           data table. See “Resource type ENQUEUE -
           waits for locks on files or data tables” on
           page 91.
The implications of waits on any of these file control resource types are dealt with in
the sections that follow.
Requests to the CFDT server are normally processed synchronously. Therefore, this
wait could indicate that:
v There is a high level of activity to the CFDT server
v The server is processing a request for a record that is longer than 4K bytes
v The task has issued a request for a record that is currently locked by another
task within the sysplex.
Waiting on this resource can occur only for a file defined to access a coupling
facility data table.
CICS places a limit on the number of requests that a region can have running
simultaneously in a coupling facility data tables server. This limit is known as the
“maxreqs” limit, and it avoids overloading the coupling facility. If the number of
requests currently running in the server for a CICS region has reached this limit, a
request waits until one of the other requests completes.
Waiting on this resource can occur only for a file defined to access a coupling
facility data table.
CICS places a limit on the number of locking requests (that is, requests that might
acquire record locks) that a region can have simultaneously running in a coupling
facility data table server. This limit is known as the locking request slot (LRS) limit,
and it prevents tasks that hold locks from blocking other coupling facility data table
accesses. If the number of locking requests currently running in the server for a
CICS region has reached the LRS limit, this request waits for one of the locking
requests to complete.
SMSVSAM is the server that CICS file control uses for any VSAM request it issues
in RLS mode. Cleanup after an SMSVSAM failure is in two stages.
1. Wait for VSAM to reject any file requests that were in-flight at the time of the
server failure. When all these active file requests have been rejected, CSFR
cleans up CICS state by issuing a CLOSE request against every file open in
RLS mode. When the last CLOSE request has completed, the first stage of
clean up is complete.
If CSFR is waiting for this first stage of cleanup to complete, it is waiting on
resource type FCACWAIT.
2. Wait for VSAM to reject any system requests issued against the SMSVSAM
control ACB, and then unregister the control ACB.
If CSFR is waiting for this second stage of cleanup to complete, it is waiting on
resource type FCCRSUSP.
You can specify the number of VSAM data buffers and VSAM index buffers in the
FILE resource definition using the DATABUFFERS and INDEXBUFFERS
parameters, respectively. Consider increasing the numbers of these buffers if you
find that tasks are frequently having to wait on this resource type.
If there are insufficient data and index buffers for a single task, the task is
suspended indefinitely. This might happen unexpectedly if you have a base cluster
and one or more paths in the upgrade set, and your application references only the
base. VSAM upgrades the paths whenever changes are made to the base. There
could then be too few buffers defined in the LSRPOOL for both base and paths.
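For example, you might raise the allocations in the file definition along these lines;
the file name, group, and values are illustrative:
CEDA ALTER FILE(ACCTFIL) GROUP(ACCTGRP)
     DATABUFFERS(50)
     INDEXBUFFERS(30)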
Waiting on this resource can occur only for files accessed in non-RLS mode.
Waits on this type of resource can occur only for files accessed in RLS mode.
New work is created for the task when CICS receives a quiesce request from its
SMSVSAM server through the CICS RLS quiesce exit program, DFHFCQX.
SMSVSAM drives the CICS RLS quiesce exit, which creates a control block for the
request and posts the CFQR task to notify it of the request’s arrival.
New work is created for this system task when a user task issues a quiesce request
(for example, issues an EXEC CICS SET DSNAME(...) QUIESCED WAIT
command). The user request is processed by CICS module DFHFCQI, which
creates a control block for the request and posts the CFQS task to notify it of the
request’s arrival.
This is a transient condition. CICS waits for all current update operations for this
VSAM data set to complete and retries the request twice. If the error continues after
the request is retried, CICS assumes that there is a genuine error and returns a
response of ILLOGIC to the application. Since ILLOGIC is a response to all
unexpected VSAM errors, CICS also returns the VSAM response and reason codes
(X'0890') or (X'089C') in bytes 2 and 3 of EIBRCODE. These identify the cause of
the ILLOGIC response.
Waiting on this resource can occur only for files accessed in non-RLS mode.
Only one task at a time waits on FCFSWAIT. If any other tasks attempt to change
the state of the same file, they are suspended on resource type ENQUEUE. See
“Task control waits” on page 129.
Waiting on this resource can occur for files accessed in both RLS and non-RLS
mode.
For example, VSAM uses MVS RESERVE volume locking, and it is likely that
another job currently holds the lock on the volume. See if there are any
messages on the MVS console to explain the error.
A wait on resource type FCIOWAIT occurs when the exclusive control conflict is
deferred internally by VSAM and not returned as an error condition to CICS. An
example of this is when a request against an LSR file is made for exclusive control
of a control interval (for example, by WRITE or READ UPDATE) and either this task
or another task already holds shared control of this control interval (for example, by
STARTBR).
Waiting on this resource can occur only for files accessed in non-RLS mode. File
control requests issued on open TCBs against LSR files are synchronous
requests to VSAM and therefore do not wait on resource type FCIOWAIT.
DFHFCIR is the module that rebuilds the recoverable file control environment, and
the file control initialization task waits on resource type FCIRWAIT.
Because this wait occurs during CICS initialization, you should not be able to see a
task waiting on this resource.
If tasks are being caused to wait unduly for strings, consider whether you can
increase the value of STRINGS, or change the programming logic so that strings
are released more quickly.
An example of programming logic that can hold onto strings (and other VSAM
resources) for too long is when a conversational transaction issues a STARTBR or
READNEXT and then enters a wait for terminal input without issuing an ENDBR.
The browse remains active until the ENDBR, and the VSAM strings and buffers are
retained over the terminal wait. Also, for an LSR file, the transaction continues to
hold shared control of the control interval and causes transactions that attempt to
update records in the same control interval to wait.
Similarly, transactions hold VSAM resources for too long if a READ UPDATE or
WRITE MASSINSERT is outstanding over a wait for terminal input.
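In outline, the pattern to avoid and its correction look like this; the file and data
area names are illustrative:
EXEC CICS STARTBR FILE('ACCTFIL') RIDFLD(rid-area)
EXEC CICS READNEXT FILE('ACCTFIL') INTO(rec-area) RIDFLD(rid-area)
EXEC CICS RECEIVE INTO(reply-area) LENGTH(reply-len)
Here the strings, buffers, and shared control of the control interval are all held
across the terminal wait. Issuing an ENDBR before the RECEIVE releases them:
EXEC CICS ENDBR FILE('ACCTFIL')
EXEC CICS RECEIVE INTO(reply-area) LENGTH(reply-len)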
Waiting on this resource can occur for files accessed in both RLS and non-RLS
mode.
For example, an EXEC CICS SET DSNAME(...) QUIESCED WAIT command. The
command generates an FCQSE containing the request and passes this into the
CFQS task. The CFQS task posts the user task when the request is completed.
The resource name gives the hexadecimal address of the FCQSE control block.
You do not see a task waiting on this resource type, because this wait occurs
during CICS initialization.
Waits on this resource type can occur for files accessed in both RLS and non-RLS
mode.
Waiting on this resource can occur for files accessed in both RLS and non-RLS
mode.
Resource type FCRDWAIT - wait for a drain of the RLS control ACB
If a task is waiting on resource type FCRDWAIT, it is waiting for completion of the
drain of the RLS control ACB following an SMSVSAM server failure.
DFHFCRD is the module that performs the drain. When the SMSVSAM server fails,
CICS must drain all RLS processing, which involves:
v Disabling further RLS access
v Preventing existing tasks from issuing further RLS requests after the server
becomes available again
v Closing all ACBs that are open in RLS mode.
The drain is carried out by the system task CSFR. This should normally complete
without problems, although it may take some time if there is a large number of files
to be closed. If a task is waiting on FCRDWAIT for a considerable length of time,
you should check whether the CSFR task is itself in a wait and therefore failing to
complete.
DFHFCRP is the module that performs most of file control initialization processing.
A dynamic RLS restart occurs when a restarted SMSVSAM server becomes
available following a failure of the previous server. If this occurs during CICS
initialization, dynamic RLS restart must wait for file control initialization to complete.
Because this wait occurs during CICS initialization, you should not be able to see a
task waiting on this resource.
A wait on resource type FCRVWAIT occurs when conflicts over shared or exclusive
locks are deferred internally by VSAM and not returned as an error condition to
CICS. Conflicts that can cause an FCRVWAIT wait are:
v A task issues a file control READ UPDATE request for a record, for which:
– Another task already holds an exclusive lock
– One or more tasks hold a shared lock.
v A task issues a file control READ request with CONSISTENT or REPEATABLE
integrity for a record, for which:
– Another task already holds an exclusive lock.
– Another task is waiting for an exclusive lock because one or more tasks may
already have a shared lock, or another task has an exclusive lock.
Waiting on this resource can occur only for files accessed in RLS mode.
Transaction IDs are retained by a task for the duration of a MASSINSERT session.
Waits on FCTISUSP should not be prolonged, and if your task stays suspended on
this resource type, it could indicate any of the following:
Waiting on this resource can occur only for files accessed in non-RLS mode.
An exclusive control wait on these resource types occurs in CICS, unlike the similar
wait on FCIOWAIT, which occurs in VSAM. See “Resource type FCIOWAIT - wait
for VSAM I/O (non-RLS)” on page 85.
FCXCPROT or FCXDPROT waits indicate that VSAM has detected an error in the
base cluster, AIX, or upgrade set. In these cases, it is not advisable to purge the
requests, because the data set can be left in an inconsistent state. Purge other tasks
involved in the wait to allow CICS to retry the VSAM requests for those tasks with
FCXCPROT and FCXDPROT waits.
Unlike the FCXCSUSP and FCXDSUSP types, tasks waiting with a resource type of
FCXCPROT or FCXDPROT are not purged if they are suspended for longer than
their DTIMOUT value.
If you find that exclusive control conflicts occur too often in your system, consider
changing the programming logic so that applications are less likely to have
exclusive control for long periods.
Waiting on this resource can occur only for files accessed in non-RLS mode.
The possibility that a task is deadlocked, waiting on itself or another task for release
of the control interval, is dealt with in the next section.
Similarly, a task could be made to wait on another task that has exclusive or shared
control of a VSAM control interval. If this second task was, itself, waiting for
exclusive control of a resource of which the first task has exclusive or shared
control, then both tasks would be deadlocked.
See CICS Messages and Codes for more information about these abend codes.
To resolve the problem, you must determine which program caused the potential
deadlock. Find out which programs are associated with the abending task, and
attempt to find the one in error. It is likely to be one that provides successive
browse and update facilities. When you have found the programs associated with
the task, turn to “How tasks can become deadlocked waiting for exclusive control”
for guidance about finding how the error might have occurred.
For the deadlock to occur, a transaction must first issue a VSAM READ SEQUENTIAL
request using EXEC CICS STARTBR. This is a VSAM shared control operation. It must
then issue some VSAM request requiring exclusive control of the CI without first
ending the shared control operation.
VSAM handles requests requiring exclusive control on a data set that is already
being used in shared control mode by queueing them internally. VSAM returns
control to CICS, but transactions waiting for exclusive control remain suspended.
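The deadlocking sequence begins with a browse, which acquires shared control of
the control interval. A reconstruction follows, with illustrative file and data area
names:
EXEC CICS STARTBR
     FILE(myfile)
     RIDFLD(rid-area)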
This causes no problems. The next command at first acquires shared control while
the record is read into input-area. When an attempt is subsequently made to get
exclusive control, deadlock occurs because the task that wants exclusive control is
also the task that is preventing it from being acquired.
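That next command is a read for update against a record in the same control
interval, along these lines:
EXEC CICS READ
     FILE(myfile)
     INTO(input-area)
     RIDFLD(rid-area)
     UPDATE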
The following sequence of commands would not cause deadlock to occur, because
the transaction relinquishes its shared control of the CI by ending the browse before
attempting to get exclusive control of it.
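A reconstruction of that safe sequence, again with illustrative names:
EXEC CICS STARTBR
     FILE(myfile)
     RIDFLD(rid-area)
EXEC CICS READNEXT
     FILE(myfile)
     INTO(input-area)
     RIDFLD(rid-area)
EXEC CICS ENDBR
     FILE(myfile)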
The next command initially causes shared control to be acquired. The record is
read into input-area, and then exclusive control is acquired in place of shared
control.
EXEC CICS READ
FILE(myfile)
INTO(input-area)
RIDFLD(rid-area)
UPDATE
The transaction now resumes. Exclusive control is relinquished following the next
REWRITE or UNLOCK command on file myfile.
Table 7 shows the type of lock that each of the “FC” resource names represents.
Table 7. Resource/pool names and lock types
Resource or pool name   Lock type
FCDSRECD                VSAM or CICS-maintained data table record
FCFLRECD                BDAM or user-maintained data table record
FCDSRNGE                KSDS key range
FCDSLDMD                VSAM load mode
FCDSESWR                ESDS write
FCFLUMTL                User-maintained data table load
If transactions are commonly made to wait for this reason, you should review the
programming logic of your applications to see if the record-locking time can be
minimized.
Note that CICS only locks a record for update. Other transactions are allowed to
read the record, and this presents a potential read integrity exposure. Thus, a
transaction might read a record after an update has been made, but before the
updating transaction has reached its syncpoint. If the reading transaction takes
action based on the value of the record, the action is incorrect if the record has to
be backed out.
There is some more information about read integrity in Chapter 10, “Dealing with
incorrect output,” on page 167.
Neither BDAM nor user-maintained data tables use the “control interval” concept.
When a task reads a record for update, the record is locked so that concurrent
changes cannot be made by two transactions. If the file or data table is recoverable,
the lock is released at the end of the current unit of work. If the file or data table is
not recoverable, the lock is released on completion of the REWRITE or UNLOCK
operation.
If a second task attempts to update the same record while the first has the lock, it is
suspended on resource type ENQUEUE.
If another transaction tries to write a record in the locked key range, or delete the
record at the end of the range, it is suspended until the range lock is released. The
lock is released when the transaction holding it issues a syncpoint, ends the
mass-insert operation by issuing an UNLOCK, or changes to a different range.
When a VSAM data set is opened in load mode, only one request can be issued at
a time. If a transaction issues a WRITE request while another transaction’s WRITE
is in progress, it is suspended until the first WRITE completes.
For integrity reasons, WRITE requests to recoverable ESDS data sets must be
serialized. When a transaction issues such a request, it holds the ESDS write lock
for the time it takes to log the request, obtain a record lock, and write the data set
record. If another transaction issues a WRITE request during this period, it is
suspended until the ESDS lock is released. The lock is normally released when the
WRITE completes, but may be held until syncpoint if the WRITE fails.
When loading a user-maintained data table from its source data set, this lock is
used to serialize loading with application READ requests.
Note that the loader does not suspend a task while a program is loaded if it is the
first one to ask for that program.
If the requested program is not loaded quickly, the reasons for the wait need to be
investigated. The possible reasons for the wait, and the ways you should
investigate them are:
1. The system could be short on storage (SOS), so only system tasks can be
dispatched. To check if the system is short on storage:
a. Use the CEMT transaction, submitting one or more of the following
commands: CEMT I SYS SOSABOVEBAR, CEMT I SYS SOSABOVELINE, or CEMT I
SYS SOSBELOWLINE.
b. To see if SOS has been reached too often, examine the job log, check the
run statistics, or submit CEMT I DSAS.
If SOS has been reached too often, take steps to relieve the storage
constraints. For guidance about this, see Identifying storage stress in the CICS
Performance Guide.
2. Check for messages that might indicate that there is an I/O error on a library. If
you find a message, investigate the reason why the I/O error occurred.
3. There could be an error within MVS. Has there been any sort of message to
indicate this? If so, it is likely that you need to refer the problem to the IBM
Support Center.
A user task cannot explicitly acquire a lock on a resource, but many of the CICS
modules that run on behalf of user tasks do lock resources. If this is a genuine wait,
and the system is not just running slowly, this could indicate a CICS system error.
If you have no evidence of a hardware fault, contact the IBM Support Center and
report the problem to them.
A task may fail to run if the system has reached the maximum number of tasks
allowed, or if the task is defined in a transaction class that is at its MAXACTIVE
limit.
If a task is waiting for entry into the MXT set of transactions, the resource type is
MXT, and the resource name is XM_HELD. If a task is waiting for entry into the
MAXACTIVE set of transactions for a TCLASS, the resource type is TCLASS, and
the resource name is the name of the TCLASS that the task is waiting for.
The limit that has been reached, MXT, is given explicitly as the resource name for
the wait. If this type of wait occurs too often, consider changing the MXT limit for
your CICS system.
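You set the limit with the MXT system initialization parameter, and you can adjust it
while CICS is running; for example (the value is illustrative):
MXT=120                          (system initialization parameter)
CEMT SET SYSTEM MAXTASKS(120)    (online adjustment)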
Transaction summary
The transaction summary lists all transactions (user and system) that currently exist.
The transactions are listed in order of task number and the summary contains two
lines per transaction.
Example
==XM: TRANSACTION SUMMARY
CSNE 00031 10106100 C Yes ACT 00000003 None n/a 10164C00 00000000 00000000 00000000 1016C058 11542054
10A34B40 01000000 1017E048 00000000 00000000 10164C00 00000000
IC06 10056 10E2B200 T No ACT 089601C7 Terminal 10E167A0 1124F600 00000000 00000000 10114023 1016C9A0 11543610
10AC9300 00000000 00000000 1017E7E0 00000000 10E0F6A0 1124F600 00000000
IC12 10058 10E34C00 SD No ACT 050601AD None n/a 001DE600 00000000 00000000 10114023 1016C9F8 11545114
10AC93C0 00000000 1017E828 00000000 10E31400 001DE600 00000000
TA03 93738 10E0E000 T No ACT 088211E3 Terminal 10ED9000 0024B000 00000000 00000000 10114023 1016C738 115437B0
10AD3D40 00000000 00000000 1017E090 00000000 10117D60 0024B000 00000000
TA03 93920 10AFF200 T No TCL 00000000 Terminal 11214BD0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3D40 DFHTCL03 00000000 00000000 00000000 00000000 10117680 00000000 00000000
TA03 93960 10E2D200 T No TCL 00000000 Terminal 10E573F0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3D40 DFHTCL03 00000000 00000000 00000000 00000000 10E0F6C0 00000000 00000000
TA03 93967 10AFEA00 T No TCL 00000000 Terminal 10ECCBD0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3D40 DFHTCL03 00000000 00000000 00000000 00000000 10117540 00000000 00000000
TA03 94001 10E34800 T No ACT 00000000 Terminal 10E2C3F0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3D40 DF(AKCC) 00000000 00000000 00000000 00000000 10E31120 00000000 00000000
TA02 95140 10E2D300 T No ACT 0386150D Terminal 10E2C5E8 00057000 00000000 00000000 10114023 1016C790 11544754
10AD3C80 00000000 00000000 1017E510 00000000 10E0F320 00057000 00000000
TA02 95175 10E12C00 T No TCL 00000000 Terminal 10E937E0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3C80 DFHTCL02 00000000 00000000 00000000 00000000 10E0F100 00000000 00000000
TA02 95187 10E0B000 T No TCL 00000000 Terminal 10EA95E8 00000000 00000000 00000000 10114023 00000000 00000000
10AD3C80 DFHTCL02 00000000 00000000 00000000 00000000 10117800 00000000 00000000
TA02 95205 10E2D600 T No MXT 00000000 Terminal 10E837E0 00000000 00000000 00000000 10114023 00000000 00000000
10AD3C80 DF(AKCC) 00000000 00000000 00000000 00000000 10E0F780 00000000 00000000
TA04 96637 10E33000 T No ACT 060408E7 Terminal 10E05BD0 00057600 00000000 00000000 10114023 1016C7E8 115457C8
10AD3E00 00000000 00000000 1017E558 00000000 10E31040 00057600 00000000
F121 99305 10E2D800 T No ACT 020C1439 Terminal 10EA93F0 00060000 00000000 00000000 10114023 1016C898 115423FC
10AD3BC0 AB(AFCY) 00000000 00000000 1017E708 00000000 10E0F920 00060000 00000000
TS12 99344 10AFED00 T No MXT 00000000 Terminal 10E499D8 00000000 00000000 00000000 10114023 00000000 00000000
10AD6B40 00000000 00000000 00000000 00000000 101178C0 00000000 00000000
MXT summary
The MXT summary indicates whether CICS is currently at the maximum number of
tasks, showing the current number of queued and active transactions.
* NOTE: these values were reset at 18:00:00 (the last statistics interval collection)
A transaction class is at its MAXACTIVE limit if its ‘current active’ total is greater
than or equal to its ‘max active’ setting. If a transaction class is at its MAXACTIVE
limit, a number of transactions could be queueing in that transaction class. The
transaction ID and number of each queued transaction are listed with its transaction
class (for example, transaction classes DFHTCL01, DFHTCL02, and DFHTCL03 in
Figure 13 on page 100).
*** Note that the ’Total Attaches’ figures were reset at 18:00:00 (the last statistics interval collection)
The suspended task is never resumed, and holds its MXT slot until CICS is
terminated. You must cancel CICS to remove this task, because you cannot
quiesce the system. You cannot purge or forcepurge the task.
Enqueue deadlocks between tasks occur when each of two transactions (say, A and
B) needs an exclusive lock on a resource that the other holds already. Transaction
A waits for transaction B to release the resource. However, if transaction B cannot
release the resource because it, in turn, is enqueued on a resource held by
transaction A, the two transactions are deadlocked. Further transactions may then
queue, enqueued on the resources held by transactions A and B.
Use the following example to help you diagnose deadlocks. The scenario is that a
user of task 32 complains that a terminal is locked and is unable to enter data.
Task 32 is waiting on an enqueue, Hty(ENQUEUE). You can also see that the task
is waiting for a lock on a data set record, Hva(FCDSRECD). At this stage, you
cannot tell which task (if any) has control of this resource.
2. Use the command CEMT INQUIRE UOWENQ at the same terminal. This command
displays information about the owners of all enqueues held. More importantly,
for deadlock diagnosis purposes, it displays information about the tasks waiting
for the enqueues. A screen similar to the following might be displayed:
INQUIRE UOWENQ
STATUS: RESULTS
Uow(AA8E9505458D8C01) Tra(CEMT) Tas(0000025) Act Exe Own
Uow(AA8E950545CAD227) Tra(TDUP) Tas(0000028) Act Tdq Own
Uow(AA8E950545DAC004) Tra(FUPD) Tas(0000032) Act Dat Own
Uow(AA8E950545DBC357) Tra(FUPD) Tas(0000035) Act Dat Wai
Uow(AA8E97FE9592F403) Tra(FUP2) Tas(0000039) Act Dat Wai
Uow(AA8E9505458D8C01) Tra(TSUP) Tas(0000034) Ret Tsq Own
Uow(AA8E97FE9592F403) Tra(FUP2) Tas(0000039) Act Dat Own
Uow(AA8E950545DAC004) Tra(FUPD) Tas(0000032) Act Dat Wai
Uow(AA8E97FE95DC1B9A) Tra(FUPD) Tas(0000042) Act Dat Own
You can see all the enqueue owners and waiters on the same region on this
display. Tasks waiting for an enqueue are displayed immediately after the task
that owns the enqueue. Owners and waiters on other regions are not displayed.
3. If your system is busy, you can clarify the display by displaying only those
resources that the task you are interested in owns and waits for. This is called
filtering. You add a filter to the end of the command as follows: CEMT INQUIRE
UOWENQ TASK(32).
INQUIRE UOWENQ TASK(32)
STATUS: RESULTS
Uow(AA8E950545DAC004) Tra(FUPD) Tas(0000032) Act Dat Own
Uow(AA8E950545DAC004) Tra(FUPD) Tas(0000032) Act Dat Wai
You can now see that task 32 owns one enqueue but is also waiting for another.
This display shows one line of information per item, listing:
v UOW identifier
v Transaction identifier
v Task identifier
v Enqueue state (active, or retained)
v Enqueue type
v Relation (whether owner of the enqueue or waiter).
This shows you that another task, task 39, owns the enqueue that task 32 is
waiting on.
b. Find out why task 39 is holding this enqueue, using the CEMT command
again as a filter for task 39. Enter CEMT INQUIRE UOWENQ TASK(39).
This shows you that task 39 is waiting for the enqueue on record “SMITH” in
the ACCT.CICS650.ACCTFILE data set. This is the enqueue that task 32
owns.
You can now see that the deadlock is between tasks 32 and 39.
7. To confirm that your diagnosis is correct, filter by the RESOURCE and
QUALIFIER of this enqueue. This also shows that task 35 waits on the
enqueue owned by task 32.
INQUIRE UOWENQ RESOURCE(ACCT.CICS650.ACCTFILE) QUALIFIER(SMITH)
STATUS: RESULTS
Uow(AA8E950545DAC004) Tra(FUPD) Tas(0000032) Act Dat Own
Uow(AA8E950545DBC357) Tra(FUPD) Tas(0000035) Act Dat Wai
Uow(AA8E97FE9592F403) Tra(FUP2) Tas(0000039) Act Dat Wai
You can also use the EXEC CICS INQUIRE UOWENQ command or the EXEC CICS
INQUIRE ENQ command in your applications. These return all the information that is
available under CEMT INQUIRE UOWENQ. If you wish to automate deadlock detection
and resolution, these commands are of great benefit.
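A minimal sketch of such a browse, with illustrative variable names; each item
returned describes one owner or waiter, as in the CEMT display:
EXEC CICS INQUIRE UOWENQ START
(repeat the following until resp is no longer DFHRESP(NORMAL))
EXEC CICS INQUIRE UOWENQ
     UOW(uowid)
     TRANSID(tranid)
     TASKID(tasknum)
     STATE(state-cvda)
     RELATION(rel-cvda)
     RESOURCE(resname)
     QUALIFIER(qualname)
     NEXT
     RESP(resp)
EXEC CICS INQUIRE UOWENQ END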
Note that CEMT INQUIRE UOWENQ can be used only for files accessed in non-RLS
mode, because files accessed in RLS mode have their locks managed by VSAM,
not by CICS. Deadlock and timeout detection for files accessed in RLS mode is
also performed by VSAM.
When a rogue task that holds enqueues hangs or loops, but is not subject to
runaway detection, the entire region can halt. CPSM helps you determine which
task to purge to free up the system. CPSM can issue an alert when a task's
suspend time is too long. When this occurs, you need to find the task causing the
problem. To do this:
1. Display the suspended task's details and determine what the suspend reason is.
If the suspend reason is ENQUEUE, you have to find out which enqueue is
being waited upon by this task.
2. Display the enqueues held, and the one this task is waiting for, using the
UOWENQ display (browse for this UOW ID). From this display you can get
the enqueue name that this task is waiting for.
3. Display the details of this enqueue. You are now in a position to analyze and
determine the cause of the problem.
On cold start, loading the GRPLIST definitions from the CSD data set can take
several minutes. For large systems, the delay could be 20 minutes or more while
this takes place. You can tell if this stage of initialization has been reached because
you get this console message:
DFHSI1511 INSTALLING GROUP LIST xxxxxxxx
On warm start, there may be a considerable delay while resource definitions are
being created from the global catalog.
You can find out if this has happened by taking an SDUMP of the CICS region.
Format the dump using the keywords KE and DS, to get the kernel and dispatcher
task summaries.
Consider, too, whether any first- or second-stage program list table (PLT) program
that you have written could be in error. If such a program does not follow the strict
protocols that are required, it can cause CICS to stall. For programming information
about PLT programs, see the CICS Customization Guide.
Look first on your MVS console for any messages. Look particularly for messages
indicating that operator intervention is needed, for example to change a tape
volume. The action could be required on behalf of a CICS task, or it could be for
any other program that CICS interfaces with.
If the CPU usage is low, CICS is doing very little work. Some of the possible
reasons are:
v The system definition parameters are not suitable for your system.
v The system is short on storage, and new tasks cannot be started. This situation
is unlikely to last for long unless old tasks cannot, for some reason, be purged.
v The system is at one of the MXT or transaction class limits, and no new tasks
can be attached. In such a case, it is likely that existing tasks are deadlocked,
and for some reason they cannot be timed out.
v There is an exclusive control conflict for a volume.
v There is a problem with the communications access method.
v There is a CICS system error.
The information that follows explains how to find out whether any of these apply to
your system. For some of the investigations, you need to see a
system dump of the CICS region. If you do not already have one, you can request
one using the MVS console. Make sure that CICS is apparently stalled at the time
you take the dump, because otherwise it will not provide the evidence you need.
Format the dump using the formatting keywords KE and XM, to get the storage
areas for the kernel and the transaction manager.
For more details about the choice of these and other system definition parameters,
see Improving the performance of a CICS system in the CICS Performance
Guide.
Look first at the transaction manager summary in the formatted system dump.
Investigate the tasks accepted into the MXT set of tasks to see if they are causing
the problem. XM dump formatting formats the state of MXT and provides a
summary of the TCLASSes and of the transactions waiting for acceptance into each
TCLASS.
If you find that you cannot use the CEMT transaction, it is likely that the system is
already in the second stage of termination. CEMT cannot be used beyond the first
stage of termination.
Note: Even if CEMT is not included in the transaction list table (XLT), you can still
use it in the first stage of termination.
The action to take next depends on whether you can use the CEMT transaction,
and if so, whether or not there are current user tasks.
v If you can use the CEMT transaction:
The major dispatcher functions associated with the suspension and subsequent
resumption of tasks are described in detail in the CICS Diagnosis Reference. You
can use trace to see the dispatcher functions that are requested, and the values of
parameters that are supplied. See “Investigating waits using trace” on page 51.
Some of the dispatcher functions are available to users through the exit
programming interface (XPI). If you have any applications using these XPI
functions, make sure that they follow the rules and protocols exactly. For
programming information about the XPI, see the CICS Customization Guide.
If you want guidance about using online or offline techniques to investigate waits,
see “Techniques for investigating waits” on page 50.
If you already know the identity of the resource that a task is waiting for, but are not
sure what functional area of CICS is involved, see Table 9 on page 110. It shows
you where to look for further guidance.
Throughout this section, the terms “suspension” and “resumption” and “suspended”
and “resumed” are used generically. Except where otherwise indicated, they refer to
any of the SUSPEND/RESUME and WAIT/POST processes by which tasks can be
made to stop running and then be made ready to run again.
The remaining resources are used only by CICS system tasks. If you have
evidence that a system task is waiting on such a resource, and it is adversely
affecting the operation of your system, you probably need to contact your IBM
Support Center. Before doing so, however, read “CICS system task waits” on page
139.
Table 9. Resources on which a suspended task might be waiting
Resource type | Purge status | Resource name | Suspending module | DSSR call and WLM wait type | Task | Where to look next
(none) | - | (none) | DFHDUIO | WAIT_MVS IO | System only | “CICS system task waits” on page 139
(none) | - | (none) | DFHRMSL7 | WAIT_MVS TIMER | System only | “CICS system task waits” on page 139
(none) | - | (none) | DFHZNAC | SUSPEND (see note 1 on page 122) | System only | “CICS system task waits” on page 139
(none) | - | DLCNTRL | DFHDBCT | WAIT_MVS (see note 1 on page 122) | System only | “CICS system task waits” on page 139
(none) | - | DLCONECT | DFHDBCON | WAIT_MVS OTHER_PRODUCT | System only | “CICS system task waits” on page 139
(none) | - | DMWTQUEU | DFHDMWQ | SUSPEND MISC | System only | “CICS system task waits” on page 139
(none) | No, No | LMQUEUE | DFHLMLM | SUSPEND LOCK | User | “Investigating lock manager waits” on page 94
ADAPTER | No, No | FEPI_RQE | DFHSZATR | WAIT_MVS MISC | User | See note 2 on page 123
ALLOCATE | Yes, Yes | TCTTETI value | DFHALP | SUSPEND (see note 3 on page 123) | User | “Interregion and intersystem communication waits” on page 133
ALP_TERM | - | (none) | DFHALRC | WAIT_OLDC MISC | System only | “Recovery manager waits” on page 140
Any_MBCB | No, No | transient data queue name | DFHTDB, DFHTDRM | SUSPEND IO | User | “Transient data waits” on page 133
Any_MRCB | No, No | transient data queue name | DFHTDB, DFHTDRM | SUSPEND IO | User | “Transient data waits” on page 133
AP_INIT | - | ECBTCP | DFHAPSIP | WAIT_OLDC MISC | System only | “CICS system task waits” on page 139
AP_INIT | - | SIPDMTEC | DFHAPSIP | WAIT_MVS MISC | System only | “CICS system task waits” on page 139
FCQUIES | Yes, Yes | fcqse_ptr (hexadecimal) | DFHFCQI | SUSPEND (see note 1 on page 122) | User | “Investigating file control waits” on page 80
FCRAWAIT | Yes, Yes | FC_FILE | DFHEIFC | WAIT_OLDC MISC | User | “Investigating file control waits” on page 80
FCRBWAIT | Yes, Yes | file ID | DFHFCFR | WAIT_OLDC IO | User | “Investigating file control waits” on page 80
FCRDWAIT | No, No | *CTLACB* | DFHFCRC, DFHFCRR | WAIT_OLDC MISC | System or user | “Investigating file control waits” on page 80
FCRPWAIT | - | FC-START | DFHFCRR | WAIT_OLDC MISC | System only | “Investigating file control waits” on page 80
FCRRWAIT | - | *DYRRE* | DFHFCRR | WAIT_OLDC MISC | System only | “Investigating file control waits” on page 80
FCRVWAIT | No, No | file ID | DFHFCRV | WAIT_MVS OTHER_PRODUCT | User | “Investigating file control waits” on page 80
FCSRSUSP | Yes, Yes | file ID | DFHFCVR | SUSPEND IO | User | “Investigating file control waits” on page 80
FCTISUSP | Yes, Yes | file ID | DFHFCVR | SUSPEND IO | User | “Investigating file control waits” on page 80
FCXCSUSP and FCXDSUSP | Yes, Yes | file ID | DFHFCVS | WAIT_OLDC IO | User | “Investigating file control waits” on page 80
FCXCPROP and FCXDPROP | No, No | file ID | DFHFCVS | WAIT_OLDC IO | User | “Investigating file control waits” on page 80
FEPRM | No, No | SZRDP | DFHSZRDP | WAIT_MVS MISC | CSZI | See note 2 on page 123
FOREVER | No, No | DFHXMTA | DFHXMTA | WAIT_MVS MISC | User | “A user task is waiting on resource type FOREVER” on page 100
ICEXPIRY | - | DFHAPTIX | DFHAPTIX | SUSPEND TIMER | System only | “CICS system task waits” on page 139
ICGTWAIT | Yes, Yes | terminal ID | DFHICP | SUSPEND MISC | User | “Investigating interval control waits” on page 74
Note:
1. The MVS workload manager monitoring environment is set to
STATE=IDLE when either:
Dispatcher waits
There are five reasons why the CICS dispatcher might cause tasks to wait, and the
resource names or resource types associated with these waits are:
v JVM_POOL
v OPENPOOL
v OPEN_DEL
v DSTSKDEF
v SOSMVS
When a task first needs a J8 or J9 mode open TCB, the dispatcher domain
attempts to find a free TCB from the JVM pool. If there is not a free J8 or J9 mode
TCB, and the number of open TCBs in the JVM pool is less than MAXJVMTCBS,
CICS attaches a new TCB, and allocates this to the requesting task.
However, if the number of J8 and J9 TCBs in the pool is at the limit set by
MAXJVMTCBS, dispatcher places the requesting task onto a queue and the task is
suspended (using suspend token AWAITING_OPEN_TCB_TOKEN in the DS task
block). When an open TCB becomes free, or the MAXJVMTCBS limit is raised, the
task at the front of the queue is resumed, and the open TCB allocation process is
retried.
When a task first needs an L8 mode open TCB, the dispatcher domain attempts to
find a free TCB of this mode with the correct subspace attributes. If there is not a
free L8 mode TCB associated with a matching subspace, CICS:
v Attaches a new L8 mode TCB of the required subspace if the number of open
TCBs in the L8 open TCB pool is less than MAXOPENTCBS, and allocates the
new TCB to the requesting task.
v Detaches a free open L8 mode TCB associated with a different subspace (if
there is one available and MAXOPENTCBS limit has been reached), attaches a
new L8 mode TCB, and allocates this new TCB to the requesting task. This
process is referred to as TCB stealing: deleting a free TCB of one type in order
to attach one of a different type.
However, if neither of these options is available, dispatcher places the requesting
task onto a queue and the task is suspended (using suspend token
AWAITING_OPENPOOL_TOKEN in the DS task block). When an open TCB
becomes free, or the MAXOPENTCBS limit is raised, the task at the front of the
queue is resumed, and the open TCB allocation process is retried.
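If tasks queue here persistently, you can raise the relevant pool limit dynamically.
For example (the values are illustrative):
CEMT SET DISPATCHER MAXJVMTCBS(30)
CEMT SET DISPATCHER MAXOPENTCBS(60)
Raising a limit resumes the task at the front of the queue, as described above.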
Your task needs an open TCB, but no suitable TCB is available, and a new TCB
cannot be attached because the system is constrained by the MAXOPENTCBS
limit. In this situation, CICS selects a currently idle TCB for termination, to allow the
task to attach a TCB of the required type. However, the attach cannot proceed until
the deleted TCB's termination is complete, otherwise the number of open TCBs in
the L8 pool would temporarily exceed MAXOPENTCBS.
A task waiting on the resource type DSTSKDEF is not suspended. Task attach has
added the new task to the dispatcher chain and it is waiting for first dispatch. The
task could be waiting for a dump to complete, for example.
The number of open TCBs (J8 or J9 mode TCB) in the JVM pool is constrained by
the MAXJVMTCBS system initialization parameter. If you set a MAXJVMTCBS limit
that is too high, CICS might attempt to create too many JVMs for the available MVS
storage, resulting in an MVS storage constraint.
CICS has a storage monitor for MVS storage, which notifies it when MVS storage is
constrained or severely constrained, so that it can take short-term action to reduce
the number of JVMs in the JVM pool. As JVMs make requests for MVS storage, the
storage monitor checks whether the availability of MVS storage has dropped below
a pre-set threshold of 40MB, and notifies CICS when this is the case. The storage
monitor also notifies CICS if the availability of MVS storage has become so low that
MVS storage requests can only be satisfied from a pre-set MVS storage cushion of
20MB.
When the storage cushion is breached and so MVS storage is severely constrained,
CICS temporarily prevents the creation of new JVMs for incoming requests, and
behaves as though the MAXJVMTCBS limit has been reached and the JVM pool is
full. In this situation, if the storage monitor is still receiving requests from CICS to
create JVMs, it queues any such requests that cannot obtain sufficient MVS
storage. These requests are suspended with a resource name of SOSMVS.
The CICS task has an open TCB but is waiting for a DB2 connection to become
available to use with the open TCB. This indicates that the TCBLIMIT value has
been reached, which limits the number of open TCBs (and hence connections) that
can be used to access DB2. The CICS task must wait for a connection to be freed
by another TCB running on behalf of another CICS task, after which it may use the
freed DB2 connection with its own TCB.
You can increase the number of open TCBs permitted to access DB2 with a SET
DB2CONN TCBLIMIT command. If you increase the TCBLIMIT value, CICS posts
tasks to retry acquisition of a DB2 connection.
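For example, to raise the limit to 24 open TCBs for DB2 access (the value is
illustrative):
CEMT SET DB2CONN TCBLIMIT(24)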
The task is waiting for a thread to become available. The resource name details the
DB2ENTRY or pool for which there is a shortage of threads.
You cannot purge the task when it is in this state. Message DFHAP0604 is issued
at the console if an attempt to forcepurge the task is made. Forcepurge processing
is deferred until a thread is acquired.
You can increase the number of threads available for the DB2ENTRY with a SET
DB2ENTRY(name) THREADLIMIT(nn) command. You can increase the number of
threads available for the pool with a SET DB2CONN THREADLIMIT(nn) command.
If you increase the THREADLIMIT value, CICS posts tasks to retry acquisition of a
thread.
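For example (the entry name and values are illustrative):
CEMT SET DB2ENTRY(PAYR) THREADLIMIT(10)
CEMT SET DB2CONN THREADLIMIT(20)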
DFHD2IN1 (CICS DB2 initialization program) issues the wait for DFHD2IN2 to
complete.
A SET DB2CONN NOTCONNECTED command has been issued with the WAIT or FORCE
option. DFHD2TM waits for the count of tasks using DB2 to reach zero.
A SET DB2ENTRY DISABLED command has been issued with the WAIT or
FORCE option. DFHD2TM waits for the count of tasks using the DB2ENTRY to
reach zero.
DBCTL waits
Read this section if you have any of the following problems:
v You have attempted to connect to DBCTL using the CICS-supplied transaction
CDBC, but the connection process has failed to complete.
v You have a user task in a wait state, and you have found that it is waiting on
resource type DBCTL, with resource name DLSUSPND.
v You have attempted to disconnect from DBCTL using the CICS-supplied
transaction CDBC, but the disconnection process has failed to complete.
In phase 2, IMS processes the request asynchronously, and returns to CICS when
connection is complete. Until the connection is complete, you see this status
message displayed whenever you inquire with CDBI:
DFHDB8292I DBCTL connect phase 2 in progress.
If this phase fails to complete, the failure is associated with IMS. See the IMS
Diagnosis Guide and Reference manual for guidance about debugging the problem.
If disconnection fails to complete, you can inquire on CDBT using, for example,
CEMT INQ TASK to see how far disconnection has progressed. You will probably
find that CDBT is waiting on resource type DBCTL and resource name DLSUSPND,
in which case the request is being processed by DBCTL.
v If CDBT is waiting on DBCTL, what you do next depends on whether you have
requested “orderly” or “immediate” disconnection.
– If you have requested “orderly” disconnection, it is likely that DBCTL is waiting
for conversational tasks to finish. You can override an “orderly” disconnection
by requesting “immediate” disconnection, in which case the process should
end at once.
– If you have requested “immediate” disconnection, and this does not happen,
there is an unexpected wait within IMS. See the IMS Diagnosis Guide and
Reference for guidance about investigating the problem.
v If CDBT is not waiting on DBCTL, this indicates a problem with CICS code.
Contact the IBM Support Center for further assistance.
EDF waits
A user task is made to wait on resource type EDF and resource name DBUGUSER
when, under the EDF session, CICS has control for EDF processing.
The journal name, given as the resource name, is the last element (qualifier) of the
MVS log stream name; for example, a log stream name ending in DFHJ02
corresponds to journal name DFHJ02.
If the task is writing to a journal on an SMF log, the journal name is the name of the
journal.
The task is the first task to request that the currently active log buffer be flushed.
The task waits for 30 milliseconds to allow other tasks to append more records to
the buffer.
The task is waiting for the flush of a log buffer to complete. It is resumed by the
task that performs the flush operation. The task can be purged if the log stream is
not DFHLOG, the primary system log.
During an INITIAL start of CICS, CICS calls the MVS system logger macro
IXGDELET ALL. CICS waits until the MVS system logger posts the ECB.
During keypoint processing, CICS calls the MVS system logger macro IXGDELET
RANGE. CICS waits until the MVS system logger posts the ECB.
During an emergency restart of CICS, or transaction backout, CICS calls the MVS
system logger macro IXGBRWSE END. CICS waits until the MVS system logger
posts the ECB.
During an emergency restart of CICS, CICS calls the MVS system logger macro
IXGBRWSE END. CICS waits until the MVS system logger posts the ECB.
During an emergency restart of CICS, or transaction backout, CICS calls the MVS
system logger macro IXGBRWSE READBLOCK. CICS waits until the MVS system
logger posts the ECB.
During an emergency restart of CICS, CICS calls the MVS system logger macro
IXGBRWSE READCURSOR. CICS waits until the MVS system logger posts the
ECB.
During an emergency restart of CICS, or transaction backout, CICS calls the MVS
system logger macro IXGBRWSE START. CICS waits until the MVS system logger
posts the ECB.
During an emergency restart of CICS, CICS calls the MVS system logger macro
IXGBRWSE START. CICS waits until the MVS system logger posts the ECB.
In several situations, CICS calls the MVS system logger macro IXGWRITE. CICS
waits until the MVS system logger posts the ECB.
KC_ENQ indicates that CICS code acting for a task has issued an EXEC CICS ENQ
command or a DFHKC TYPE=ENQ macro. If there is an extended wait for no
apparent reason, this might indicate an error within CICS. If that turns out to be the
case, contact the IBM Support Center.
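For comparison, a user-issued enqueue takes the following form. The resource
name field here is illustrative, and every ENQ should be paired with a DEQ (or be
released automatically at syncpoint):
77  WS-RESNAME         PIC X(16) VALUE 'ORDER.TABLE.LOCK'.
...
     EXEC CICS ENQ RESOURCE(WS-RESNAME)
          LENGTH(LENGTH OF WS-RESNAME)
     END-EXEC.
*    ... update the protected resource ...
     EXEC CICS DEQ RESOURCE(WS-RESNAME)
          LENGTH(LENGTH OF WS-RESNAME)
     END-EXEC.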
USERWAIT indicates that a task has issued an EXEC CICS WAIT EVENT EXTERNAL or
an EXEC CICS WAITCICS command.
EKCWAIT indicates that a task has issued an EXEC CICS WAIT EVENT command.
If the wait is prolonged, you should identify the event being waited on, and:
v Check that the EXEC CICS WAIT EVENT command specified the correct event.
v Check for problems with the task that should be completing the work for the
specified event. It might be waiting or looping, it might have a performance
problem, or it might have failed completely.
If the resource type is EKCWAIT and the EXEC CICS WAIT EVENT command included
the NAME option, the specified name is the resource name. For programming
information about the NAME option of the WAIT EVENT command, see the CICS
Application Programming Reference.
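For example, a task might wait on an ECB like this. The ECB field and the event
name are illustrative; the ECB must be a fullword that some other task or exit
eventually posts:
77  WS-ECB             PIC S9(8) COMP VALUE 0.
77  WS-ECB-PTR         USAGE POINTER.
...
*    Wait until another task posts the ECB. The NAME value
*    appears as the resource name in wait displays.
     SET WS-ECB-PTR TO ADDRESS OF WS-ECB.
     EXEC CICS WAIT EVENT ECADDR(WS-ECB-PTR) NAME('ORDRDONE')
     END-EXEC.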
If the resource name for the wait is SINGLE, CICS, or LIST, look at the entry in the
SUSPAREA column of the dispatcher summary in the dump. The type of value it
contains depends on the resource name:
v For SINGLE or CICS, it is the address of an ECB
v For LIST, it is the address of a list of ECBs.
(The contents of the SUSPAREA entry are not significant for TERMINAL, because
this type of wait is subject to the dispatcher RESUME function. For more
information about debugging terminal waits, see “Investigating terminal waits” on
page 57.)
Check the contents of the SUSPAREA entry. Does it contain a valid address? That
is, is it within the CICS address space, and actually pointing at an ECB, or a list of
ECBs?
If you find an invalid address: It is possible that a storage overlay is the cause of
the wait problem. If you suspect this to be the case, turn to Chapter 11, “Dealing
with storage violations,” on page 191 for further advice. However, note that this is
likely to be a “random” overlay, and such problems are often very difficult to solve.
From the kernel information in the dump, find out which code issued the DFHKC
macro call. If you think that CICS has passed an incorrect address, contact the IBM
Support Center, and report the problem to them.
If you find a valid address: Consider what area the ECB is in. Does the position
of the ECB, and its environment, suggest that it relates to a resource whose
availability you can control? If so, you might be able to solve the problem by
redefining the quantity of that resource.
If the ECB does not lie within an area that you can control, refer the problem to the
IBM Support Center.
Typically, tasks are made to wait on KC_ENQ when they make certain types of file
control request, if the file is already in use. These are the cases:
v The waiting task has attempted to change the state of a file that is in use.
Another task has already attempted to change the state of the same file, and is
Resource type ZC
The task waits for the time specified in the RTIMOUT value of the profile used by
the transaction. If the task times out, it receives either an AKCT or AZCT abend.
v If your task is waiting on a resource name of DFHZEMW1, the error message
writer module, DFHZEMQ, is waiting for the completion of I/O. If a timeout value
exists and is exceeded, the suspend expires.
v If your task is waiting on a resource name of DFHZRAQ1, this means a READ
has been issued. The task is resumed once the I/O operation is complete. If a
timeout value exists and is exceeded, the suspend expires.
v If your task is waiting on a resource name of DFHZRAR1, this means a READ
has been issued. The task is resumed once the I/O operation is complete. If a
timeout value exists and is exceeded, the suspend expires.
DFHZSLS has to set the TCT prefix VTAM fields from the ACB. This wait is issued
to ensure that these fields are set before being used.
DFHZGIN issues the VTAM INQUIRE macro and waits until VTAM completes
execution of this request.
Suspends on resource type ZCIOWAIT occur when the task is waiting for some
terminal I/O. Once the expected I/O event occurs, the task is resumed.
The XRF queue organizer, DFHZXQO, waits for the posting of TCAICTEC and
XQOVECTE which happens when the queue is emptied.
The XRF session tracker, DFHZXST, waits for the posting of TCAICTEC and
TCTVXPLE which happens when the session tracking queue is emptied.
Consider defining a greater number of sessions, which should solve the problem.
For guidance about this, see the CICS Intercommunication Guide.
The method of debugging is the same in each case. You need to consider the
access method, terminal control, and the “terminal” itself.
For interregion and intersystem communication, the remote region or system is the
terminal. Its status can be found using the same online or offline techniques that
you would use to find the status of a physical terminal. The status may lead you to
suspect that the task running in the remote region is the cause of the problem, and
you then need to investigate why that task is waiting. So you could find that what
started as a terminal wait might, after all, be a wait on some other type of resource.
IIOP waits
A request receiver DFHIIRR task suspends with resource type IIRR and resource
name SOCBNOTI when it is waiting for input from the client or a reply from the
request processor and the TCPIP connection is still open.
It is resumed by a NOTIFY gate when IIRR is told there is another request from the
sockets domain or a reply has come in from the request streams domain.
SOCFNOTI is similar to SOCBNOTI, except that it also indicates that a client has
sent parts of a GIOP fragment but has not yet sent the final fragment.
A request processor DFHIIRP task suspends with resource type IIRP and resource
name NOTI when it is waiting for requests or replies. It is resumed by a NOTIFY
gate when IIRP is told there is another request or reply from the request streams
domain.
The resource types that might be associated with the wait are described in the
following information. Note that the resource name is the transient data queue
name, except in the case of TD_INIT, whose resource name is DCT.
You are unlikely to see any evidence for this type of wait, unless you have trace
running during initialization with DS level-1 tracing selected. An error at this stage
would be likely to cause CICS to stall (see “CICS has stalled during initialization” on
page 104), or to terminate abnormally.
For more details of the properties of recoverable transient data queues, see the
CICS Resource Definition Guide.
If you have a task suspended on resource type ENQUEUE, and a value of TDNQ,
the task has been suspended while attempting to read, write, or delete a logically
recoverable queue because a required enqueue is currently held by another task.
Note: For general information about dealing with enqueue waits, see “Investigating
enqueue waits” on page 71. Issuing a CEMT INQUIRE UOWENQ command
reveals the name of the queue and whether the enqueued read or write is
required by the task. If the task is enqueued against the read end of the
queue, a qualifier of FROMQ is displayed on the CEMT INQUIRE UOWENQ
screen. If the task is enqueued against the write end of the queue, a qualifier
of TOQ is displayed on the CEMT INQUIRE UOWENQ screen.
If you want to delete a queue, both the read and the write enqueues must be
obtained. No task may, therefore, read or write to a queue while a delete operation
is in progress. A delete cannot proceed until any task currently reading has
completed its read or any task writing has committed its changes.
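For reference, the delete operation in question is the DELETEQ TD command (the
queue name here is illustrative):
     EXEC CICS DELETEQ TD QUEUE('ACCQ') END-EXEC.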
In general, a wait on a resource type of ENQUEUE should not last for long unless
the task owning the enqueue has been delayed. If the UOW that owns the enqueue
has suffered an indoubt failure, the UOW is shunted. If the queue accessed by this
UOW is defined as WAIT=YES and WAITACTION=QUEUE, the wait can last for a
long period of time. To deduce if an indoubt failure has occurred:
v Issue a CEMT INQUIRE UOWENQ command to display the name of the enqueue
owner.
v Issue a CEMT INQUIRE UOW command to see if the UOW is shunted.
A task can read from a queue while another task is writing to the same queue. If
this happens, the first task holds the read enqueue and the second task holds the
write enqueue on the queue. The task reading the queue can only read data that
has already been committed. It cannot read data that is currently being written to
the queue until the task holding the write enqueue commits the changes it has
made and dequeues from the write end of the queue.
In most cases, the suspended task does not have to wait long. A lengthy wait can
occur if the task owning the write enqueue suffers an indoubt failure (which causes
its UOW to be shunted, as described above).
If you do not want to wait for data to be committed to the queue, code
NOSUSPEND on the READQ TD request. QBUSY is returned to the application
and the task does not wait.
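For example (the queue name and field sizes are illustrative):
77  WS-DATA            PIC X(200).
77  WS-LEN             PIC S9(4) COMP VALUE +200.
77  WS-RESP            PIC S9(8) COMP.
...
     EXEC CICS READQ TD QUEUE('ACCQ') INTO(WS-DATA)
          LENGTH(WS-LEN) NOSUSPEND RESP(WS-RESP)
     END-EXEC.
     IF WS-RESP = DFHRESP(QBUSY)
*        Another task holds the enqueue; do other work and
*        retry later instead of suspending.
         CONTINUE
     END-IF.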
This type of wait shows that all the transient data I/O buffers are in use, and the
task resumes only when one becomes available.
Tasks are only likely to wait in this way in a heavily loaded system.
The reason for this type of wait is best illustrated by example, as follows:
1. Task #1 issues a transient data request that requires access to an intrapartition
queue. Before the request can be serviced, task #1 must be assigned a
transient data I/O buffer that is not currently being used by any other task.
I/O buffers each contain a copy of a control interval (CI) from a data set. Each
CI contains records that correspond to elements in an intrapartition queue. A
search is made to see if the CI required for task #1 is already in one of the I/O
buffers. If it is, that I/O buffer can be used to service the request made by task
#1, and no VSAM I/O is involved. If it is not, task #1 is allocated any buffer, so
the required CI can be read in. The current contents of the buffer are overwritten.
An I/O buffer can have a R/O (read only) status or a R/W (read/write) status. If
the buffer that is allocated to task #1 has R/W status, it contains a copy of a CI
that has been updated by some other task, but not yet written back to the data
set. Before the buffer can be used by task #1, the CI it contains must be
preserved by writing it back to the data set.
For example:
v Terminals with backup sessions can be switched while the active system is
running, provided the CICS availability manager (CAVM) has initiated a takeover.
v Passively-shared data sets must not be opened until it is known that the active
system has terminated.
Note: This is the only way an alternate system can be sure that no more data
will be written by an active system.
v Resource managers, such as transient data, temporary storage, and database
recovery control (DBRC), rely on the time-of-day clock providing them with a
nondecreasing value to ensure the proper management of their resources. The
alternate system must not restart a resource manager until the alternate
time-of-day clock has been synchronized with the active time-of-day clock.
A system task that issued a takeover request to CAVM waits on the ECB
WCSTCECB in the CAVM static control block (DFHWCGPS) until CAVM has
decided to accept or reject the request. The DFHKC TYPE=WAIT and DCI=SINGLE
requests are issued in DFHWSRTR. The CAVM TCB posts WCSTCECB in either
DFHWSTKV (the normal case) or DFHWSSOF (CAVM failure).
The following ECBs each represent an event. The ECBs are located in the static
storage for DFHXRP. The ECBs and the events are:
v XRSTIECB—the CAVM has initiated a takeover.
v XRSIAECB—the alternate system is now the incipient active system.
v XRSTCECB—the active system is known to have terminated.
v XRSSSECB—the time-of-day clock is synchronized with active system sign off.
v XRSSTECB—the time-of-day clock is synchronized with active system
termination.
XRSTIECB
This ECB is posted by DFHXRA, following a successful call to the CAVM to initiate
takeover. Once the ECB has been posted, DFHXRA attaches a system transaction
to initiate the switch of terminals with backup sessions. DFHXRA is called from
either the surveillance task (DFHXRSP), or the console communication task
(DFHXRCP). No tasks wait for XRSTIECB to be posted.
XRSIAECB
The XRSIAECB ECB is posted by DFHXRA, following notification by the CAVM that
an alternate system is now the incipient active system. DFHXRA is called from the
surveillance task (DFHXRSP). No tasks wait for XRSIAECB to be posted.
XRSTCECB
DFHXRA is called from the surveillance task (DFHXRSP). Only one task, the
system initialization task (DFHSII1), waits for XRSTCECB to be posted. When the
ECB is posted, DFHSII1 opens the restart data set, for DFHRC use as well as for
DFHCC use, and then calls DFHXRA to post the XRSRAECB.
XRSRAECB
The XRSRAECB ECB is posted by DFHXRA once the restart data set has been
opened, for DFHRC use as well as for DFHCC use. DFHXRA is called from the
system initialization task (DFHSII1). Two tasks wait for XRSRAECB to be posted:
v The transient data recovery task (DFHTDRP) initializes the entry for the CXRF
queue before waiting for XRSRAECB to be posted. When the ECB is posted,
DFHTDRP resumes emergency restart processing.
v The terminal control recovery task (DFHTCRP) drains its tracking queue before
waiting for XRSRAECB to be posted. When the ECB is posted, DFHTCRP
resumes emergency restart processing.
XRSSSECB
The XRSSSECB ECB is posted by DFHXRA following notification by the CAVM that
the time-of-day clock is synchronized with active sign off. DFHXRA is called from
the surveillance task (DFHXRSP). No tasks wait for XRSSSECB to be posted.
XRSSTECB
Only the system initialization task, DFHSII1, waits for XRSSTECB to be posted.
You are only likely to find either of the CICS-supplied transactions CEDA or CESN
waiting on a resource type of XRPUTMSG, and only during XRF takeover by the
alternate CICS system. It can indicate either of these conditions:
v Data that is required by the transactions is held on a volume subject to MVS
RESERVE locking, and another job currently has the lock.
v There is an error in the CICS availability manager.
Note: You cannot get online information about waiting system tasks from CEMT INQ
TASK or EXEC CICS INQUIRE TASK.
If a system task is in a wait state, and there is a system error preventing it from
resuming, contact your IBM Support Center. However, do not assume that there is a
system error unless you have other evidence that the system is malfunctioning.
Other possibilities are:
v Some system tasks are intended to wait for long periods while they wait for work
to do. Module DFHSMSY of storage manager domain, for example, can stay
suspended for minutes, or even hours, in normal operation. Its purpose is to
clean up storage when significant changes occur in the amount being used, and
that might happen only infrequently in a production system running well within its
planned capacity.
v System tasks perform many I/O operations, and they are subject to constraints
like string availability and volume and data set locking. In the case of tape
volumes, the tasks can also be dependent on operator action while new volumes
are mounted.
If, in addition to the waiting system task, you think you have enough evidence that
shows there is a system error, contact your IBM Support Center.
FEPI waits
This section outlines the CICS waits that FEPI issues.
It is possible for a FEPI_RQE wait to be outstanding for a long time, such as when
awaiting a flow from the back-end system that is delayed by network traffic.
If the Resource Manager abends, any active CICS FEPI transactions are left
waiting on the FEPI_RQE resource. Because the Resource Manager is absent,
the ECBs they are waiting on are never posted, so the transactions remain
suspended. You must issue a CEMT SET TASK FORCEPURGE command to
remove these suspended transactions from the system.
If such a task does remain suspended for a long time after CICS initialization
completes, there is probably an error in CICS. Contact your IBM Support Center.
A suspend can occur on the CICS WEB attach transaction after it has attached its
partner (WEB alias) transaction. This suspend only occurs if the client socket is
using SSL to communicate with CICS. The suspend is resumed when the WEB
alias transaction terminates.
The symptoms of loops are described in “Loops” on page 13. If a loop does not
terminate, it could be that the termination condition can never occur, or that the
condition is never tested, or that the conditional branch erroneously causes the
loop to be executed again even when the condition is met.
This section outlines procedures for finding which programs are involved in a loop
that does not terminate. It contains the following topics:
v “What sort of loop is indicated by the symptoms?”
v “Investigating loops that cause transactions to abend with abend code AICA” on page 144
v “Investigating loops that are not detected by CICS” on page 155
v “What to do if you cannot find the reason for a loop” on page 157
If you find that the looping code is in one of your applications, you need to check
through the code to find out which instructions are in error. If it looks as if the error
is in CICS code, you probably need to contact the IBM Support Center.
Some CICS domains can detect loops in their own routines, and let you know if one
is suspected by sending the following message:
DFHxx0004 applid A possible loop has been detected at offset X’offset’ in module modname
The two characters xx represent the two-character domain index. If, for example,
monitoring domain had detected the loop, the message number would be
DFHMN0004. If you see this sort of message repeatedly, contact the IBM Support
Center.
Figure 14 on page 142 gives an example of code containing a simple tight loop.
CICS can detect some looping tasks by comparing the length of time the tasks
have been running with the runaway time interval, ICVR, that you code in the
system initialization table. If a task runs for longer than the interval you specify,
CICS regards it as “runaway” and causes it to abend with an abend code of AICA.
However, in some cases, CICS requests that are contained in the looping code can
cause the timer to be reset. Not every CICS request can do this; it can only happen
if the request can cause the task to be suspended. Thus, if the looping code
contains such a request, CICS cannot detect that it is looping.
The properties of the different types of loop, and the ways you can investigate
them, are described in the sections that follow.
If the tasks run for longer than the interval you specify, CICS regards them as
“runaway” and causes them to abend with an abend code of AICA.
Note: If you make the ICVR value equal to 0, runaway task detection is disabled.
Runaway tasks can then cause the CICS region to stall, meaning that CICS
must be canceled and brought up again. You might choose to set ICVR to
zero in test systems, because of the wide variation in response times.
However, it is usually more advisable to set ICVR to a large value in test
systems.
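For example, to have CICS treat any task that runs for more than about five
seconds without giving up control as runaway, you might code the following system
initialization parameter (the value, in milliseconds, is illustrative):
ICVR=5000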
A tight loop is one involving a single program, where the same instructions are
executed repeatedly and control is never returned to CICS. In the extreme case,
there could be a single instruction in the loop, causing a branch to itself.
A non-yielding loop is also contained in a single program, but it differs from a tight
loop in that control is returned temporarily from the program to CICS. However, the
CICS routines that are invoked are ones that neither suspend the program nor pass
control to the dispatcher. The CICS commands that do not cause tasks to wait
include (but are not restricted to) ASKTIME, DEQ, ENQ, ENTER TRACENUM,
FREEMAIN, HANDLE, RELEASE, TRACE ON/OFF. Whether a command allows
the ICVR to be reset might also depend on other factors. The following example
shows a non-yielding loop: the ASKTIME command returns control to CICS, but it
never causes the task to be suspended, so the ICVR is not reset.
PROCEDURE DIVISION.
EXEC CICS
HANDLE CONDITION ERROR(ERROR-EXIT)
ENDFILE(END-MSG)
END-EXEC.
ROUTE-FILE.
EXEC CICS
ROUTE INTERVAL(0)
LIST(TERM-ID)
END-EXEC.
NEW-LINE-ATTRIBUTE.
EXEC CICS
ASKTIME
END-EXEC.
GO TO NEW-LINE-ATTRIBUTE.
MOVE LOW-VALUES TO PRNTAREA.
MOVE DFHBMPNL TO PRNTAREA.
If you have a transaction that repeatedly abends with an abend code of AICA, first
make sure the ICVR value has not been set too low. If the value seems reasonable,
read “Investigating loops that cause transactions to abend with abend code AICA”
on page 144 for advice on determining the limits of the loop.
If you have a stalled CICS region, diagnose the problem using the techniques in
“What to do if CICS has stalled” on page 104. Check if the ICVR value has been
set to zero. If it has, change the value and try to cause a transaction to abend with
a code of AICA.
Yielding loops
Yielding loops are characterized by returning control at some point to a CICS
routine that can suspend the looping task. However, the looping task is eventually
resumed, and so the loop continues.
CICS is unable to use the runaway task timer to detect yielding loops, because the
timer is reset whenever the task is suspended. Thus, the runaway task time is
unlikely ever to be exceeded, and so the loop goes undetected by the system.
Figure 16 on page 144 shows a specific example of a yielding loop within a single
program. This code issues the SUSPEND command, which is always a yielding
type of command. Every time SUSPEND is issued, the dispatcher suspends the
task issuing the request, and sees if any other task of higher priority can run. If no
other task can run, the looping task is resumed, and the loop continues.
PROCEDURE DIVISION.
EXEC CICS
HANDLE CONDITION ERROR(ERROR-EXIT)
ENDFILE(END-MSG)
END-EXEC.
ROUTE-FILE.
EXEC CICS
ROUTE INTERVAL(0)
LIST(TERM-ID)
END-EXEC.
NEW-LINE-ATTRIBUTE.
EXEC CICS
SUSPEND
END-EXEC.
GO TO NEW-LINE-ATTRIBUTE.
MOVE LOW-VALUES TO PRNTAREA.
MOVE DFHBMPNL TO PRNTAREA.
A yielding loop can show up as symptoms such as repetitive output, or excessive
use of storage. A fuller description of what to look out for is given in “Loops” on
page 13.
If you suspect that you have a yielding loop, turn to “Investigating loops that are not
detected by CICS” on page 155 for further guidance.
Both a tight loop and a non-yielding loop are characterized by being confined to a
single user program. You should know the identity of the transaction to which the
program belongs, because it is the transaction that abended with code AICA when
the runaway task was detected.
1. Get the documentation you need.
2. Look at the evidence.
3. Identify the loop, using information from the trace table and transaction dump.
4. Determine the reason for the loop.
A tight loop is unlikely to contain many instructions, and you might be able to
capture all the evidence you need from the record of events in the internal trace
table. A non-yielding loop may contain more instructions, depending on the EXEC
CICS commands it contains, but you might still be able to capture the evidence you
need from the record of events in the internal trace table. If you find that it is not big
enough, direct tracing to the auxiliary trace destination instead.
1. You need to trace CICS system activity selectively, to ensure that most of the
data you obtain is relevant to the problem. Set up the tracing like this:
a. Select level-1 special tracing for AP domain, and for the EXEC interface
program (EI).
b. Select special tracing for just the task that has the loop, and disable tracing
for all other tasks by turning the master system trace flag off.
You can find guidance about setting up these tracing options in Chapter 15,
“Using traces in problem determination,” on page 223.
2. Start the task, and wait until it abends AICA.
3. Format the CICS system dump with formatting keywords KE and TR, to get the
kernel storage areas and the internal trace table. (See “Formatting system
dumps” on page 279.)
You now have the documentation you need to find the loop.
If you find that the loop is within CICS code, you need to contact the IBM Support
Center. Make sure you keep the dump, because the Support Center staff need it to
investigate the problem.
If the kernel linkage stack entries suggest that the loop is in your user program, you
next need to identify the loop.
Note: It is possible that the loop was contained entirely within a module
owned by CICS or some other product, and your program was not
responsible for it at all. If you find that the loop is contained within
CICS code, contact the IBM Support Center.
c. If the PSW does point to a module outside your application program, find the
address of the return point in your program from the contents of register 14
in the appropriate register save area. The return address will lie within the
loop, if the loop is not confined to system code.
d. When you have located a point within the loop, work through the source
code and try to find the limits of the loop.
2. If you are using the trace table to identify the loop:
a. Go to the last entry in the internal trace table, and work backward until you
get to an entry for point ID AP 1942. The trace entry should have been
made when recovery was entered after the transaction abended AICA.
b. Make a note of the task number, so you can check that any other trace
entries you read relate to the same abended task.
c. Look at the entries preceding AP 1942. In particular, look for trace entries
with the point ID AP 00E1. These entries should have been made either just
before the loop was entered (for a tight loop), or within the loop itself (for a
non-yielding loop). Entries with a point ID of AP 00E1 are made on entry to
the EXEC interface program (DFHEIP) whenever your program issues an
EXEC CICS command, and again on exit from the EXEC interface program.
Field B gives you the value of EIBFN, which identifies the specific command
that was issued.
d. When you have identified the value of EIBFN, use the table Table 12 on
page 147 to identify the command that was issued.
e. For trace entries made on exit from DFHEIP, field A gives you the response
code from the request. Look carefully at any response codes - they could
provide the clue to the loop. Has the program been designed to deal with
every possible response from DFHEIP? Could the response code you see
explain the loop?
If you see a repeating pattern of trace points for AP 00E1, you have a
non-yielding loop. If you can match the repeating pattern to statements in the
source code for your program, you have identified the limits of the loop.
If you see no repeating pattern of trace points for AP 00E1, it is likely that you
have a tight loop. The last entry for AP 00E1 (if there is one) should have been
made from a point just before the program entered the loop. You might be able
to recognize the point in the program where the request was made, by matching
trace entries with the source code of the program.
Assuming you have the trace, and EI level-1 tracing has been done, ensure that
you can explain why each EIP entry is there. Verify that the responses are as
expected.
A good place to look for clues to loops is immediately before the loop sequence, the
first time it is entered. Occasionally, a request that results in an unexpected return
code can trigger a loop. However, you usually can only see the last entry before the
loop if you have CICS auxiliary or GTF trace running, because the internal trace
table is likely to wrap before the AICA abend occurs.
The nature of the symptoms might indicate which transaction is involved, but you
probably need to use trace to define the limits of the loop. Use auxiliary trace to
capture the trace entries, to ensure that the entire loop is captured in the trace data.
If you use internal trace, there is a danger that wraparound will prevent you from
seeing the whole loop.
1. Use the CETR transaction to set up the following tracing options. You can use
the transaction dynamically, on the running CICS system. For guidance about
using the CETR transaction, see Chapter 15, “Using traces in problem
determination,” on page 223.
The trace data and the program listings should enable you to identify the limits of
the loop. You need the transaction dump to examine the user storage for the
program. The data you find there could provide the evidence you need to explain
why the loop occurred.
Note: The PSW is of no value in locating loops that are not detected by CICS. The
contents of the PSW are unpredictable, and the PSW is not formatted in the
transaction dump for ATCH abends.
If you are only aware that performance is poor, and you have not yet found which of
these is relevant to your system, read “Finding the bottleneck.”
There is a quick reference section at the end of this section (“A summary of
performance bottlenecks, symptoms, and causes” on page 165) that summarizes
bottlenecks, symptoms, and actions that you should take.
CEMT INQ TASK returns a response indicating that the task is not known. If the task
has not already run and ended, this response means that it has not been attached
to the transaction manager.
If CEMT INQ TASK returns anything other than this, the task is not waiting to be
attached to the dispatcher. However, consider whether the MXT limit might be
causing the performance problem, even though individual tasks are not being held
up long enough for you to use CEMT INQ TASK on them. In such a case, use
monitoring and tracing to find just how long tasks are waiting to be attached to the
dispatcher.
Guidance about finding whether the MXT limit is to blame for the performance
problem is given in “MXT summary” on page 99.
Initial dispatch
A task can be attached to the dispatcher, but then take a long time to get an initial
dispatch.
In such a case, CEMT INQ TASK returns a status of ‘Dispatchable’ for the task. If you
keep getting this response and the task fails to do anything, it is likely that the task
you are inquiring on is not getting its first dispatch.
The delay might be too short for you to use CEMT INQ TASK in this way, but still long
enough to cause a performance problem. In such a case, use tracing or
performance class monitoring for the task, either of which would tell you how long
the task had to wait for an initial attachment to the dispatcher.
If you think your performance problem could be due to tasks taking a long time to
get a first dispatch, read “Why tasks fail to get an initial dispatch” on page 163.
Tasks run, but the overall performance is poor. If you are able to show that tasks
are getting attached and then dispatched, read “Why tasks take a long time to
complete” on page 164.
For a system task, there may not be enough storage to build the new task. This
sort of problem is more likely to occur near peak system load times.
Before the transaction manager can attach a user task to the dispatcher, the task
must first qualify under the MXT (maximum tasks in the system) and transaction
class limits. If a task is not getting attached, it is possible that one or both of these
values is too small.
You might be able to use CEMT INQ TASK to show that a task is failing to get
attached because of the MXT or transaction class limits. If you cannot use CEMT
because the task is held up for too short a time, you can look at either the
transaction global statistics, transaction class statistics, or the CICS
performance-class monitoring records. Another option is to use CICS system
tracing.
The statistics are gathered and recorded in the SMF data set.
2. Format this data set by using the statistics utility program, DFHSTUP. You might
find the following DFHSTUP control parameters useful:
SELECT APPLID=
COLLECTION TYPE=
REQTIME START= ,STOP=
DATE START= ,STOP=
Consider revising the MXT and transaction class values if the statistics indicate that
they are affecting performance. For guidance about the performance considerations
when you set these limits, see the CICS Performance Guide.
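For example, to raise the MXT limit dynamically while you experiment (the value is
illustrative):
CEMT SET SYSTEM MAXTASKS(120)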
For further information on the data produced by CICS monitoring see the CICS
Performance Guide.
Using trace
You can use trace if you want to find out just how long an individual task waits to be
attached to the dispatcher.
If you do not want to do any other tracing, internal trace is probably a suitable
destination for trace entries. Because the task you are interested in is almost
inactive, very few trace entries are generated.
1. Select special tracing for the transaction associated with the task, and turn off
all standard tracing by setting the master system trace flag off.
2. Define as special trace points the level-1 trace points for transaction manager
(XM), and for the CICS task controlling the facility that initiates the task, such as
terminal control (TC). Make sure that no other trace points are defined as
special. For guidance about setting up these tracing options, see Chapter 15,
“Using traces in problem determination,” on page 223.
3. When you have selected the options, start tracing to the internal trace table and
attempt to initiate the task.
4. When the task starts, get a system dump using the command CEMT PERFORM
SNAP. Format the dump using the keyword TR, to get the internal trace table.
5. Look for the trace entry showing terminal control calling the transaction manager
with a request to attach the task, and the subsequent trace entry showing the
transaction manager calling dispatcher domain with a request to attach the task.
The time stamps on the two trace entries tell you the time that elapsed between
the two events. That is equal to the time taken for the task to be attached.
You can get evidence that tasks are waiting too long for a first dispatch from
performance class monitoring. If you do find this to be the case, you need to
investigate the reasons for the delay. To calculate the initial dispatch delay incurred
by a task use the following fields from the performance-class monitoring record:
If the value you calculate is significantly greater than 0, the dispatcher could not
dispatch the task immediately.
The factors that influence the length of time a task must wait before getting its first
dispatch are:
v The priority of the task
v Whether the system is becoming short on storage.
Priorities of tasks
Normally, the priorities of tasks determine the order in which they are dispatched.
Priorities can have any value in the range 1–255. If your task is getting a first
dispatch (and, possibly, subsequent dispatches) too slowly, you might consider
changing its priority to a higher value.
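For example, to raise the priority of a transaction dynamically (the transaction name
and value are illustrative):
CEMT SET TRANSACTION(ORD1) PRIORITY(200)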
One other factor affecting the priorities of tasks is the priority aging multiplier,
PRTYAGE, that you code in the system initialization parameters. This determines
the rate at which tasks in the system can have their priorities aged. Altering the
value of PRTYAGE affects the rate at which tasks are dispatched, and you probably
need to experiment to find the best value for your system.
Note: Release of the storage cushion is not the only cause of CICS going SOS.
The condition is also raised if a task makes an unconditional request for
storage greater than the storage cushion size when the system is
approaching SOS. In such a case, the cushion is not released, but the task
making the unconditional request is suspended and message DFHSM0131I
or DFHSM0133I may be issued. CICS resumes the suspended tasks
immediately if storage is made available by CICS releasing unused
programs. The short-on-storage condition remains until all the previously
suspended tasks have obtained the storage they requested. Message
DFHSM0604I is issued if CICS has allocated storage above the 2GB
boundary, to the value of 90% or more of the current MEMLIMIT value.
Two other conditions are recognized by the dispatcher on the approach to SOS:
v storage getting short
v storage critical
The two conditions affect the chance of new tasks getting a first dispatch. From the
‘storage getting short’ point, through ‘storage critical’ and right up to SOS, the
priorities of new user tasks are reduced in proportion to the severity of the
condition. However, this is not true if the PRTYAGE system initialization parameter is
set to 0. At first, you are not likely to notice the effect, but as ‘storage critical’ is
approached, new tasks might typically be delayed by up to a second before they
are dispatched for the first time.
It is likely that ‘storage getting short’ and ‘storage critical’ occur many times for
every occasion that SOS is reached. If you want to see how often these points are
reached, select level-2 tracing for the dispatcher domain and look out for trace point
IDs DS 0038 (‘storage getting short’) and DS 0039 (‘storage critical’). Trace point
DS 0040 shows that storage is OK.
A summary of the effects of ‘storage getting short’, ‘storage critical’, and SOS is
given in Table 13.
Table 13. How storage conditions affect new tasks getting started
State of storage Effects on user tasks
Storage getting short Priority of new user tasks reduced a little
Storage critical Priority of new user tasks reduced considerably
Here are some factors that can affect how long tasks take to complete.
The most obvious factor affecting the time taken for a task to complete is system
loading. For more information, see Improving the performance of a CICS system in
the CICS Performance Guide. Note in particular that there is a critical loading
beyond which performance is degraded severely for only a small increase in
transaction throughput.
The time-out interval is the length of time a task can wait on a resource before it is
removed from the suspended state. A transaction that times out is normally
abended.
Any task in the system can use resources and not allow other tasks to use them.
Normally, a task with a large time-out interval is likely to hold on to resources longer
than a task with a short time-out interval. Such a task has a greater chance of
preventing other tasks from running. It follows that task time-out intervals should be
chosen with care, to optimize the use of resources by all the tasks that need them.
CICS uses QSAM to write data to extrapartition transient data destinations, and
QSAM uses the MVS RESERVE mechanism. If the destination happens to be a
DASD volume, any other CICS regions trying to access data sets on the same
volume are held up until the TD WRITE is complete.
Other system programs also use the MVS RESERVE mechanism to gain exclusive
control of DASD volumes, making the data sets on those volumes inaccessible to
other regions.
If you notice in particular that tasks making many file accesses take a long time to
complete, check the distribution of the data sets between DASD volumes to see if
volume locking could be the cause of the problem.
For CICS system tracing other than exception traces and CICS VTAM exit traces,
you can inquire on the current destinations and set them to what you want using
the CETR transaction.
“CETR - trace control” on page 243 illustrates what you might see on a CETR
screen, and indicates how you can change the options by overtyping the fields.
From that illustration you can see where, with the options in effect, a normal trace
call is directed:
Note that the master system trace flag value only determines whether standard
tracing is to be done for a task (see Table 26 on page 234). It has no effect on any
other tracing status.
Internal tracing
goes to the internal trace table in main storage. The internal trace table is
used as a buffer in which the trace entries are built no matter what the
destination. It, therefore, always contains the most recent trace entries,
even if its status is STOPPED—if at least one of the other trace
destinations is currently STARTED.
Auxiliary tracing
goes to one of two data sets, if the auxiliary tracing status is STARTED.
The current data set can be selected from the CETR screen by overtyping
the appropriate field with A or B, as required. What happens when the data
set becomes full is determined by the auxiliary switch status. Make sure
that the switch status is correct for your system, or you might lose the trace
entries you want, either because the data set is full or because they are
overwritten.
GTF tracing
goes to the GTF trace data set. GTF tracing must be started under MVS,
using the TRACE=USR option, before the trace entry can be written. Note
that if GTF tracing has not been started in this way, the GTF tracing status
can be shown as STARTED on the CETR screen and yet no trace entries
are made, and no error condition reported.
It is worth remembering that the more precisely you can define the trace data you
need for any sort of problem determination, the more quickly you are likely to get to
the cause of the problem.
You can define whether you want standard or special CICS tracing for specific
transactions, and standard or special tracing for transactions started at specific
terminals. You can also suppress tracing for transactions and terminals that do not
interest you. The type of task tracing that you get (standard or special) depends on
the type of tracing for the corresponding transaction-terminal pair, in the way shown
in Table 25 on page 233.
You can deduce from the table that it is possible to get standard tracing when a
transaction is initiated at one terminal, and special tracing when it is initiated from
another terminal. This raises the possibility of setting up inappropriate task tracing
options, so the trace entries that interest you - for example, when the transaction is
initiated from a particular terminal - are not made.
The entries you want are missing from the trace table
Read this section if one or more entries you were expecting were missing entirely
from the trace table.
If the trace entry did not appear at the expected time, consider these
possibilities:
If the options were correct and tracing was running at the right time, but the trace
entries you wanted did not appear, it is likely that the task you were interested in
did not run or did not invoke the CICS components you expected. Examine the
trace carefully in the region in which you expected the task to appear, and attempt
to find why it was not invoked. Remember also that the task tracing options might
not, after all, have been appropriate.
If the earliest trace entry was later than the event that interested you, and
tracing was running at the right time, it is likely that the trace table wrapped round
and earlier entries were overwritten.
Internal trace always wraps when it is full. Try using a bigger trace table, or direct
the trace entries to the auxiliary trace or GTF trace destinations.
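For example, you can request a larger internal trace table at startup with
the TRTABSZ system initialization parameter, which takes a size in kilobytes
(the value shown here is illustrative only):
TRTABSZ=5120
You can also change the table size from the CETR screen while CICS is
running, but see the note that follows.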
Note: Changing the size of the internal trace table during a run causes the data
that was already there to be destroyed. In such a case, the earliest data
would have been recorded after the time when you redefined the table size.
Auxiliary trace switches from one data set to the next when it is full, if the
autoswitch status is NEXT or ALL.
If the autoswitch status is NEXT, the two data sets can fill up but earlier data cannot
be overwritten. Your missing data might be in the initial data set, or the events you
were interested in might have occurred after the data sets were full. In the second
case, you can try increasing the size of the auxiliary trace data sets.
If the autoswitch status is ALL, you might have overwritten the data you wanted.
The initial data set is reused when the second extent is full. Try increasing the size
of the auxiliary trace data sets.
If you cannot find an exception trace entry that you expected, bear in mind that
exception tracing is always done to the internal trace table irrespective of the status
of any other type of tracing. So, if you missed it in your selected trace destination,
try looking in the internal trace table.
The sections that follow give guidance about resolving each of these problems in
turn.
If you invoked the dump from the MVS console using the MVS MODIFY command,
check that you specified the correct job name. It must be the job used to bring up
the CICS region in which you are interested.
If you invoked the dump from the CICS master terminal using CEMT PERFORM SNAP,
check that you were using the master terminal for the correct region. This is more
likely to be a problem if you have a VTAM network, because that allows you to
switch a single physical VTAM terminal between the different CICS regions.
There are, in general, two reasons why dumps might not be taken:
v Dumping is suppressed because of the way the dumping requirements for the
CICS region were defined. The valid ways that dumping can be suppressed are
described in detail in the sections that follow.
v A system error could have prevented a dump from being taken. Some of the
possibilities are:
– No transaction or system dump data sets were available.
– An I/O error occurred on a transaction or a system dump data set.
– The system dump data set was being written to by another region, and the
DURETRY time was exceeded.
You need to find out which of these types of dump suppression apply to your
system before you decide what remedial action to take.
You can inquire whether system dumping has been suppressed globally by using
the EXEC CICS INQUIRE SYSTEM DUMPING system programming command. If
necessary, you can cancel the global suppression of system dumping using EXEC
CICS SET SYSTEM DUMPING with a CVDA value of SYSDUMP.
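For example, the following sequence of system programming commands (a sketch
in COBOL; WS-DUMPING is an illustrative fullword binary data area) tests for
global suppression and cancels it if necessary:
EXEC CICS INQUIRE SYSTEM DUMPING(WS-DUMPING) END-EXEC.
IF WS-DUMPING = DFHVALUE(NOSYSDUMP)
    EXEC CICS SET SYSTEM DUMPING(DFHVALUE(SYSDUMP)) END-EXEC
END-IF.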
If an exit program that suppresses system dumping for a particular dump code is
enabled, system dumping is not done for that dump code. This overrides any
system dumping requirement specified for the dump code in the dump table.
The exit program can suppress system dumps only while it is enabled. If you want
the system dumping suppression to be canceled, you can issue an EXEC CICS
DISABLE command for the program. Any system dumping requirements specified in
the dump table then take effect.
You can use EXEC CICS INQUIRE TRANSACTION DUMPING to see whether dumping has
been suppressed for a transaction, and then use the corresponding SET command
to cancel the suppression if necessary.
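For example (a sketch, assuming a transaction named TXN1; the data name is
illustrative):
EXEC CICS INQUIRE TRANSACTION('TXN1') DUMPING(WS-DUMPING) END-EXEC.
IF WS-DUMPING = DFHVALUE(NOTRANDUMP)
    EXEC CICS SET TRANSACTION('TXN1') DUMPING(DFHVALUE(TRANDUMP)) END-EXEC
END-IF.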
You can inquire on transaction and system dump code attributes using CEMT INQ
TRDUMPCODE and CEMT INQ SYDUMPCODE, respectively. You must specify the dump
code you are inquiring on.
If you find that the dumping options are not what you want, you can use CEMT SET
TRDUMPCODE code or CEMT SET SYDUMPCODE code to change the values of the
attributes accordingly.
v If you had no transaction dump when a transaction abended, look first to see
if attribute TRANDUMP or NOTRANDUMP is specified for this dump code. The
attribute needs to be TRANDUMP if a transaction dump is to be taken.
If the attribute is shown to be TRANDUMP, look next at the maximum number of
dumps specified for this dump code, and compare it with the current number.
The values are probably equal, showing that the maximum number of dumps has
already been taken.
v If you had a transaction dump but no system dump, use CEMT INQ
TRDUMPCODE and check whether there is an attribute of SYSDUMP or
NOSYSDUMP for the dump code. You need to have SYSDUMP specified if you
are to get a system dump as well as the transaction dump.
Check also that you have not had all the dumps for this dump code, by
comparing the maximum and current dump values.
v If you had no system dump when a system abend occurred, use CEMT INQ
SYDUMPCODE and check whether you have an attribute of SYSDUMP or
NOSYSDUMP for the dump code. You need SYSDUMP if you are to get a
system dump for this type of abend.
Finally, check the maximum and current dump values. If they are the same, you
need to reset the current value to zero.
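For example, if the dump code is MT0001 (an illustrative code), a command
like the following resets the current count to zero and ensures that system
dumping is specified for the code:
CEMT SET SYDUMPCODE(MT0001) SYSDUMP RESET
The corresponding command for a transaction dump code is CEMT SET
TRDUMPCODE(code) TRANDUMP RESET.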
Note: SDUMPs produced by the kernel do not use the standard dump domain
mechanisms, and always have a dump ID of 0/0000.
The complete range of dump IDs for any run of CICS is, therefore, distributed
between the set of system dumps and the set of transaction dumps, but neither set
of dumps has them all.
Table 15 gives an example of the sort of distribution of dump IDs that might occur.
Note that each dump ID is prefixed by the run number, in this case 23, and that this
is the same for any dump produced during that run. This does not apply to
SDUMPs produced by the kernel; these always have a dump ID of 0/0000.
Table 15. Typical distribution of dump IDs between dump data sets
On system dump data set On transaction dump data set
ID=23/0001
ID=23/0002 ID=23/0002
ID=23/0003
ID=23/0004
ID=23/0005
ID=23/0006
ID=23/0007
ID=23/0008
For further discussion of the way CICS manages transaction and system dumps,
see Chapter 17, “Using dumps in problem determination,” on page 255.
You do not get the correct data when formatting the CICS system
dump
If you did not get the correct data formatted from a CICS system dump, these are
the most likely explanations:
v You did not use the correct dump formatting keywords. If you do not specify any
formatting keywords, the whole system dump is formatted. However, if you
specify any keywords at all, you must be careful to specify keywords for all the
functional areas you are interested in.
v You used the correct dump formatting keywords, but the dump formatting
program was unable to format the dump correctly because it detected an error. In
such a case, you should be able to find a diagnostic error message from the
dump formatter.
v A partial dump might have been specified at the MVS level, for example “without
LPA”. This requirement would be recorded in the MVS parameter library.
For the present purpose, a terminal is considered to be any device where data
can be displayed. It might be some unit with a screen, or it could be a printer.
Broadly, there are two types of incorrect output that you might get on a screen, or
on a printer:
v The data information is wrong, so unexpected values appear on the screen or in
the hard copy from a printer.
v The layout is incorrect on the screen or in the hard copy. That is, the data is
formatted wrongly.
In practice, you may sometimes find it difficult to distinguish between incorrect data
information and incorrect formatting. In fact, you seldom need to make this
classification when you are debugging this type of problem.
Sometimes, you might find that a transaction runs satisfactorily at one terminal, but
fails to give the correct output on another. This is probably due to the different
characteristics of the different terminals, and you should find the answer to the
problem in the sections that follow.
For information on using tracing in CICS problem determination, see Chapter 15,
“Using traces in problem determination,” on page 223.
A message recording the failure is written to the CSNE log or, in the case of
autoinstall, to the CADL log.
You are likely to get a logon rejection if you attempt to specify anything other than
QUERY(NO) for a terminal that does not have the structured query field feature.
Note that NO is the default value for TYPETERM definitions that you supply, but
YES is the value for TYPETERM definitions that are supplied with CICS.
If you have a persistent problem with logon rejection, you can use the VTAM buffer
trace to find out more about the reasons for the failure.
Messages that are prefixed by DFH originate from CICS - use the CMAC
transaction or look in CICS Messages and Codes for these. For codes that appear
in the space at the bottom of the screen where status information is displayed, look
in the appropriate guide for the terminal.
The following are examples of common errors that can cause messages or codes
to be displayed:
v SCRNSIZE(ALTERNATE) has been specified in a PROFILE, and too many rows
have been specified for ALTSCREEN and ALTPAGE in the TYPETERM definition
for the terminal.
v An application has sent a spurious hex value corresponding to a control
character in a data stream. For example, X'11' is understood as “set buffer
address” by a 3270 terminal, and the values that follow are interpreted as the
new buffer address. This eventually causes an error code to be displayed.
If you suspect this may be the cause of the problem, check your application code
carefully to make sure it cannot send any unintended control characters.
v EXTENDEDDS(YES) has been specified for a device that does not support this
feature. In such a case, a message is sent to the screen, and a message might
also be written to the CSMT log.
The default value for EXTENDEDDS is NO, but check to make sure that YES
has not been specified if you know your terminal is not an extended data stream
device.
Note: You can also use the user exit XZCIN to perform uppercase translation.
Table 16 and Table 17 summarize whether or not you get uppercase translation,
depending on the values of these options.
Table 16. Uppercase translation truth table — ASIS option not specified
Profile            TYPETERM UCTRAN(YES)   TYPETERM UCTRAN(NO)
UCTRAN(YES)        Yes                    Yes
UCTRAN(NO)         Yes                    No
During the CRTE routing session, uppercase translation is dictated by the typeterm
of the terminal at which CRTE was initiated and the transaction profile definition of
the transaction being initiated (which has to be a valid transaction on the application
owning region) as shown in Table 19.
Table 19. Uppercase translation during CRTE session
TYPETERM UCTRAN   TRANSACTION PROFILE   INPUT TRANSLATED TO
(AOR)             UCTRAN                UPPERCASE
YES               YES/NO                ALL OF THE INPUT
NO                NO                    NONE OF THE INPUT. See note.
NO                YES                   ALL OF THE INPUT EXCEPT
                                        THE TRANSID. See note.
TRANID            YES                   ALL OF THE INPUT
TRANID            NO                    TRANSID ONLY
Note: If the transid CRTE is not entered in upper case, it will not be recognized (unless
there is a lower/mixed case alias defined on the AOR) and message DFHAC2001 will be
issued.
During a CRTE routing session, if the first six characters entered at a screen are
CANCEL, CICS will recognize this input in upper, lower or mixed case and end the
routing session.
Be aware that when you are transaction routing from CICS Transaction Server
for z/OS, Version 3 Release 2 to an earlier release of CICS that does not
support transaction-based uppercase translation, uppercase translation occurs
only if it is specified in the typeterm.
In a transaction routing environment, the system programmer who issues the EXEC
CICS SET TERMINAL command should be aware (for VTAM terminals) that the
TOR terminal uppercase translate status is copied to the AOR surrogate terminal on
every flow across the link from the TOR to the AOR. Consequently:
v The EXEC CICS SET TERMINAL change of uppercase translate status will only take
effect on the AOR on the next flow across the link.
v Any AOR typeterm definition used to hard code remote terminal definitions will be
overridden with the TOR values for uppercase translate status.
v EXEC CICS INQUIRE TERMINAL issued on the AOR can return misleading
uppercase translation status of the terminal, since the correct status on the TOR
may not yet have been copied to the AOR.
v The processing of RECEIVE requests on the TOR and AOR can interrogate the
uppercase translate status of the terminal. Therefore unpredictable results can
also occur if the system programmer issues the EXEC CICS SET TERMINAL
command during receive processing.
If the data values are wrong on the user’s part of the screen (the space above the
area used to display status information to the operator), or in the hard copy
produced by a printer, it is likely that the application is at fault.
If you find that some data is not being displayed, consider these possibilities:
v The SENDSIZE value for the TYPETERM definition could be too large for the
device receiving the data. Its receiving buffer could then overflow, with some data
being lost.
v SCRNSIZE(ALTERNATE) might be specified in the PROFILE definition for the
transaction running at the terminal, while default values for ALTSCREEN and
ALTPAGE are allowed in the TYPETERM definition for the terminal.
The default values for ALTSCREEN and ALTPAGE are 0 rows and 0 columns, so
no data could then be displayed if SCRNSIZE(ALTERNATE) were specified.
v EXTENDEDDS(YES) is specified for a device that does not support this feature.
Early data can be overlaid by later data, so that data appears in the wrong order,
when the SENDSIZE value of the TYPETERM definition is too large for the device
receiving the data. This is because the buffer can wrap when it is full, with the
surplus data overlaying the first data that was received.
Incorrect formatting of data can have a wide range of causes, but here are some
suggestions of areas that can sometimes be troublesome:
v BMS maps are incorrect.
v Applications have not been recompiled with the latest maps.
v Different numbers of columns have been specified for ALTSCREEN and
ALTPAGE in the TYPETERM definitions for the terminal. This can lead to
unpredictable formatting errors. However, you will not see them unless
SCRNSIZE(ALTERNATE) has been specified in the PROFILE for the transaction
running at the terminal.
v The PAGESIZE values included in the TYPETERM definitions must suit the
characteristics of the terminal, or you get formatting errors.
For a screen display, the number of columns specified must be less than or
equal to the line width. For a printer, the number of columns specified must be
less than the line width, or else both BMS (if you are using it) and the printer
might provide a new line and you will get extra spacing you do not want.
The default values for PAGESIZE depend on the value you specify for the
DEVICE keyword.
v If you get extra line feeds and form feeds on your printer, it could be that an
application is sending control characters that are not required because the printer
is already providing end of line and end of form operations.
If your application is handling the buffering of output to a printer, make sure that
an “end of message” control character is sent at the end of every buffer full of
data. Otherwise, the printer might put the next data it receives on a new line.
If the first transaction were to take some action based on the value of the record,
the action would probably be erroneous.
In the meantime, a second customer also asks for 100 items. The salesperson uses
a terminal to inquire on the number currently in stock. The “inquire” transaction
reads the record that has been read for update but not yet rewritten, and returns
the information that there are 150 items. This customer, too, is promised delivery
within 24 hours.
Traces and dumps can give you valuable information about unusual conditions that
might be causing your application to work in an unexpected way.
1. If the path through the transaction is indeterminate, insert user trace entries at
all the principal points.
2. If you know the point in the code where the failure occurs, insert a CICS system
dump request immediately after it.
3. Use CETR to select special tracing for the level-1 trace points for all
components. Select special tracing for the failing task only, and disable all
standard tracing by setting the master system trace flag off.
4. Run the transaction after setting the trace options, and wait until the system
dump request is executed. Format the internal trace table from the dump
(formatting keyword TR), and examine the trace entries before the failure. Look
in particular for unusual or unexpected conditions, possibly ones that the
application is not designed to handle.
Can you use the terminal where the transaction should have started?
Go to the terminal where the transaction should have started, and note whether the
keyboard is locked. If it is, press RESET. Now try issuing CEMT INQ TASK (or your
site replacement) from the terminal.
If you cannot issue CEMT INQ TASK from the terminal, one of these explanations
applies:
v The task that produced no output is still attached to the terminal.
v The terminal where you made the inquiry is not in service.
v There is a system-wide problem.
v You are not authorized to use the CEMT transaction. (This may be because you
have not signed on to the terminal and the CEMT transaction is not authorized
for that terminal. If you have signed on to the terminal, you are probably
authorized to use CEMT.)
Try to find a terminal where you can issue CEMT INQ TASK. If no terminal seems
to work, there is probably a system-wide problem. Otherwise, see if the task you
are investigating is shown in the summary.
v If the task is shown, it is probably still attached, and either looping or waiting.
Turn to “Distinguishing between waits, loops, and poor performance” on page 12
to see what to do next.
If you are able to issue CEMT INQ TASK from the terminal where the transaction
was attached, one of these explanations applies:
v The transaction gave no output because it never started.
v The transaction ran without producing any output, and terminated.
v The transaction started at another terminal, and might still be in the system. If it
is still in the system, you can see it in the task summary that you got for CEMT
INQ TASK. It is probably looping or waiting. See “Distinguishing between waits,
loops, and poor performance” on page 12 for advice about what to do next. If
you do not see the task in the summary, go to “No output - what to do if the task
is not in the system.”
Note: If you are not getting output on a printer, the reason could simply
be that you are not setting the START PRINTER bit on in the write
control character. You need to set this bit to get printed output if you have
specified the STRFIELD option on a CONVERSE or SEND command,
which means that the data area specified in the FROM option contains
structured fields. Your application must set up the contents of the
structured fields.
Your task might have been initiated by direct request from a terminal, or by
automatic task initiation (ATI). Most of the techniques apply to both sorts of task,
but there are some extra things to investigate for ATI tasks. Carry out the tests
which apply to all tasks first, then go on to the tests for ATI tasks if you need to.
You need to use the CETR transaction to set up the right tracing options. See
Chapter 15, “Using traces in problem determination,” on page 223 for guidance
about setting up trace options.
1. Select special tracing for just your task, and disable tracing for all other tasks by
setting the master system trace flag off.
2. Set up special tracing for the level one trace points for the components that are
likely to be used during the invocation of the task. The components you choose
will depend on how the task is initiated - by direct request from a terminal, or by
If your transaction ran, you should see the following types of trace entries for your
task and the programs associated with it:
1. Loader domain, when it loaded your program, if the program was not already in
main storage.
2. Transaction manager, when it attached your task to the dispatcher.
3. Dispatcher domain, when your task got its first dispatch. You might also see
subsequent entries showing your task being suspended, and then resumed.
4. Program manager, for any program management functions associated with your
task.
If trace entries for any of these processes are missing, that should help you to find
where the failure occurred.
Using EDF
If the transaction being tested requires a terminal, you can use EDF.
You need two other terminals for input, as well as the one that the transaction
requires (“tttt”). Use one of these others to put the transaction terminal under control
of EDF, with:
CEDF tttt
Using CEDX
You can use CEDX to debug non-terminal transactions.
CICS intercepts the transaction specified on the CEDX tranid command, and
displays the EDF diagnostic panels at the terminal at which the EDF command is
issued.
CEDX provides the same function and diagnostic display panels as CEDF, and the
same basic rules for CEDF also apply to CEDX.
Using statistics
If no one else is using the transaction in question, you can tell from CICS statistics
whether the program has been executed or not.
Using CEBR
You can use CEBR to investigate your transaction if the transaction reads or writes
to a transient data queue, or writes to a temporary storage queue. A change in such
a queue is strong evidence that the transaction ran, provided that the environment
is sufficiently controlled that nothing else could produce the same effect. You need
to be sure that no other transaction that might be executed while you are doing
your testing does the same thing.
The absence of such a change does not mean that the transaction did not run - it
might have run incorrectly, so that the expected change was not made.
Using CECI
If your transaction writes to a file, you can use CECI before and after the
transaction to look for evidence of the execution of your transaction. A change in
the file means the transaction ran. If no change occurred, that does not necessarily
mean that the transaction failed to run - it could have worked incorrectly, so that the
changes you were expecting were not made.
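For example, you might display a record before and after running the
transaction (a sketch, assuming a file named FILEA and a record key of
000100; CECI supplies a receiving area if you omit INTO):
CECI READ FILE(FILEA) RIDFLD(000100)
Comparing the displayed record contents from the two invocations shows
whether the transaction updated the file.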
You can locate it in the formatted system dump by looking at the ICP section. Look
in field ICETRNID of each ICE (the 4-character transaction ID) to see if it relates to
your task.
If you find an ICE for your task, look in field ICEXTOD. That will show you the
expiration time of day. Does it contain the value you expect? If not, either the task
which caused this one to be autoinitiated was in error, or there is a system problem.
If a task needs a resource, usually a terminal, that is unavailable, the task remains
on the AID chain until it can use the resource.
AIDs are addressed from system entries, with their forward and backward chain
pointers at offsets X'0C' and X'10' respectively. AIDs contain the following
fields, which can be useful in debugging.
AIDTYPE (X'2D')
Type of AID.
AIDSTATI (X'2E')
AID status indicator.
AID_TOR_NETNAME (X'65')
Netname of the owning region for a specific terminal
AID_TERMINAL_NETNAME (X'5D')
Netname of terminal
AIDDATID (X'34')
TS queue name holding the data.
AID_REROUTED (X'4E')
AID rerouted to a different TOR
You can see the AIDs in the TCP section of the formatted system dump. Look in
field AIDTRNID (the 4-character transaction ID) of each AID, to see if it relates to
your task.
If you do find an AID that relates to your task, your task is scheduled to start, but
cannot do so because the terminal is unavailable. Look in field AIDTRMID to find
the symbolic ID of the terminal, and then investigate why the terminal is not
available. One possibility is that the terminal is not in ATI status, because ATI(YES)
has not been specified for it in the TYPETERM definition.
For example, consider a transaction that reads records from a file, processes the
information in the records, and displays the results on a terminal. The data might be
corrupted at any of points 1 through 5, as it flows from file to terminal.
1. Data records might be incorrect, or they could be missing from the file.
2. Data from the file might be mapped into the program incorrectly.
3. Data input at the terminal might be mapped into the program incorrectly.
4. Bad programming logic might corrupt the data.
5. The data might be mapped incorrectly to the terminal.
If you find bad data in the file or data set, the error is likely to have been caused by
the program that last updated the records containing that data. If the records you
expected to see are missing, make sure that your application can deal with a
‘record not found’ condition.
If the data in the file is valid, it must have been corrupted later on in the processing.
Is the data contained in the record that is read compatible with the data declaration
in the program?
Check each field in the data structure receiving the record, making sure in particular
that the type of data in the record is the same as that in the declaration, and that
the field receiving the record is the right length.
If the program receives input data from the terminal, make sure that the relevant
data declarations are correct for that, too.
If there seems to be no error in the way in which the data is mapped from the file or
terminal to the program storage areas, the next thing to check is the program logic.
You can determine the flow of data through your transaction by “desk checking”, or
by using the interactive tools and tracing techniques supplied by CICS.
Desk checking your source code is sometimes best done with the help of another
programmer who is not familiar with the program. It is often possible for such a
person to see weaknesses in the code which you have overlooked.
Note: When you use CEBR to look at a transient data queue, the records you
retrieve are removed from the queue before they are displayed to you.
This could alter the flow of control in the program you are testing. You
can, however, use CEBR to copy transient data queues to and from
temporary storage, as a way of preserving the queues if you need to.
User tracing allows you to trace the flow of control and data through your program,
and to record data values at specific points in the execution of the transaction. You
could, for example, look at the values of counters, flags, and key variables during
the execution of your program. You can include up to 4000 bytes of data on any
trace entry, and so this can be a powerful technique for finding where data values
are being corrupted.
For programming information about how you can invoke user tracing, see the CICS
Application Programming Reference.
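As an illustration, a command like the following (the trace point number,
data area, and resource name are all arbitrary) writes a user trace entry
containing the current value of a counter:
EXEC CICS ENTER TRACENUM(77) FROM(WS-COUNTER) FROMLENGTH(4)
          RESOURCE('LOOPCHK') END-EXEC.
The entry is written to the currently active trace destinations together with
the data you supply.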
CSFE storage freeze can be used to freeze the storage associated with a terminal
or a transaction so that it is not FREEMAINed at the end of processing. This can be
a useful tool if, for example, you want to investigate possible storage violations. You
need to get a transaction dump to look at the storage after you have run the task
with storage freeze on.
For long-running tasks, there is a possibility that a large amount of storage may be
consumed because it cannot be FREEMAINed while storage freeze is on. For
short-running tasks, however, there should be no significant overhead.
If, after using these techniques, you can find no fault with the logic of the program,
the fault either lies with the way data is mapped to the terminal, or you could have
missed some important evidence.
Note: The MDT is turned on automatically if the operator types data in the
field. If, however, the operator does not type data there, the application
must turn the tag on explicitly if the field is to be read in.
v If your program changes a field attribute byte, or a write control character, look at
each bit and check that its value is correct by looking in the appropriate
reference manual for the terminal.
Even if your system uses all the CICS storage protection facilities, CICS storage
violations can occur in certain circumstances in systems using storage protection.
For example:
v An application program could contain the necessary instructions to switch to
CICS key and modify CICS storage.
v An application program could contain the necessary instructions to switch to the
basespace and modify other transactions’ storage.
v An application program could be defined with EXECKEY(CICS) and could thus
modify CICS storage and other transactions’ storage.
v An application could overwrite one or more storage check zones in its own
task-lifetime storage.
To gain the full benefit of CICS storage protection, you need to examine the storage
needs of individual application programs and control the storage key definitions that
are used.
When CICS detects and prevents an attempted storage violation, the name of the
abending program and the address of the area it tried to overwrite are passed to
the program error program (DFHPEP). For programming information about
DFHPEP, see the CICS Customization Guide.
If a storage violation occurs in your system, please read the rest of this section.
If you have received this message, turn first to the description of message
DFHSM0102 in CICS Messages and Codes to see an explanation of the message,
and then to CICS Trace Entries to see an explanation of the exception trace point
ID, X'code'. This tells you how CICS detected the storage violation. Then return to
this section, and read “CICS has detected a storage violation.”
Storage violations not detected by CICS are less easy to identify. They can cause
almost any sort of symptom. Typically, you might have got a program check with a
condition code indicating ‘operation exception’ or ‘data exception’, because the
program or its data has been overlaid. Otherwise, you might have obtained a
message from the dump formatting program saying that it had found a corrupted
data area. Whatever the evidence for the storage violation, if it has not been
detected by CICS, turn to “Storage violations that affect innocent transactions” on
page 197.
CICS detects storage violations involving TIOAs by checking the SAA chains when
it receives a command to FREEMAIN an individual element of TIOA storage, at
least as far as the target element. It also checks the chains when it FREEMAINs
the storage belonging to a TCTTE after the last output has taken place. CICS
detects storage violations involving user-task storage by checking the storage check
zones of an element of user-task storage when it receives a command to
FREEMAIN that element of storage. It also checks the chains when it FREEMAINs
all the storage belonging to a task when the task ends.
The storage violation is detected not at the time it occurs, but only when the SAA
chain or the storage check zones are checked. This is illustrated in Figure 17 on
page 193, which shows the sequence of events when CICS detects a violation of a
user task storage element. The sequence is the same when CICS detects a
violation of a TIOA storage element.
The fact that the SAA or storage check zone is overlaid some time before it is
detected does not matter too much for user storage where the trailing storage
check zone has been overlaid, because the transaction whose storage has been
violated is also very likely to be the one responsible for the violation. It is fairly
common for transactions to write data beyond the end of the allocated area in a
storage element and into the check zone. This is the cause of the violation in
Figure 17 on page 193.
The situation could be more serious if the leading check zone has been overlaid,
because in that case it could be that some other unrelated transaction was to
blame. However, storage elements belonging to individual tasks are likely to be
more or less contiguous, and overwrites could extend beyond the end of one
element and into the next.
Finding the offending transaction when the duplicate SAA of a TIOA storage
element has been overlaid might not be so straightforward. This is because TIOAs
tend to have much longer lifetimes than tasks, because they wait on the response
of terminal operators. By the time the storage violation is detected, the transaction
that caused it is unlikely to still be in the system. However, the techniques for
CICS-detected violations still apply.
Note: For storage elements with SAAs, the address that is returned on the
GETMAIN request is that of the leading SAA; for storage elements with
storage check zones, the address that is returned is that of the beginning of
usable storage.
If you have suppressed dumping for this dump code, re-enable it and attempt to
reproduce the error. The system dump is an important source of information for
investigating CICS-detected storage violations.
If storage recovery is not on, CICS abends the transaction whose storage has been
violated (if it is still running). If the transaction is running when the error is detected
and if dumping is enabled for the dump code, a transaction dump is taken.
If you received a transaction abend message, read “What the transaction abend
message can tell you” on page 194. Otherwise, go on to “What the CICS system
dump can tell you” on page 194.
Because CICS does not detect the overlay at the time it occurs, the program
identified in the abend message probably is not the one in error. However, it is likely
that it issued the FREEMAIN request on which the error was detected. One of the
other programs in the abended transaction might have violated the storage in the
first place.
The dump formatting program reports the damaged storage check zone or SAA
chain when it attempts to format the storage areas, and this can help you with
diagnosis by identifying the TCA or TCTTE owning the storage.
When you have formatted the dump, take a look at the data overlaying the SAA or
storage check zone to see if its nature suggests which program put it there. There
are two places you can see this, one being the exception trace entry in the internal
trace table, and the other being the violated area of storage itself. Look first at the
exception trace entry in the internal trace table to check that it shows the data
overlaying the SAA or storage check zone. Does the data suggest what program
put it there? Remember that the program is likely to be part of the violated
transaction in the case of user storage. For terminal storage, you probably have
more than one transaction to consider.
As the SAAs and storage check zones are only 8 bytes long, there might not be
enough data for you to identify the program. In this case, find the overlaid data in
the formatted dump. The area is pointed to in the diagnostic message from the
dump formatting program. The data should tell you what program put it there, and,
more importantly, what part of the program was being executed when the overlay
occurred.
If the investigations you have done so far have enabled you to find the cause of the
overlay, you should be able to fix the problem.
Tracing must also be active, or CICS will do no extra checking. The CSFE
transaction has the advantage that you need not bring CICS down before you can
use it.
Table 20 shows the CSFE DEBUG options and their effects. Table 21 shows the
startup overrides that have the same effects.
Table 20. Effects of the CSFE DEBUG transaction
CSFE syntax                   Effect
CSFE DEBUG,CHKSTSK=CURRENT    This checks storage check zones for all
                              storage areas on the transaction storage
                              chain for the current task only.
Your strategy should be to have the minimum tracing that will capture the
storage violation, to reduce the processing overhead and to give you less
trace data to process. Even so, you are likely to get a large volume of trace
data, so direct the trace entries to the auxiliary trace data sets.
You need to have only level-1 tracing selected, because no user code is executed
between level-2 trace points. However, you do not know which calls to CICS
components come before and after the offending code, so you need to trace all
CICS components in AP domain. (These are the ones for which the trace point IDs
have a domain index of “AP”.) Set level-1 tracing to be special for all such
components, so that you get every AP level-1 trace point traced using special task
tracing.
If the trailing storage check zone of a user-storage element has been overlaid,
select special tracing for the corresponding transaction only. This is because it is
very likely to be the one that has caused the overlay.
If the duplicate SAA of a TIOA has been overlaid, you need to select special tracing
for all tasks associated with the corresponding terminal, because you are not sure
which has overlaid the SAA. It is sufficient to select special tracing for the terminal
and standard tracing for every transaction that runs there, because you get special
task tracing with that combination. (See Table 25 on page 233.)
When you have set up the tracing options and started auxiliary tracing, you need to
wait until the storage violation occurs.
The value of 'code' is equal to the exception trace point ID, and it identifies the type
of storage that was being checked when the error was detected. A description of
the exception trace point ID, and the data it contains, is in CICS Trace Entries.
Format the system dump using the formatting keyword TR, to get the internal trace
table. Locate the exception trace entry made when the storage violation was
detected, near the end of the table. Now scan back through the table, and find the
last old-style trace entry (AP 00xx). The code causing the storage violation was
being executed between the time that the trace entry was made and the time that
the exception trace entry was made.
If you have used the CHKSTSK=CURRENT option, you can locate the occurrence
of the storage violation only with reference to the last old-style trace entry for the
current task.
You need to identify the section of code that was being executed between the two
trace entries from the nature of the trace calls. You then need to study the logic of
the code to find out how it caused the storage violation.
If they are reproducible, storage violations of this type typically occur at specific
offsets within structures. For example, the start of an overlay might always be at
offset 30 from the start of a field.
The most likely cause of such a violation is a transaction writing data to a part of
the DSAs that it does not own, or possibly FREEMAINing such an area. The
transaction might previously have GETMAINed the area and then FREEMAINed it
before writing the data, or addressability might otherwise not have been correctly
maintained by an application. Another possible reason is that an ECB might have
been posted by a transaction after the task that was waiting on it had been
canceled.
Look carefully at the content of the overlay before you do any other investigation,
because it could help you to identify the transaction, program, or routine that
caused the error. If it does not provide the clue you need, your strategy should be
to use CICS tracing to collect a history of all the activities that reference the
affected area.
The trace table must go back as far as task attach of the program causing the
overlay, because that trace entry relates the transaction’s identity to the unit of work
number used on subsequent entries. This could mean that a very large trace table
is needed. Internal trace is not suitable, because it wraps when it is full and it then
overwrites important trace entries.
Auxiliary trace is a suitable destination for recording long periods of system activity,
because it is possible to specify very large auxiliary trace data sets, and they do not
wrap when they are full.
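For example, the following system initialization overrides (illustrative
values) start auxiliary trace at initialization and allow a single switch to
the second data set, so that earlier entries cannot be overwritten:
AUXTR=ON,AUXTRSW=NEXT
The capacity available is determined by the sizes of the DFHAUXT and DFHBUXT
data sets defined in the startup JCL.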
If you have no idea which transaction is causing the overlay, you need to trace the
activities of every transaction. This impacts performance, because of the processing
overhead.
If you are unable to identify the cause of the storage violation after carrying out the
procedures of the preceding section, contact your IBM Support Center. They might
suggest coding a global trap/trace exit to detect the storage violation.
In normal operation, CICS sets up four task-lifetime storage subpools for each task.
Each element in the subpool starts and ends with a check zone that includes the
subpool name. At each FREEMAIN, and at end of task, CICS inspects the check
zones and abends the task if either has been overwritten.
Terminal input-output areas (TIOAs) have similar check zones, each of which is set
up with the same value. At each FREEMAIN of a TIOA, CICS inspects the check
zones and abends the task if they are not identical.
DFHEX0001 DFHEX0002 DFHEX0003 DFHEX0004 DFHEX0005 DFHEX0010
DFHEX0011 DFHEX0012 DFHEX0013 DFHEX0014 DFHEX0015 DFHEX0016
Messages DFH5502W and DFH5503E include support for the external CICS
interface facility.
The external CICS interface outputs trace to two destinations: an internal trace table
and an external MVS GTF data set. The internal trace table resides in the
non-CICS MVS batch region. Trace data is formatted and included in any dumps
produced by the external CICS interface.
The external CICS interface issues trace entries destined for the internal
trace table, an MVS GTF data set, or both. They are listed in CICS Trace
Entries.
The external CICS interface produces MVS SYSMDUMPs for some error conditions
and MVS SDUMPs for other, more serious conditions. These dumps contain all the
external CICS interface control blocks, as well as trace entries. You can use IPCS
to format these dumps.
For detailed problem determination information about the external CICS interface
including information about trace, system dumps and MVS abends, see the CICS
External Interfaces Guide.
Categories of problem
The following categories of problem (in order of ascending impact on the user) may
be encountered by the CICS log manager.
1. Those problems within the MVS logger that the MVS logger resolves for itself.
CICS has no involvement in this category and might only experience the
problem as an increase in response times.
2. Where the MVS logger is unable to satisfy the CICS log manager's request
immediately. This problem state can be encountered:
v For a log stream that uses a coupling facility structure, on a 'STRUCTURE
FULL' condition, where the coupling facility has reached its capacity before
offloading data to DASD. This state may also be encountered during the
rebuilding of a coupling facility structure.
v For a DASD-only log stream, on a 'STAGING DATA SET FULL' condition,
where the staging data set has reached its capacity before offloading data to
secondary storage.
If either of these conditions occurs, CICS issues message DFHLG0771 (for a
general log) or DFHLG0777 (for a system log). The CICS log manager retries
the request every three seconds until the request is satisfied. Typically, this can
take up to a minute.
3. If the MVS logger fails, CICS is abended. If the system log has not been
damaged, a subsequent emergency restart of CICS should succeed.
4. If a return code implies that the CICS system log has been damaged, CICS is
quiesced, meaning transactions are allowed to run to completion as far as
possible, with no further records being written to the system log. To get CICS
back into production, you must perform an initial start. However, before doing so
you may want to perform a diagnostic run, to gather information for problem
diagnosis - see “Dealing with a corrupt system log” on page 218.
If a return code implies damage to a forward recovery log or autojournal, all files
using the log stream are quiesced and their transactions run to completion.
Message DFHFC4800, DFHFC4801, or DFHFC4802 is issued. User
transactions writing journal records to the log stream experience a write error.
For a forward recovery log, before you can continue to use the log stream, you
must:
a. Take an image copy of all data sets referencing the log stream.
b. Redefine the log stream.
System log
You are strongly recommended to allow the CICS log manager to manage the size
of the system log. If you do so, you do not need to worry about the data set limit
being exceeded.
In the unlikely event that you need to retain data beyond the time it would be
deleted by CICS, see the CICS Transaction Server for z/OS Installation Guide for
advice on how to define the system log.
General logs
If a journal write to a user journal fails because the data set limit is reached, you
must delete the tail of the log, or archive it, before you can use the SET
JOURNALNAME command to open the journal and make it available for use again. For
an example of how to do this, see the CICS Operations and Utilities Guide.
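For example, after archiving the data, a command like the following (the
journal name DFHJ02 is illustrative) is one way to reset the failed journal,
so that the next write attempts to reconnect to the log stream:
CEMT SET JOURNALNAME(DFHJ02) RESET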
v The number of data sets per log stream recognized by the MVS logger is several
million. In normal circumstances, you do not need to be concerned about the limit
being exceeded.
v You can cause redundant data to be deleted from log streams automatically, after
a specified period. To arrange this for general log streams, define the logs to
MVS with AUTODELETE(YES) and RETPD(dddd), where dddd is the number of
days for which data is to be retained. This causes the MVS logger to delete an
entire log data set when all the data in it is older than the retention period
(RETPD) specified for the log stream.
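For example, the following IXCMIAPU job (a sketch only; the stream name and
attribute values are illustrative, and a coupling facility log stream would
specify STRUCTNAME rather than DASDONLY) defines a general log stream whose
data is deleted when it is more than 30 days old:
//DEFLOG   JOB NOTIFY=WILLIN,MSGCLASS=A
//LOGDEF   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=A
//SYSIN    DD *
  DATA TYPE(LOGR) REPORT(NO)
  DEFINE LOGSTREAM NAME(WILLIN.IYLX4.DFHJ02)
         DASDONLY(YES) AUTODELETE(YES) RETPD(30)
/*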
The interval at which CICS checks for the availability of the MVS logger varies,
depending on the amount of system logging activity in the CICS region. The first
check is made after CICS has not made contact with the MVS logger for 10
seconds. If CICS continues to perform no system logging after the first check, the
interval between checks doubles each time, up to a maximum of 600 seconds. If
CICS makes contact with the MVS logger at any point, the interval between checks
is halved, down to a minimum of 10 seconds.
The checking interval can be affected by the exit time interval specified in the ICV
system initialization parameter, as follows:
v If the value specified in the ICV system initialization parameter is less than 10
seconds, it has no effect on the checking interval.
v If the value specified in the ICV system initialization parameter is greater than 10
seconds but less than 600 seconds, the checking interval varies between the
value specified in the ICV system initialization parameter, and 600 seconds. The
first check is made after an interval corresponding to the value in the ICV system
initialization parameter, instead of being made after 10 seconds. The minimum
checking interval is the value in the ICV system initialization parameter.
v If the value specified in the ICV system initialization parameter is greater than
600 seconds, the checking interval does not vary, and always corresponds to the
value in the ICV system initialization parameter.
The statistics field IGXQUERY in the CICS log manager statistics enables you to
monitor the number of checks that CICS makes for the availability of the MVS
logger.
The following CICS log manager messages cover some of the CICS logger failure
situations. The more common message combinations are as follows:
DFHLG0772, DFHLG0800, and DFHLG0738
DFHLG0772, DFHLG0800, DFHLG0736, and DFHLG0741
DFHLG0772 and DFHLG0740
DFHLG0772 and DFHLG0734
DFHLG0002 and DFHLG0734
Note: For details of all the return and reason codes for the MVS logger macros,
see z/OS MVS Programming: Authorized Assembler Services Reference
ENF-IXG.
The MVS logger return and reason codes for the failure are given in the message,
together with the name of the call and the attributes of the log stream being
accessed at the time. Message DFHLG0772 is followed by one or more other
messages when CICS has determined the extent of the problem.
CICS takes a system dump at the time the message is issued. This is the primary
source of information for determining the cause of the error, and a dump from a
DFHLG0772 should not be suppressed. See “Setting a SLIP trap” on page 216 for
information on how to capture dumps of the MVS system logger address space and
the coupling facility structure at the same time. These three pieces of
documentation are essential if you refer the problem to IBM service. You are also
recommended to run the DFHJUP utility to obtain printed output from the DFHLOG
and DFHSHUNT system log streams before you restart a failed region.
If CICS decides the data integrity is compromised, or the problem is too serious to
allow continued operation, it marks the system log as broken. CICS then begins
automatic action to shut itself down.
Note: The quiesce of CICS initiated with message DFHLG0736 continues until
the in-flight tasks on the system complete, either successfully by
committing their updates, or by abending. Those tasks that attempt a
backout are suspended forever. CICS, therefore, is unable to complete a
normal shutdown operation and hangs, requiring intervention to be
terminated. This intervention can be by one of the following:
v Operator action
v The shutdown assist transaction
v A CICS monitor package.
The intervention is required because there is at least one task
suspended indefinitely in an LGFREVER wait.
After DFHLG0800 and DFHLG0738, ensure that you perform a diagnostic start,
followed by an initial start when you have successfully captured the diagnostics.
See “Restarting CICS after a system log failure” on page 211 for details.
Message DFHLG0002
This is a general message that is issued when a severe error has occurred
within the CICS log manager domain. The module in error is identified in the
message.
If CICS issues DFHLG0002, but determines that an emergency restart may
resolve the error and successfully recover in-flight tasks, CICS issues
DFHLG0734.
DFHLG0734
This indicates a severe exception condition, identified by the reason code
in the preceding DFHLG0002 message, and CICS immediately terminates. The
problem should be investigated and the error corrected before restarting
CICS.
On a diagnostic run, CICS produces a dump of the CICS region state, retrieved
from the CICS system log, and then terminates. It performs no recovery work
and no new work. This situation persists until you start the region with an
initial start.
For information about the AUTODIAG type-of-start override record, see the CICS
Operations and Utilities Guide. For more details of a diagnostic run, see “Dealing
with a corrupt system log” on page 218.
When you have obtained the required diagnostics and are ready to restart the
region with the broken system log, you can do so only with an initial start.
You can do this either by running the DFHRMUTL utility with the
SET_AUTO_START=AUTOINIT parameter, or by specifying START=INITIAL as a
system initialization parameter.
An initial start is the only form of CICS startup that does not refer to log
data written during the previous run. It is the only restart that is possible
in these circumstances.
Log stream data sets are of the form IXGLOGR.stream_name.Annnnnnn. The high level
qualifier (IXGLOGR) may be different if the HLQ parameter was specified when the
log stream was defined.
Explanations of the MVS logger reason codes that are shown in CICS and MVS
messages and traces are in the IXGCON macro and in the OS/390 MVS Assembler
Services Reference manual.
The RO *ALL phrase means that the command goes to all systems in the sysplex:
RO *ALL,D GRS,C
RO *ALL,D GRS,RES=(SYSZLOGR,*)
D GRS,RES=(SYSZLOGR,*)
A response showing GRS contention looks like this. You may also see latch set
name SYS.IXGLOGER_MISC:
D GRS,RES=(SYSZLOGR,*)
This shows which tasks (that is, MVS TCBs) have exclusive enqueues on the log
streams, and which tasks are waiting for them. It is quite normal for enqueues and
latches to be obtained, occasionally with contention. They are indications of a
problem only if they last for more than a minute or so.
Long term enqueuing on the SYSZLOGR resource can be a sign of problems even
if there is no contention.
You can choose to display only those log streams exclusively enqueued on by CICS
jobs in the sysplex. Issue the following MVS command:
D GRS,RES=(DFHSTRM,*)
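If you suspect a problem with the LOGR couple data set itself, you can
display its status with the following MVS command (shown here as an
illustration):
D XCF,COUPLE,TYPE=LOGR
The response shows, for each system in the sysplex, whether the LOGR couple
data set is in use.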
If the response shows that LOGR is not in use by all systems, there may be a
problem to investigate. Look for IXCxxx messages which might indicate the cause
of the problem and issue the following command to attempt reconnection to the
couple data set:
SETXCF CPL,TYPE=(LOGR),PCOUPLE=(couple_dataset_name)
To display all structures with Failed_persistent connections, issue the following MVS
command:
D XCF,STR,STRNM=*,STATUS=FPCONN
You can use wildcards to select multiple log streams. For example, the following job
produces a report on the system log streams for CICS region IYLX4:
//IYLXLIST JOB NOTIFY=WILLIN,MSGCLASS=A
//LOGLIST EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=A,DCB=RECFM=FBA
//SYSIN DD *
DATA TYPE(LOGR) REPORT(NO)
LIST LOGSTREAM NAME(WILLIN.IYLX4.DFH*) DETAIL(YES)
Figure 18 on page 215 shows a typical response to this command, with system
log streams for CICS region IYLX4. The report includes, for each log stream,
the user data and the number of systems connected.
A dump of XCF and MVS logger address spaces from all systems is useful in the
diagnosis of such problems. To obtain the dump, issue the following series of MVS
commands:
DUMP COMM=(meaningful dump title)
R ww,JOBNAME=(IXGLOGR,XCFAS,cics_jobname),DSPNAME=(’IXGLOGR’.*,’XCFAS’.*),CONT
R xx,STRLIST=(STRNAME=structure,(LISTNUM=ALL),ACC=NOLIM),CONT
R yy,REMOTE=(SYSLIST=*(’XCFAS’,’IXGLOGR’),DSPNAME,SDATA),CONT
R zz,SDATA=(COUPLE,ALLNUC,LPA,LSQA,PSA,RGN,SQA,TRT,CSA,GRSQ,XESDATA),END
Error records written to the MVS LOGREC data set may also be useful.
If you have applied MVS APAR OW27057, a dump of the MVS logger address
space is produced automatically if an MVS IXGBRWSE or IXGDELET request fails
because the MVS logger cannot find a specific log stream block identifier. (The
MVS logger issues a return code of 8 with a reason code of 804.) To cater for other
possible logger errors, or to obtain a dump of the coupling facility structure
associated with a failing log stream, you can set an MVS serviceability level
indication processing (SLIP) trap. Setting a SLIP trap causes MVS to take a
specified set of actions when a specified event occurs. For example, you could
specify that MVS is to take a dump of the MVS logger address space if CICS
issues a particular message.
Figure 19 on page 217 shows an example SLIP trap that captures a dump of the
CICS address space, the MVS logger address space, and the coupling facility
structure associated with the failing log stream.
-->response xx
xx,DSPNAME=(’XCFAS’.*,’IXGLOGR’.*),STRLIST
=(STRNAME=structname,LOCKENTRIES,ACC=NOLIM <change STRNAME
,(LISTNUM=ALL,
-->response yy
yy,ENTRYDATA=SERIALIZE,ADJUNCT=CAPTURE)),S
DATA=(RGN,XESDATA,ALLNUC,CSA,LSQA,PSA,SQA,
SWA,TRT,COUPLE,WLM,GRSQ,LPA),
-->response zz
zz,ID=LOGR,REMOTE=(JOBLIST,DSPNAME,SDATA),
END
Figure 19. An example SLIP trap. The trap is triggered when CICS issues a DFHLG0772
message. It captures dumps of the CICS address space, the MVS logger address space, and
the coupling facility structure associated with the failing log stream.
In this example, the SLIP triggers when a specific CICS log manager message
DFHLG0772 is written to the console. This is specified in the EQ parameter of the
SLIP:
+4,EQ,C4C6C8D3,+8,EQ,C7F0F7F7,+C,EQ,F2)
      D F H L       G 0 7 7       2      <equates to DFHLG0772
You can also set a more “generic” trap, that is triggered by the occurrence of any
one of a range of messages. For example, to cause the SLIP to be triggered by any
log manager message in the DFHLG07xx range, alter the value of the EQ parameter
to:
+4,EQ,C4C6C8D3,+8,EQ,C7F0F7),
      D F H L       G 0 7            <equates to DFHLG07
Note:
1. The example SLIP will just fit into the extended operator command area
of MVS Version 5 or later.
2. The example SLIP may result in extra dumps being produced for both
CICS and the MVS logger address space.
For definitive information about setting SLIP traps, see the OS/390 MVS
Diagnosis: Tools and Service Aids manual, SY28-1085-03.
CAUTION:
If you forcibly cancel the MVS logger address space (by issuing a FORCE
IXGLOGR,ARM command) or coupling facility structures used by the MVS logger
(by issuing a SETXCF FORCE,CON,STRNAME=structname,CONNAME=ALL command),
there is a risk of corruption in the CICS system logs. If the system log is
corrupted, CICS issues a message telling you that you need to perform an
initial start. Data integrity will be compromised because of the loss of log data
required for consistency.
To prevent the problem recurring, you also need to gather diagnosis information that
will enable IBM Service to discover why the log was corrupted. Unfortunately,
performing an initial start destroys all information from the previous run of CICS. To
gather diagnostic information:
1. Scan the failed system log, using a utility such as DFHJUP. However, the output
produced by DFHJUP in these circumstances is not easy to interpret.
2. To supplement DFHJUP's output, perform a diagnostic run of CICS, using the
corrupt system log, before performing the initial start.
a. Specify AUTO on the START system initialization parameter. If the system log
becomes corrupt, CICS sets its autostart override record so that the next
AUTO start is a diagnostic run.
For reliable diagnosis, it is important that you have dumps of the MVS logger
address space and (if applicable) the coupling facility structures used by the system
log.
This means that, before performing the diagnostic run, you will probably need to set
a SLIP trap, as described in “Setting a SLIP trap” on page 216.
You do not need to specify a dump of the CICS system, because one is taken
automatically by the diagnostic run mechanism.
3. Specify a dump of the MVS logger address space. See the example SLIP. If you
have applied MVS APAR OW27057, and the original failure occurred because
the MVS logger was unable to find a specific log stream block identifier, an
extra dump may be produced.
4. If the system log uses coupling facility log streams, specify a dump of the
coupling facility structure. You can get the name of the structure from the two
DFHLG0104 messages that were issued when CICS connected to DFHLOG
and DFHSHUNT during the run in which the failure occurred.
If DFHLOG and DFHSHUNT use separate coupling facility structures, dump
both structures. Specify the names of both structures on the STRLIST parameter.
The types of tracing that can be used for CICS systems are:
v CICS tracing, which is performed by the trace domain at predetermined trace
points in CICS code during the regular flow of control. This includes user tracing
from applications. You get this when you turn on CICS internal tracing, auxiliary
tracing, and GTF tracing. You control this type of tracing to suit your needs,
except that, when an exception condition is detected by CICS, it always makes
an exception trace entry. You cannot turn exception tracing off.
v CICS exit programming interface (XPI) tracing, which uses the TRACE_PUT XPI
call from an exit program. You can control this within the exit program, or by
enabling and disabling exits.
v CICS XRF tracing, which records CICS XRF-related activities. This is always
running if you are operating in a CICS XRF environment.
v Program check and abend tracing, which is used by CICS to record pertinent
information when a program check or abend occurs. This is controlled by CICS
code.
v CICS VTAM exit tracing. The exits are driven by VTAM when it reaches a
particular stage in its asynchronous processing, but the trace points are in CICS
code. You can turn CICS VTAM exit tracing on or off.
v VTAM buffer tracing. This is a part of VTAM, but it can be used to record the flow
of data between logical units in the CICS environment. You can control this type
of tracing to meet your needs.
In addition to the general trace produced by CICS, there are a number of other,
more specialized forms of trace that you can use. These are:
v CICS exception tracing
v CICS XRF tracing
v Program check and abend tracing
v CICS VTAM tracing
v FEPI trace.
For information about using trace to solve FEPI problems, see the CICS Front End
Programming Interface User's Guide.
You have a large amount of control over the amount of CICS tracing that is done.
The following selection mechanisms determine the extent of CICS tracing carried
out in the system:
v “Selecting tracing by transaction” on page 232
v “Selecting tracing by component” on page 234
v “Selecting trace destinations and related options” on page 240
v “Setting the tracing status” on page 243
You can select any combination of internal tracing, auxiliary tracing and GTF tracing
to be active at the same time. Your choice has no effect on the selectivity with
which system tracing is done, but each type of tracing has a set of characteristic
properties. These properties are described in “CICS internal trace” on page 241
and the sections that follow it.
CICS tracing
General CICS tracing is handled by trace domain. It traces the flow of execution
through CICS code, and through your applications as well. You can see what
functions are being performed, which parameters are being passed, and the values
of important data fields at the time trace calls are made. This type of tracing is also
useful in “first failure data capture”, if an exception condition is detected by CICS.
For programming information about how to make trace calls from within your own
programs, see the CICS Application Programming Reference.
Trace points
Trace points are included at specific points in CICS code; from these points, trace
entries can be written to any currently selected trace destination. All CICS trace
points are listed in alphanumeric sequence in CICS Trace Entries.
Trace levels
Some trace points are used to make exception traces when exception conditions
occur, and some are used to trace the mainline execution of CICS code. Trace
points of the latter type each have an associated “level” attribute. The value of this
attribute depends on where the trace point is, and the sort of detail it can provide
on a trace call.
Trace levels can, in principle, vary in value in the range 1–32, but in practice nearly
all mainline trace points have a trace level of 1 or 2.
Level-1 trace points are designed to give you enough diagnostic information to fix
“user” errors. The following is a summary of where they are located, and a
description of the information they return:
v On entry to, and exit from, every CICS domain. The information includes the
domain call parameter list, and data whose address is contained in the
parameter list if it is necessary for a high-level understanding of the function to
be performed.
v On entry to, and exit from, major internal domain functions. The information
includes parameters passed on the call, and any output from the function.
v Before and after calls to other programs, for example, VTAM. The information
includes what request is to be made, the input parameters on the request, and
the result of the call.
v At many of the points where trace calls were made in CICS/MVS Version 2. The
type of information is the same as for that release.
Level-2 trace points are situated between the level-1 trace points, and they
provide information that is likely to be more useful for fixing errors within CICS
code. You probably will not want to use level-2 trace points yourself, unless you are
requested to do so by IBM support staff after you have referred a problem to them.
Level-3 trace points and above are reserved for special cases. Very few
components have trace points higher than 2, and they are only likely to be of use
by IBM support staff. The SJ domain uses trace levels 29–32 to control JVM
tracing; these correspond to JVM trace levels 0, 1, and 2, plus a user-definable
trace level.
You can select how much CICS system tracing is to be done on the basis of the
trace level attributes of trace points. You can make your selection independently for
each CICS component, and you can also vary the amount of tracing to be done for
each task. This gives you control over what system tracing is done.
Note: In the storage manager component (SM), two levels of tracing, level 3 and
level 4, are intended for IBM field engineering staff. These trace levels take
effect only if specified in system initialization parameters and modify the
internal SM operation for CICS subpools as follows:
SM level 3 trace
The quickcell mechanism is deactivated. Every CICS subpool,
regardless of quickcelling requirements, will issue domain calls for
getmain and freemain services, and these calls will be traced.
SM level 4 trace
Subpool element chaining on every CICS subpool is forced. Every
CICS subpool, regardless of element chaining requirements, will use
element chaining.
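If IBM support staff ask you to activate these levels, you can do so with the trace
level system initialization parameters. For example, the following parameter is a
sketch of how SM level 3 tracing might be requested at startup (check the exact
STNTRSM syntax in the CICS System Definition Guide):
STNTRSM=(1,2,3)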
CICS uses a similar mechanism for both exception tracing and “normal” tracing.
Exception trace entries are made from specific points in CICS code, and data is
taken from areas that might provide information about the cause of the exception.
The first data field in the trace entry is usually the parameter list from the last
domain call, because this can indicate the reason for the exception.
The exception trace points do not have an associated “level” attribute, and trace
calls are only ever made from them when exception conditions occur.
Exception trace entries are always written to the internal trace table, even if no
trace destinations at all are currently STARTED. That is why an internal trace table
is always present in virtual storage, whatever tracing status you have set.
You can select tracing options so that only exception traces are made to an
auxiliary trace data set. This is likely to be useful for production regions, because it
enables you to preserve exception traces in auxiliary storage without incurring any
general tracing overhead. You need to disable all standard and special task tracing,
and enable auxiliary trace:
1. Ensure that special tracing has not been specified for any task.
2. Set the master system trace flag off.
3. Set the auxiliary trace status to STARTED, and the auxiliary trace data set and
the auxiliary switch status to whatever values you want.
Exception traces are now made to an auxiliary trace data set, but there is no other
tracing overhead.
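For example, you could establish this setup at system initialization with parameters
like the following (a sketch; choose the auxiliary switch value that suits your data
sets):
SYSTR=OFF,AUXTR=ON,AUXTRSW=ALL
SYSTR=OFF sets the master system trace flag off, AUXTR=ON starts auxiliary trace, and
AUXTRSW=ALL allows automatic switching between the two auxiliary trace data sets.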
The format of an exception trace entry is almost identical to that of a normal trace
entry. However, you can identify it by the eye-catcher *EXC* in the header.
Note: Exception conditions that are detected by MVS, for example, operation
exception, protection exception, or data exception, do not cause a CICS
exception trace entry to be made directly. However, they do cause a CICS
recovery routine to be invoked, and that, in turn, causes a “recovery”
exception trace entry to be made.
The user exception trace entries CICS writes are identified by the character string
*EXCU in any formatted trace output produced by CICS utility programs. For
example, an application program exception trace entry generated by an EXEC
CICS ENTER TRACENUM() EXCEPTION command appears in formatted trace
output as:
USER *EXCU - APPLICATION-PROGRAM-EXCEPTION
If you use the exit programming interface (XPI) trace control function to write user
trace entries, you can use the DATA1 block descriptor to indicate whether the entry
is an exception trace entry. Enter the literal ‘USEREXC’ in the DATA1 field on the
DFHTRPTX TRACE_PUT call to identify an exception trace entry. This is
interpreted by the trace formatting utility program as follows:
USER *EXCU - USER-EXIT-PROGRAM-EXCEPTION
See the CICS Customization Guide for programming information about the XPI
trace control function.
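For example, an exit program might code a call of the following general form (a
sketch only; the label names are illustrative, and the full invocation requirements
are given in the CICS Customization Guide):
DFHTRPTX CALL,
      CLEAR,
      IN,
      FUNCTION(TRACE_PUT),
      POINT_ID(TRACE_POINT),
      DATA1(EXC_DATA,EXC_DATA_LEN),
      OUT,
      RESPONSE(RESPONSE),
      REASON(REASON)
Here, the storage addressed by EXC_DATA would begin with the literal ‘USEREXC’, so
that the trace formatting utility reports the entry as a user exit program exception
trace.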
Note that CICS XRF tracing is quite distinct from the “normal” CICS tracing that can
originate from the CAVM, which is identified by trace point IDs AP 00C4 through
AP 00C7.
The XRF trace entries are 32 bytes long and are written to a trace table in main
storage. The table has a fixed size of 64KB, and it wraps around when it is full.
The table starts with 28 bytes of control information, in the format shown in
Table 22.
Table 22. Control information at the start of the XRF trace table
Bytes Contents
0–15 '*** XRF TRACE **'
16–19 Address of start of trace entries
20–23 Address of end of trace entries
24–27 Address of end of most recent entry
Trace entries are 32 bytes long, and have the format shown in Table 23.
Table 23. Format of an XRF trace entry
Bytes Contents
0 Type code
1 Subtype
2–3 Process ID of XRF process that made the entry
4–27 Trace data—the format depends on the type or the subtype
28–31 Clock value when entry was made, same format as “normal” CICS trace
entries
Process IDs are assigned in order of process ATTACH starting from 1. Some
special values are used for processes which are not known to the dispatcher, but
which cause trace entries to be made. These are:
Process ID Function
X'0000' Initial attach
X'FFFE' ESPIE/ESTAE error handling
X'FFFF' Dispatcher activities.
Entry types
The entries are as follows:
Table 24. XRF trace entry types
Module      Type  Subtype  Description
DFHWLGET    1     1        Module entry. Bytes 4–11: module name;
                           bytes 12–15: LIFO allocation address
You cannot format the program check and abend trace information directly, but you
get a summary of its contents in a formatted CICS system dump when you specify
dump formatting keyword KE. The information is provided in the form of a storage
report for each task that has had a program check or an abend during the current
run of CICS.
You can control CICS VTAM exit tracing online, using the CETR transaction. See
Figure 20 on page 233 for an illustration of the screen you need to use.
When CICS issues a VTAM request, VTAM services the request asynchronously
and CICS continues executing. When VTAM has finished with the request, it returns
control to CICS by driving a CICS VTAM exit. Every such exit contains a trace
point, and if CICS VTAM exit tracing is active, a trace entry is written to the GTF
trace data set. GTF tracing must be active, but you do not need to start it explicitly
from CICS. It is enough to start VTAM exit tracing from the CETR transaction and
terminal trace panel.
Note: The GTF trace data set can receive trace entries from a variety of jobs
running in different address spaces. You need to identify the trace entries
that have been made from the CICS region that interests you. You can do
this by looking at the job name that precedes every trace entry in the
formatted output.
You can use this type of tracing in any of the cases where you might want to use
VTAM buffer tracing, but it has the advantage of being part of CICS and, therefore,
controllable from CICS. This means that you do not need a good understanding of
VTAM system programming to be able to use it. CICS VTAM exit tracing also has
the advantage of tracing some important CICS data areas relating to VTAM
requests, which might be useful for diagnosing problems.
If you select “normal” CICS tracing for the affected terminals at the same time as
you have CICS VTAM exit tracing running, you can then correlate CICS activities
more easily with the asynchronous processing done by VTAM.
If you need to turn on CICS VTAM exit tracing in an application owning region
(AOR) while you are signed on to a terminal in a terminal owning region (TOR),
follow these steps:
1. Invoke CETR on the AOR.
2. Press PF5 to call up the CETR transaction and terminal trace screen.
3. Enter the APPLID of the TOR in the NETNAME field.
4. Complete other fields as required.
5. Press Enter.
CICS VTAM trace entries are always written to the GTF trace data set, and you can
format them in the usual way. See Chapter 16, “Formatting and interpreting trace
entries,” on page 247 for more information. Direct all “normal” CICS tracing to the
GTF trace destination as well, so you get the regular trace entries and the CICS
VTAM exit trace entries in sequence in a single data set. If you send the normal
tracing to another destination, you get only the isolated traces from the exit
modules, with no indication of the related CICS activity.
The trace entries, which include the netname of the terminal to which they relate,
are made to the GTF trace data set. If you also send “normal” CICS trace
entries there, you can correlate the activities of CICS with the asynchronous
activities of VTAM. For details of VTAM buffer tracing, see the appropriate manual
in the VTAM library.
For each component, you can specify two sets of trace level attributes. The trace
level attributes define the trace point IDs to be traced for that component when
standard task tracing is being done and when special task tracing is being done,
respectively.
If you are running a test region, you probably have background tracing most of the
time. In this case, the default tracing options (standard tracing for all transactions,
and level-1 trace points only in the standard set for all components) probably
suffice. All you need do is to enable the required trace destinations and set up any
related tracing options. Details are given in “Selecting trace destinations and related
options” on page 240.
When specific problems arise, you can set up special tracing so you can focus on
just the relevant tasks and components. Use this procedure to specify the tracing
you need:
1. If you believe that specific tasks are involved in the problem, use special
tracing:
v When the problem is associated with a non-terminal task, or is associated
with particular transactions, select special tracing for each suspect
transaction.
v When the problem is associated with particular terminals, select special
tracing for each suspect terminal.
2. If you believe that specific components are implicated in the problem:
a. For each suspected component, decide whether you need special level-1
tracing only, or level-1 and level-2 tracing.
b. Turn special tracing off for all other components.
3. If you do not need standard tracing, turn the master system trace flag off.
4. Enable the trace destinations.
The type of task tracing you get for the various combinations of transaction tracing
and terminal tracing is summarized in the truth table shown in Table 25.
Table 25. The combination of task trace options
Option on TRANSACTION   Option on TERMINAL   Task tracing
tracing suppressed      standard tracing     SUPPRESSED
tracing suppressed      special tracing      SUPPRESSED
standard tracing        standard tracing     STANDARD
standard tracing        special tracing      SPECIAL
special tracing         standard tracing     SPECIAL
special tracing         special tracing      SPECIAL
You can set up the task tracing you want using the CETR transaction, with the
screen shown in Figure 20. You need to type in the transaction ID or the terminal ID
or the netname for the terminal, together with the appropriate tracing.
The status can be any one of STANDARD, SPECIAL, or SUPPRESSED for the
transaction, and either STANDARD or SPECIAL for the terminal.
This screen can also be used to set up certain other terminal tracing options. You
can select ZCP tracing for a named terminal (trace point ID AP 00E6), and you can
also select CICS VTAM exit tracing for the terminal. For more details about CICS
VTAM exit tracing, see “CICS VTAM exit tracing” on page 230.
The CETR transaction can, for example, help you to get standard tracing for a
transaction when it is run at one terminal, and special tracing when it is run at a
second terminal.
CETR Transaction and Terminal Trace
Type in your choices.
Item                          Choice      Possible choices
Transaction ID            ===>            Any valid 4 character ID
Transaction Status        ===>            STandard, SPecial, SUppressed
Terminal ID               ===>            Any valid Terminal ID
Netname                   ===>            Any valid Netname
Terminal Status           ===>            STandard, SPecial
Terminal VTAM Exit Trace  ===>            ON, OFf
Terminal ZCP Trace        ===>            ON, OFf
VTAM Exit override        ===> NONE       All, System, None
When finished, press ENTER.
PF1=Help 3=Quit 6=Cancel Exits 9=Error List
Figure 20. CETR screen for specifying standard and special task tracing
Note:
1. You can turn standard tracing off for all tasks by setting the master
system trace flag off. You can do this with the CETR transaction, using
the screen shown in “CETR - trace control” on page 243, or you can
code SYSTR=OFF at system initialization. However, any special task
tracing will continue—it is not affected by the setting of the system
master trace flag.
2. If you run with standard tracing turned off and you specify levels of
tracing for the required components under the "Special" heading in the
“Components Trace Options” screen shown in Figure 21 on page 237, you
get trace entries only for the tasks that have been selected for special
tracing.
“Component names and abbreviations” on page 235 lists the components for which
you can select trace levels for standard and special tracing. You can reference this
list online through CETR, by pressing PF1 on the component screen (see Figure 21
on page 237).
The component codes BF, BM, BR, CP, DC, DI, EI, FC, IC, IS, KC, PC, SC, SZ,
TC, TD, TS, UE, and WB are subcomponents of the AP domain. The corresponding
trace entries are produced with a point ID of AP nnnn.
For example, trace point AP 0471 is a file control level-1 trace point and AP 0472
is a file control level-2 trace point. These trace points are produced only if the trace
setting for the FC component is “(1,2)” or “ALL”. The component code AP is used
for trace points from the AP domain that do not fall into any of the subcomponent
areas listed above.
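For example, the following system initialization parameters are a sketch of how
you could request level-1 file control tracing for standard task tracing, and level-1
and level-2 file control tracing for special task tracing:
STNTRFC=1,SPCTRFC=(1,2)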
The SJ domain, which controls JVM tracing, is a special case. As well as using the
normal trace levels, the SJ domain uses trace levels 29–32, which are reserved to
indicate the JVM trace levels 0, 1, and 2, plus a user-definable JVM trace level. You
can activate trace levels 29–32 using the normal system initialization parameters
that you would use to set trace levels for components, but to activate these trace
levels using the CETR transaction, you need to use the JVM Trace screens, rather
than the Component Trace screens. The JVM trace options are set using a
"free-form" 240–character field, and you can set these using the JVM tracing
For more information about system initialization parameters, see the CICS System
Definition Guide.
Figure 21 on page 237 shows you what the CETR Component Trace Options
screen looks like. To make changes, overtype the settings shown on the screen
and press ENTER. With the settings shown in Figure 21, trace entries are made:
v With standard task tracing in effect, from level-1 trace points of all the
components listed.
v With standard task tracing in effect, from level-2 trace points for the 3270 Bridge
component.
v With special task tracing in effect:
– From level-1 trace points only for components DI, EI, IC, and KC
– From both level-1 and level-2 trace points for components AP, CP, DD, DM,
DS, DU, FC, GC, and KE.
No special task tracing is done for components BF, BM, DC, and IS.
This CETR screen should not be used to define trace levels 29–32 for the SJ
domain, which are used to control JVM tracing. “Defining and activating tracing for
JVMs” tells you about the screens that you should use for these trace levels.
The SJ domain uses trace levels 29–32 to control the JVM's internal trace facility.
These correspond to the CICS options for JVM Level 0 trace, JVM Level 1 trace,
JVM Level 2 trace, and JVM User trace.
The default JVM trace options that are provided in CICS map to the Level 0, Level
1 and Level 2 trace point levels for JVMs. The JVM User trace option can be used
to specify deeper levels of tracing or complex trace options.
The JVM trace options are defined using a "free-form" 240–character field. You can
specify some or all of the following parameters:
v A trace level.
v A JVM component.
v A trace point type.
v A trace point group, or an individual trace point ID.
| The chapter on tracing Java applications and the JVM in the IBM Developer Kit and
| Runtime Environment, Java 2 Technology Edition Diagnostics Guide, which is
| available to download from www.ibm.com/developerworks/java/jdk/diagnosis/, lists
| the possible trace levels, components, trace point types, and trace point groups.
| These tracing parameters depend on the version of the IBM SDK for z/OS, Java 2
| Technology Edition that you are using, and they can also change during the lifetime
| of a version, so you should check the appropriate version of the Diagnostics Guide
| for the latest information.
| The trace format file supplied with the IBM SDK for z/OS, Java 2 Technology
| Edition lists each JVM trace point with its ID. For Version 1.4.2 of the SDK, the file
| is called TraceFormat.dat, and for Version 5 it is called J9TraceFormat.dat. You can
| use this file to identify an individual JVM trace point. Note that this file is subject to
| change without notice; a version number is included as the first line of the file, and
| this will be updated if the file is changed. You can find this file in the directory
| /usr/lpp/java142/J1.4/lib/, where /java142/J1.4/ is your install location for the SDK.
| (The default install location for Version 5 is java/J5.0.)
JVM trace can produce a large amount of output, so you should normally activate
JVM trace for special transactions, rather than turning it on globally for all
transactions. When you activate trace options for a transaction, CICS passes the
trace options to the JVM at the point when the transaction begins to use the JVM.
The CICS SJ domain level 2 trace point SJ 052E shows the option string that has
been passed to the JVM. The trace options apply only for the duration of the
transaction's use of the JVM.
v To set default JVM trace options for all JVMs in the CICS region, you can use
the CICS system initialization parameters JVMLEVEL0TRACE, JVMLEVEL1TRACE,
JVMLEVEL2TRACE, and JVMUSERTRACE. You can only supply these parameters at
CICS startup; you cannot define them in the DFHSIT macro. You can then use
CETR to view and change these options, if you want. These parameters do not
activate JVM tracing, they only set the default JVM trace options.
v To define or change JVM trace options while CICS is running, use either of these
methods:
1. Use the JVM Trace Options screens in the CETR transaction. You can
specify trace option strings, and specify whether each trace level applies for
standard tracing, special tracing, or both. The CICS Supplied Transactions
manual explains how to do this.
2. Use the EXEC CICS INQUIRE JVMPOOL and EXEC CICS SET JVMPOOL commands.
The INQUIRE JVMPOOL command displays the JVM trace options that are
currently set, and the SET JVMPOOL command changes them.
When you activate JVM trace, the results appear as CICS trace points in the SJ
(JVM) domain. Each JVM trace point that is generated appears as an instance of a
CICS trace point:
v SJ 4D02 is the trace point used for formatted JVM trace information.
| v SJ 4D01 is used for any JVM trace points that cannot be formatted by CICS. If
| you see this trace point often, check that the trace format file supplied with the
| IBM SDK for z/OS, Java 2 Technology Edition is present in the /lib/ subdirectory
| of your SDK installation. For Version 1.4.2 of the SDK, the file is called
| TraceFormat.dat, and for Version 5 it is called J9TraceFormat.dat. CICS requires
| this file to format the JVM trace points.
If the JVM trace facility fails, CICS issues the trace point SJ 4D00.
The IBM Developer Kit and Runtime Environment, Java 2 Technology Edition
Diagnostics Guide, which is available to download from www.ibm.com/
developerworks/java/jdk/diagnosis/, has more detailed information about JVM trace
and about problem determination for JVMs.
In addition to the interfaces provided by CICS, the JVM's internal trace facility can
be used directly. JVM system properties are a valid method of setting and activating
trace options for JVMs in a CICS environment. The Diagnostics Guide has more
information about the system properties that you can use to control the JVM's
internal trace facility.
A Level 0 trace point is very important, and this classification is reserved for
extraordinary events and errors. Note that unlike CICS exception trace, which
cannot be switched off, the JVM Level 0 trace is normally switched off unless JVM
tracing is required.
It is suggested that you keep the CICS-supplied level specifications for JVM Level 0
trace, JVM Level 1 trace, and JVM Level 2 trace. However, if you find that another
JVM trace point level is more useful for your purposes than one of the default
levels, you could change the level specification to map to your preferred JVM trace
point level. For example, you could specify LEVEL5 instead of LEVEL2 for the
JVMLEVEL2TRACE option.
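For example, you could supply the following startup override (remember that this
parameter cannot be coded in the DFHSIT macro):
JVMLEVEL2TRACE=LEVEL5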
Note that enabling a JVM trace point level enables that level and all levels above it,
so for example, if you activate JVM Level 1 trace for a particular transaction, you
receive Level 0 trace points for that transaction as well. This means that you only
need to activate the deepest level of tracing that you require, and the other levels
are activated as well.
If you want to create more complex specifications for JVM tracing which use
multiple trace point levels, or if you do not want to use trace point levels at all in
your specification, use the JVMUSERTRACE option to create your own trace option
string.
You can activate the SJ domain trace points at levels 0, 1 and 2 using the CETR
Component Trace screens. “Selecting tracing by component” on page 234 explains
how to do this.
The SJ domain includes a level 2 trace point SJ 0224, which shows you a history of
the programs that have used each JVM.
“JVM domain trace points”, in the CICS Trace Entries manual, has details of all the
standard trace points in the SJ domain.
You can select any combination of CICS internal tracing, CICS auxiliary tracing, and
CICS GTF tracing. Your decision must be based on:
1. The characteristics of the various types of CICS tracing.
2. How much trace data you need to capture.
3. Whether you want to integrate CICS tracing with tracing done by other
programs.
You can control the status and certain other attributes of the various types of CICS
tracing either dynamically, using the CETR transaction, or during system
initialization, by coding the appropriate system initialization parameters.
The internal trace table has a minimum size of 16KB, and a maximum size of
1 048 576KB. The table is extendable from 16KB in 4KB increments. You can
change the size of the table dynamically, while CICS is running, but if you do so
you lose all of the trace data that was present in the table at the time of the
change. If you want to keep the data and change the size of the table, take a
system dump before you make the change.
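For example, to initialize CICS with a 1MB internal trace table, code the following
system initialization parameter, where the value is expressed in kilobytes:
TRTABSZ=1024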
The internal trace table wraps when it is full. When the end of the table is reached,
the next entry to be directed to the internal trace table goes to the start, and
overlays the trace entry that was formerly there. In practice, the internal trace table
cannot be very big, so it is most useful for background tracing or when you do not
need to capture an extensive set of trace entries. If you need to trace CICS system
activity over a long period, or if you need many entries over a short period, one of
the other trace destinations is likely to be more appropriate.
Note that the internal trace table is always present in virtual storage, whether you
have turned internal tracing on or not. The reason is that the internal trace table is
used as a destination for trace entries when CICS detects an exception condition.
Other trace destinations that are currently selected get the exception trace entry as
well, but the entry always goes to the internal trace table even if you have turned
tracing off completely. This is so that you get “first failure data capture”.
You can use the AUXTR system initialization parameter to turn CICS auxiliary trace
on or off in the system initialization table.
You can select a status of STARTED, STOPPED, or PAUSED for CICS auxiliary
trace dynamically using the CETR transaction. These statuses reflect both the value
of the auxiliary trace flag, and the status of the current auxiliary trace data set, in
the way shown in Table 27.
Table 27. The meanings of auxiliary trace status values
Auxiliary tracing status   Auxiliary trace flag   Auxiliary trace data set
STARTED                    ON                     OPEN
STOPPED                    OFF                    CLOSED
PAUSED                     OFF                    OPEN
When you first select STARTED for CICS auxiliary trace, any trace entries are
directed to the initial auxiliary trace data set. If CICS terminated normally when
auxiliary trace was last active, this is the auxiliary trace data set that was not being
used at the time. Otherwise, it is the DFHAUXT data set. If you initialize CICS with
auxiliary trace STARTED, DFHAUXT is used as the initial auxiliary trace data set.
The auxiliary switch status determines what happens when the current data set is
full. NO means that when the initial data set is full, no more auxiliary tracing is done.
NEXT means that when the initial data set is full, then the other data set receives
the next trace entries. However, when that one is full, no more trace data is written
to auxiliary trace.
ALL means that auxiliary trace data is written alternately to each data set, a switch
being made from one to the other every time the current one becomes full. This
means that trace entries already present in the trace data sets start getting
overwritten when both data sets become full for the first time.
The advantage of using auxiliary trace is that you can collect large amounts of trace
data, if you initially define large enough trace data sets. For example, you might
want to do this to trace system activity over a long period of time, perhaps to solve
an unpredictable storage violation problem.
A time stamp is included in the header line of every page of abbreviated auxiliary
trace output to help match external events with a particular area of the trace, and
thus help you to find the trace entries that are of interest.
You can switch CICS GTF trace on or off by using the GTFTR system initialization
parameter.
You can select a status of STARTED or STOPPED for CICS GTF trace dynamically
using the CETR transaction. MVS GTF trace must be started with the TRACE=USR
option before CICS GTF trace is started, because otherwise no trace entries can be
written.
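For example, the operator might start GTF and activate user tracing with a
sequence like the following (a sketch; the procedure name, options, and reply
numbers vary by installation):
S GTF.GTF
R nn,TRACE=USR
R nn,U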
In a multisystem environment, trace entries from all supported CICS releases can
be recorded in the GTF trace data set. Entries from all releases are formatted by
the CICS-supplied routine, DFHTG650. See the CICS Operations and Utilities
Guide for details of how to format and print the GTF trace data set.
When the GTF trace data set is full, it wraps. The next trace entries are written at
the start of the data set, and the entries that were formerly there are overlaid. Thus,
you need to define a data set that is big enough to capture all the trace entries that
interest you.
The MVS GTF trace destination can be used not only by CICS, but by other
programs as well. This gives you the opportunity of integrating trace entries from
CICS with those from other programs. A single GTF trace data set could, for
example, contain trace entries made by both CICS and VTAM. You can relate the
two types of trace entry using a unique task identifier in the trace header, known to
both CICS and VTAM.
Start the transaction by typing CETR on the command line of your display, as
follows:
CETR
In the following example of settings on the CETR trace control screen:
v Internal tracing status is STOPPED, and so regular tracing is not directed
explicitly to the internal trace table. However, note that the internal trace table is
used as a buffer for the other trace destinations, so it always contains the most
recent trace entry if at least one trace destination is STARTED.
The internal trace table is also used as a destination for exception trace entries,
which are made whenever CICS detects an exception condition. If such a
condition is detected when the options shown in this example are in effect, you
would be able to find the exception trace entry in the internal trace table as well
as in the GTF trace data set.
v The internal trace table size is 16KB, which is the minimum size it can be. If
internal trace were STARTED, the trace table would wrap when it became full.
v The current auxiliary trace data set is B, meaning that trace entries are written to
DFHBUXT if auxiliary tracing is started. As its status is shown to be PAUSED, no
tracing is done to that destination. The auxiliary switch status is ALL, so a switch
would be made to the other auxiliary trace data set whenever one became full.
v The GTF trace status is shown to be STARTED, which means that CICS trace
entries are written to the GTF trace data set defined to MVS. Be aware that no
error condition is reported if the CICS GTF status is started but GTF tracing has
not been started under MVS. If this happens, the trace entries are not written.
v The master system trace flag is OFF. This means that no standard tracing is
done at all, even though standard tracing might be specified for some tasks.
However, special task tracing is not affected. The master system trace flag only
determines whether standard task tracing is to be done.
Any of the input fields can be overtyped with the new values that you require. When
you press ENTER, CETR issues the necessary commands to set the new values. If
any value you supply is not valid, CETR displays an error message.
The following logic is used to ensure that trace entries are written to the required
destinations:
1. The trace entry is built in the internal trace table.
2. If auxiliary tracing status is STARTED, the trace data is copied to the current
auxiliary trace data set.
3. If GTF tracing status is STARTED and GTF tracing is started under MVS with
the TRACE=USR option, the trace data is copied to the GTF trace data set.
The following table shows the relationships between the auxiliary trace status, trace
flag, and trace data set.
Table 28. The meanings of auxiliary trace status values
Auxiliary tracing status   Auxiliary trace flag   Auxiliary trace data set
Started                    On                     Open
Paused                     Off                    Open
Stopped                    Off                    Closed
| Use caution when setting TRTABSZ to a very high value, because there must be
| enough MVS page storage to satisfy both the request and the DSA sizes. Use the
| MVS system command DISPLAY ASM to display current information about the status
| and utilization of all MVS page data sets.
For information about the use of the various CETR options as an aid to problem
determination, see the preceding sections of this chapter.
You can specify abbreviated, short, or extended trace formatting, to give you
varying levels of information and detail in your output. Typically, abbreviated-format
trace gives you one line of trace per entry; short-format provides two lines of trace
per entry; extended-format provides many lines of trace per entry. The structures of
the different types of trace entry are described in the sections that follow.
Most of the time, the abbreviated trace table is the most useful form of trace
formatting, as you can quickly scan many trace entries to locate areas of interest.
However, in error situations, you might require more information than the
abbreviated trace can provide. The short trace provides the information that is
presented in the abbreviated trace, and, additionally, presents certain items that are
presented in the full trace. These are:
v Interpreted parameter list
v Return address
v Time that the trace entry was written
v Time interval between trace entries
These items of information are often very useful in the diagnosis of problems. By
selecting the short format, you can gain access to this information without having to
bear the processing overhead of formatting a full trace, and without having to deal
with the mass of information in a full trace.
There may be occasions, however, when you need to look at extended format trace
entries, to understand more fully the information given in the corresponding
abbreviated and short entries, and to be aware of the additional data supplied with
many extended trace entries.
Auxiliary trace can be formatted using the CICS trace utility program, DFHTU650.
You can control the formatting, and you can select trace entries on the basis of
task, terminal, transaction, time frame, trace point ID (single or range), dispatcher
task reference, and task-owning domain. This complements the usefulness of
auxiliary trace for capturing large amounts of trace data.
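For example, the following job is a sketch of how you might print an
abbreviated-format listing of the DFHAUXT data set (the data set names are
illustrative, and the full control statement syntax is in the CICS Operations and
Utilities Guide):
//TRCLIST  JOB NOTIFY=WILLIN,MSGCLASS=A
//PRTAUX   EXEC PGM=DFHTU650
//STEPLIB  DD DSN=CICSTS32.CICS.SDFHLOAD,DISP=SHR
//DFHAUXT  DD DSN=WILLIN.IYLX4.DFHAUXT,DISP=SHR
//DFHAXPRT DD SYSOUT=A
//DFHAXPRM DD *
ABBREV
/*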
GTF trace can be formatted with the same sort of selectivity as auxiliary trace,
using a CICS-supplied routine with the MVS interactive problem control system
(IPCS).
There are two slightly different extended trace entry formats. One (“old-style”)
resembles the format used in earlier releases of CICS, and gives FIELD A and
FIELD B values. The other (“new-style”) uses a different format, described below.
Both types of formatted trace entry always show the information that you need. To
analyze a trace entry:
1. Look at the trace point ID. This is an identifier that indicates where the trace
point is in CICS code. In the case of application (AP) domain, the request type
field included in the entry is also needed to uniquely identify the trace point. For
all other domains, each trace point has a unique trace point ID.
Its format is always a two-character domain index, showing which domain the
trace point is in, then a space, then a four-digit (two-byte) hexadecimal number
identifying the trace point within the domain. The following are examples of
trace point IDs:
AP 00E1 trace point X’00E1’ in Application Domain
DS 0005 trace point X’0005’ in Dispatcher Domain
TI 0101 trace point X’0101’ in Timer Domain
2. Look at the interpretation string. It shows:
v The module where the trace point is located
v The function being performed
v Any parameters passed on a call, and any response from a called routine.
3. Look at the standard information string. It shows:
v The task number, which is used to identify a task uniquely for as long as it is
in the system. It provides a simple way of locating trace entries associated
with specific tasks, as follows:
– A five-digit decimal number shows that this is a trace entry for a task with
a TCA, the value being taken from field TCAKCTTA of the TCA.
– A three-character non-numeric value in this field shows that the trace entry
is for a system task. You could, for example, see “III” (initialization), or
“TCP” (terminal control).
– A two-character domain index in this field shows that the trace entry is for
a task without a TCA. The index identifies the domain that attached the
task.
v The kernel task number (KE_NUM), which is the number used by the kernel
domain to identify the task. The same KE_NUM value for the task is shown in
the kernel task summary in the formatted system dump.
v The time when the trace entry was made. (Note that the GTF trace time is
GMT time.)
v The interval that elapsed between this and the previous trace entry, in
seconds.
The standard information string gives two other pieces of useful information:
| v The CICS TCB ID and the address of the MVS TCB (field TCB) that is in
| use for this task. This field can help you in comparing a CICS trace with the
| corresponding MVS trace. As there can be multiple OTE TCBs, the TCB ID
| identifies the particular TCB in use.
Abbreviated trace entries show the CICS TCB ID of the TCB instead of an MVS
TCB address.
v If you are using old-style trace entries, use the following example to help you
interpret the trace.
In this example:
00021 QR AP 00E1 EIP ENTRY INQUIRE-TRACEFLAG 0004,00223810 ....,00007812 .... =000005=
Figure 25. Example of the abbreviated format for an old-style trace entry
Note: For some trace entries, an 8-character resource field appears to the right
of FIELD B. Also, some trace entries include a RESOURCE field.
For ZCP trace entries, FIELD B (which contains the TCTTE address) is printed
twice on each line. This allows both sides of the output to be scanned for the
terminal entries on an 80-column screen without having to scroll left and right.
v If you are using new-style trace entries, use the following example to help you
interpret the trace.
In this example:
00021 QR LD 0002 LDLD EXIT ACQUIRE_PROGRAM/OK 03B8A370 , 00000001,848659C0,048659A0,410,200,REUSABLE =000023=
Figure 26. Example of the abbreviated format for a new-style trace entry
The trace point ID, LD 0002, shows that the trace call was made from the
loader domain.
The interpretation string provides this information:
– LDLD tells you the trace call was made from within module DFHLDLD.
– EXIT FUNCTION(ACQUIRE_PROGRAM) tells you the call was made on
exit from the ACQUIRE_PROGRAM function.
The standard information string gives you this information:
– The task currently running has a task number of 00021.
– The kernel task number for the task is 0007.
– The time when the trace entry was made was 10:45:49.6888118129.
(Note that the GTF trace time is GMT time.)
Extended format user trace entries show a user-defined resource field, and a
user-supplied data field that can be up to 4000 bytes in length. A typical
extended-format entry is shown in Figure 28.
AP 000B USER EVENT - APPLICATION-PROGRAM-ENTRY - SEND - CICS USER TRACE ENTRY HELP INFORMATION
TASK-00163 KE_NUM-0007 TCB-QR /009F3338 RET-8003F54C TIME-16:32:01.1295568750 INTERVAL-00.0001965625 =000731=
1-0000 E4E2C5D9 404040 *USER *
2-0000 C3C9C3E2 40E4E2C5 D940E3D9 C1C3C540 C5D5E3D9 E8404040 40404040 40404040 *CICS USER TRACE ENTRY *
0020 C8C5D3D7 40C9D5C6 D6D9D4C1 E3C9D6D5 40404040 40404040 40404040 40404040 *HELP INFORMATION *
0040 40404040 40404040 40404040 40404040 40404040 40404040 40404040 40404040 * *
0060 40404040 40404040 40404040 40404040 40404040 40404040 40404040 40404040 * *
3-0000 E2C5D5C4 40404040 *SEND *
Figure 28. Example of the extended format for a user trace entry
The interpretation string for the entry contains the string “APPLICATION-
PROGRAM-ENTRY”, to identify this as a user trace entry, and the resource field.
00163 QR AP 000B USER EVENT APPLICATION-PROGRAM-ENTRY SEND CICS USER TRACE ENTRY HELP INFORMATION =000731=
Figure 29. Example of the abbreviated format for a user trace entry
00031 QR AP 000B USER EVENT APPLICATION-PROGRAM-E SEND - CICS USE RET-800820A2 11:42:27.1176805000 00.0000247500 =000815=
Figure 30. Example of the short format for a user trace entry
The type of dump to use for problem determination depends on the nature of the
problem. In practice, the system dump is often more useful, because it contains
more information than the transaction dump. You can be reasonably confident that
the system dump has captured all the evidence you need to solve your problem,
but it is possible that the transaction dump might have missed some important
information.
The amount of CICS system dump data that you could get is potentially very large,
but that need not be a problem. You can leave the data on the system dump data
set, or keep a copy of it, and format it selectively as you require.
You can control the dump actions taken by CICS, and also what information the
dump output contains. There are two aspects to controlling dump action:
1. Setting up the dumping environment, so that the appropriate dump action is
taken when circumstances arise that might cause a dump to be taken.
2. Causing a dump to be taken. Both users and CICS can issue requests for
dumps to be taken.
For information about using dumps to solve FEPI problems, see the CICS Front
End Programming Interface User's Guide.
Each CICS system dump header includes a symptom string. The symptom string
will be created only if the system dump code has the DAE option specified in the
dump table entry. The default action is that symptom strings are not produced. This
can, however, be altered by means of the DAE system initialization parameter.
On most occasions when dumps are requested, CICS references a dump code that
is specified either implicitly or explicitly to determine what action should be taken.
Dump codes are held in two dump tables, the transaction dump table and the
system dump table.
You might use these methods of taking dumps if, for example, you had a task in a
wait state, or you suspected that a task was looping. However, these methods are
not useful for getting information following a transaction abend or a CICS system
abend. This is because the evidence you need is almost certain to disappear before
your request for the dump has been processed.
CICS does not take a transaction dump if a HANDLE ABEND is active at the
current logical level. This is called an implicit HANDLE ABEND, and it causes the
suppression of transaction dumps. (PL/I library routines can issue HANDLE
ABEND to make PL/I ON-units work.) The NODUMP option on an EXEC CICS
ABEND command, an internal call, or the transaction definition also prevents the
taking of a transaction dump.
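For example, the following command abends the issuing task with abend code
ABCD and suppresses the transaction dump:
EXEC CICS ABEND ABCODE('ABCD') NODUMP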
You need MVS/ESA 5.1, the MVS workload manager, and the XCF facility in order
to collect dump data in this way. The MVS images in the sysplex must be
connected via XCF. The CICS regions must be using MRO supported by the CICS
TS 3.2 interregion communication program, DFHIRP.
The CICS regions must be connected via XCF/MRO. Connections using VTAM ISC
are not eligible to use the related dump facility.
The function is controlled by the DUMPSCOPE option on each CICS dump table
entry. You can set this option to have either of the following values:
v RELATED - take dumps for all related CICS regions across the sysplex.
v LOCAL - take dumps for the requesting CICS region only. This is the default.
The DUMPSCOPE option is available on the following master terminal and system
programming commands:
v EXEC CICS INQUIRE SYSDUMPCODE
v EXEC CICS SET SYSDUMPCODE
v EXEC CICS INQUIRE TRANDUMPCODE
v EXEC CICS SET TRANDUMPCODE
v CEMT INQUIRE SYDUMPCODE
v CEMT SET SYDUMPCODE
v CEMT INQUIRE TRDUMPCODE
v CEMT SET TRDUMPCODE
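For example, the following command is a sketch of how you could request related
dumps for system dump code MT0001 (the ADD option adds the code to the
system dump table):
CEMT SET SYDUMPCODE(MT0001) ADD RELATED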
If the DUMPSCOPE option is set to RELATED in the CICS region issuing the dump
request, a request for a system dump is sent to all MVS images in the sysplex that
run related CICS regions.
The local MVS image running the CICS region that initiated the dump request has
two dumps - one of the originating CICS region, the other containing the originating
CICS region and up to fourteen additional related CICS regions from the local MVS
image.
There is a maximum of fifteen address spaces in an SDUMP. If there are more than
fifteen related CICS regions on an MVS image, then not all of them will be dumped.
Related CICS regions may also fail to be dumped if they are swapped out when the
dump request is issued. You should consider whether to make certain CICS regions
non-swappable as a result.
Without this facility, such simultaneous dump data capture across multiple CICS
regions in the sysplex is impossible.
The dump request sent to the remote MVS images specifies the following:
v REMOTE controls the issuing of dumps on remote systems.
v SYSLIST=* means the request is to be routed to all remote systems.
v PROBDESC is problem description information, as follows:
– SYSDCOND - an MVS keyword. This specifies that a dump is to be taken on
remote MVS images if the IEASDUMP.QUERY exit responds with return code
0. CICS supplies DFHDUMPX as the IEASDUMP.QUERY exit.
– SYSDLOCL - an MVS keyword. This drives the IEASDUMP.QUERY exit on
the local and remote MVS images. This allows the CICS regions on the local
MVS region to be dumped.
– DFHJOBN - a CICS keyword. The operator should include the generic job
name. This is used by DFHDUMPX to determine which address spaces to
dump.
See the MVS System Commands manual, GC28-1626, for a full description of all
command options.
If you adopt a suitable naming convention for your CICS regions, this can be used
to define suitable generic jobnames to determine which CICS regions to dump. See
the System/390 MVS Sysplex Application Migration manual for recommendations on
naming conventions. If you follow the recommendation in this manual, the generic
job name for all CICS regions in the sysplex would be ‘CICS*’.
Note: You can use the CONT option to split this command into parts, as follows:
/R nn,JOBNAME=(CICS-job-name,SMSVSAM,CATALOG,GRS), CONT
/R nn,DSPNAME=’SMSVSAM’.*,REMOTE=(SYSLIST=*(’SMSVSAM’, CONT
/R nn,’CATALOG’,’GRS’),DSPNAME,SDATA, END
DFHIRP must be at CICS TS 3.2 level. Connections using VTAM ISC are not
eligible to use the related dump facility.
If you are unable to produce related system dumps when there are related CICS
regions across MVS images, ensure that the regions are MRO connected.
During IRC start processing, CICS attempts to join XCF group DFHIR000. If this
fails, return code yyy is given in the DFHIR3777 message.
The following MVS console commands may be used to monitor activity in the
sysplex:
v D XCF - to identify the MVS sysplex and list the name of each MVS image in the
sysplex.
For example, the response might show that MVS image DEV5 has joined sysplex
DEVPLEX5.
v D XCF,GROUP - to list active XCF groups by name and size. Check whether the
CICS group, DFHIR000, appears in the response.
v D XCF,COUPLE - to list details about the XCF couple data set and its definitions.
Suppose the response shows that the data set has a MAXGROUP of 10 and a
peak of 10, and the response to D XCF,GROUP shows that there are currently 10
active groups. If CICS now attempts to join XCF group DFHIR000, it is rejected
and the IRC start fails with message DFHIR3777.
Such a response can also indicate that the primary and alternate data sets are
on the same volume, thereby giving rise to a single point of failure.
Use the command CEMT I CONNECTION to display the status of the connections.
‘XCF’ is displayed for every acquired connection using MRO/XCF for
communications.
I CONNECTION
STATUS: RESULTS - OVERTYPE TO MODIFY
Con(FORD) Net(IYAHZCES) Ins Acq Xcf
Con(F100) Net(IYAHZCEC) Ins Acq Irc
Con(F150) Net(IYAHZCED) Ins Acq Irc
Con(GEO ) Net(IYAHZCEG) Ins Acq Xcf
Con(GMC ) Net(IYAHZCEB) Ins Acq Xcf
Con(JIM ) Net(IYAHZCEJ) Ins Acq Xcf
Con(MARY) Net(IYAHZCEM) Ins Acq Xcf
Con(MIKE) Net(IYAHZCEI) Ins Acq Xcf
+ Con(RAMB) Net(IYAHZCEE) Ins Acq Xcf
SYSID=CHEV APPLID=IYAHZCET
RESPONSE: NORMAL TIME: 01.28.59 DATE: 06.11.94
PF 1 HELP 3 END 7 SBH 8 SFH 9 MSG 10 SB 11 SF
To check whether DFHDUMPX has been established as the IEASDUMP.QUERY exit, use the D PROG,EXIT commands. Before the exit is established, the responses look like this:
D PROG,EXIT,MODNAME=DFHDUMPX
08.16.04 DEV5 CSV463I MODULE DFHDUMPX IS NOT ASSOCIATED WITH ANY EXIT
D PROG,EXIT,EN=IEASDUMP.QUERY
08.17.44 DEV5 CSV463I NO MODULES ARE ASSOCIATED WITH EXIT IEASDUMP.QUERY
This example indicates that the exit has not been established.
After CICS has established the exit, the responses look like this:
D PROG,EXIT,MODNAME=DFHDUMPX
01.19.16 DEV5 CSV461I 01.19.16 PROG,EXIT DISPLAY 993
EXIT MODULE STATE MODULE STATE MODULE STATE
IEASDUMP.QUERY DFHDUMPX A
D PROG,EXIT,EN=IEASDUMP.QUERY
01.19.46 DEV5 CSV462I 01.19.46 PROG,EXIT DISPLAY 996
MODULE DFHDUMPX
EXIT(S) IEASDUMP.QUERY
You can issue MVS dump commands from the console to verify that remote
dumping is available within the MVS image, without an active CICS region.
In the first test described here, the messages from SDUMP indicate that one dump
of the master address space has been taken.
Another test is to issue the dump command specifying the CICS XCF group;
again, the messages from SDUMP indicate that one dump of the master address
space has been taken.
To verify that the remote dumping function works on the local system, issue the
dump command with the REMOTE and PROBDESC options described earlier.
The messages from SDUMP indicate that two dumps were taken: one for the master
address space, and a second that contains ASIDs 0101, 0012, 0001, 0005, 000B,
000A, 0008, and 0007. Note that the same incident token is used for both dumps.
The following example lists the MVS console messages received when the CICS
master terminal command CEMT P DUMP is issued from CICS APPLID IYAHZCET,
executing in ASID 19 on MVS image DEV6. IYAHZCET has at least one related
task in the CICS region executing in ASID 1B on MVS DEV6, and in ASIDs 001A,
001C, 001B, 001E, 001F, 0020, 001D, 0022, 0024, 0021, 0023, 0028, 0025, and 0029:
- 22.19.16 DEV6 JOB00029 +DFHDU0201 IYAHZCET ABOUT TO TAKE SDUMP. DUMPCODE: MT0001
- 22.19.23 DEV7 DFHDU0214 DFHDUMPX IS ABOUT TO REQUEST A REMOTE SDUMPX.
- 22.19.23 DEV6 DFHDU0214 DFHDUMPX IS ABOUT TO REQUEST A REMOTE SDUMPX.
22.19.27 DEV6 JOB00029 IEA794I SVC DUMP HAS CAPTURED:
DUMPID=001 REQUESTED BY JOB (IYAHZCET)
DUMP TITLE=CICS DUMP: SYSTEM=IYAHZCET CODE=MT0001 ID=1/0001
- 22.19.43 DEV6 JOB00029 +DFHDU0202 IYAHZCET SDUMPX COMPLETE. SDUMPX RETURN CODE X'00'
The dump in SYS1.DUMP03 on DEV6 was taken as a result of the CEMT request
on IYAHZCET.
The dump in SYS1.DUMP04 on DEV6 was taken as a remote dump by MVS dump
services as a result of the CEMT request on IYAHZCET. Note that the incident
token and ID are the same.
The dump in SYS1.DUMP05 on DEV7 was taken as a remote dump by MVS dump
services as a result of the CEMT request on IYAHZCET. Note that the incident
token and ID are the same as those for the dumps produced on DEV6, indicating
the originating MVS and CICS IDs.
The following example lists the MVS console messages received when transaction
abend SCOP is initiated, after the abend code has first been added to the transaction
dump table in CICS IYAHZCES as requiring related dumps
(CEMT SET TRDUMPCODE(SCOP) ADD RELATED).
23.40.41 DEV7 JOB00088 +DFHDU0201 IYAHZCES ABOUT TO TAKE SDUMP. DUMPCODE: SCOP
23.40.49 DEV7 DFHDU0214 DFHDUMPX IS ABOUT TO REQUEST A REMOTE SDUMPX.
23.40.55 DEV7 JOB00088 IEA794I SVC DUMP HAS CAPTURED:
23.41.11 DEV7 JOB00088 +DFHDU0202 IYAHZCES SDUMPX COMPLETE. SDUMPX RETURN CODE X'00'
The dump in SYS1.DUMP03 on DEV6 was taken upon receipt of the remote dump
request issued from IYAHZCES. Note the incident token and ID are the same as
those for dumps produced on DEV7.
The dump in SYS1.DUMP04 on DEV7 was taken as a remote dump by MVS dump
services as a result of the request from IYAHZCES. Note the incident token and ID
are the same as those for the dumps produced on DEV6, indicating the originating
MVS and CICS IDs. A second dump of ASID 1A is taken because the CICS
IEASDUMP.QUERY exit has no information indicating that a dump has already been
taken for that address space.
You can add a system dump code, constructed from a CICS message number, to
the dump table, so that CICS takes a system dump when the message is issued. To
determine which messages you can do this for, look in CICS Messages and
Codes. If the message you are interested in has a 2-character alphabetic
component ID after the "DFH" prefix, and it has either XMEOUT global user exit
parameters or a destination of "Terminal User", you can use it to construct a system
dump code to add to the dump table.
You cannot enable dumping for messages that do not have these characteristics.
For example, some messages that are issued early during initialization cannot be
used to cause CICS to take a system dump, because the mechanisms that control
dumping might not be initialized at that time. Also, you cannot enable dumping for
the message domain's own messages (they are prefixed by "DFHME") where they
do not normally cause CICS to take a system dump.
To enable dumping for an eligible message:
1. Add the dump code (constructed by removing the "DFH" prefix from the
message number) to the system dump table.
2. Specify the SYSDUMP option.
CICS then takes a system dump when the message is issued.
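For example, assuming a hypothetical eligible message DFHXX1234, you would add dump code XX1234 to the system dump table:
CEMT SET SYDUMPCODE(XX1234) ADD SYSDUMP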
If the code had been running in user key at the time of the program check or MVS
abend, CICS issues message DFHSR0001 and takes a system dump with dump
code SR0001. Only application programs defined with EXECKEY(USER) run in
user key.
If the code had not been running in user key at the time of the program check or
MVS abend, CICS issues message DFHAP0001 and takes a system dump with
dump code AP0001.
So, if CICS storage protection is active, this mechanism enables you to suppress
the system dumps caused by errors in application programs, while still allowing
dumps caused by errors in CICS code to be taken. To achieve this, use either a
CEMT SET SYDUMPCODE or an EXEC CICS SET SYSDUMPCODE command to
suppress system dumps for system dump code SR0001:
CEMT SET SYDUMPCODE(SR0001) ADD NOSYSDUMP
If storage protection is not active, you can suppress the dumps by suppressing
dump code AP0001. Note, however, that this suppresses dumps for errors in both
application and CICS code. You can use the XDUREQ global user exit to
distinguish between AP0001 situations in application and nonapplication code.
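Following the same pattern as the SR0001 example above, the suppression command would be:
CEMT SET SYDUMPCODE(AP0001) ADD NOSYSDUMP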
By contrast, a command such as CEMT SET TRDUMPCODE(ASRB) ADD SYSDUMP
adds an entry to the dump table and ensures that SDUMPs are taken for ASRB
abends. However, note that the SDUMP in this instance is taken at a later point
than the SDUMP normally taken for system dump code AP0001 or SR0001.
The options you can specify differ slightly, depending on whether you are defining
the action for a transaction dump code or for a system dump code.
v For a transaction dump code, you can specify:
– Whether a transaction dump is to be taken.
– Whether a system dump is to be taken, with or without a transaction dump.
– Whether a system dump is to be taken on every CICS region in the sysplex
related to the CICS region on which the transaction dump is taken. A related
CICS region is one on which the unit of work identifiers, in the form of APPC
tokens, of one or more tasks match those in the CICS region that takes the
transaction dump.
– Whether CICS is to be terminated.
– The maximum number of times the transaction dump code action can be
taken during the current run of CICS, or before the count is reset.
v For a system dump code, you can specify:
– Whether a system dump is to be taken.
– Whether a system dump is to be taken on every CICS region in the sysplex
related to the CICS region on which the system dump is taken. A related
CICS region is one on which the unit of work identifiers, in the form of APPC
tokens, of one or more tasks match those in the CICS region that takes the
system dump.
– Whether CICS is to be terminated.
– The maximum number of times the system dump code action can be taken
during the current run of CICS, or before the count is reset.
– Whether the system dump is eligible for suppression by DAE.
Note:
1. Only a transaction dump code can cause both a transaction dump and a
system dump to be taken.
2. If a severe error is detected, the system can terminate CICS even if you
specify that CICS is not to be terminated.
The only circumstances in which dump table additions and changes are lost are:
v When CICS is cold started.
v When the CICS global catalog is redefined, although this is likely to be done only
in exceptional circumstances.
v When CICS creates a temporary dump table entry for you, because you have
asked for a dump for which there is no dump code in the dump table.
The default value used for the DAEOPTION attribute (for all new system dump
codes) is set by means of the DAE= system initialization parameter. The default
value for the maximum number of times that the dump action can be taken is set by
the TRDUMAX system initialization parameter (for new or added transaction dump
codes) and the SYDUMAX system initialization parameter (for new or added system
dump codes).
You can modify the default values for a transaction dump table entry using the
following commands:
v CEMT SET TRDUMPCODE
v EXEC CICS SET TRANDUMPCODE
v EXEC CICS SET TRANSACTION DUMPING (to modify the TRANDUMPING attribute
only).
The following table shows the default values for transaction dump table entries and
the attributes you can specify to modify them:
Table 29. Default values for transaction dump table entries

Action                                        Default   Attribute     Permitted value
Take a transaction dump?                      YES       TRANDUMPING   TRANDUMP or NOTRANDUMP
Take a system dump?                           NO        SYSDUMPING    SYSDUMP or NOSYSDUMP
Take system dumps on related systems?         NO        DUMPSCOPE     LOCAL or RELATED
Shut down CICS?                               NO        SHUTOPTION    SHUTDOWN or NOSHUTDOWN
Maximum times dump code action can be taken   999       MAXIMUM       0 through 999
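For example, to override several of these defaults when adding an entry for transaction dump code MYAB, you might use a command like the following. This is a sketch of the syntax, using illustrative values that match Example 1 later in this section:
CEMT SET TRDUMPCODE(MYAB) ADD TRANDUMP SYSDUMP LOCAL NOSHUTDOWN MAXIMUM(50)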
You can modify the default values for a system dump table entry using the following
commands:
v CEMT SET SYDUMPCODE
v EXEC CICS SET SYSDUMPCODE
The following table shows the default values for system dump table entries and the
attributes you can specify to modify them:
Table 30. Default values for system dump table entries

Action                                        Default   Attribute     Permitted value
Take a system dump?                           YES       SYSDUMPING    SYSDUMP or NOSYSDUMP
Take system dumps on related systems?         NO        DUMPSCOPE     LOCAL or RELATED
Shut down CICS?                               NO        SHUTOPTION    SHUTDOWN or NOSHUTDOWN
Is dump eligible for DAE?                     NO        DAEOPTION     DAE or NODAE
Maximum times dump code action can be taken   999       MAXIMUM       0 through 999
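To add an entry for a user dump code that takes related system dumps, makes them eligible for DAE, and shuts down CICS, a command like the following could be used. This is a sketch of the syntax, using illustrative values that match Example 1 of the system dump table entries later in this section:
CEMT SET SYDUMPCODE(SYDMP001) ADD SYSDUMP RELATED DAE SHUTDOWN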
For example, if you issue a command requesting a dump, using the previously
undefined dump code SYDMPX01:
EXEC CICS PERFORM DUMP DUMPCODE('SYDMPX01')
CICS makes a temporary dump table entry for dump code SYDMPX01. You can
browse the entry and see that it has the default attributes for a system dump code,
and that the current count has been set to 1, because a dump has been taken.
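You could browse the temporary entry with the master terminal command:
CEMT INQUIRE SYDUMPCODE(SYDMPX01)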
Attempting to add the dump code to the dump table after CICS has made the entry
causes the exception response 'DUPREC' to be returned. If you want to make a
change to the CICS-generated default table entry, and have that entry preserved
across CICS runs, you must delete it and then add a new entry with the options you
require.
v Example 1 shows a transaction dump table entry for transaction dump code
MYAB. This is a user-supplied dump code, specified either on an
EXEC CICS DUMP TRANSACTION command, or as a transaction abend code on an
EXEC CICS ABEND command.
The table entry shows that when this dump code is invoked, both a transaction
dump and a system dump are to be taken, and CICS is not to be terminated.
System dumps on related systems are not to be taken. The dump code action
can be taken a maximum of 50 times, but the action for this dump code has not
been taken since CICS was started or since the current count ("times dump
action taken") was reset.
v Example 2 shows a transaction dump table entry for transaction dump code
ASRA. This is a CICS transaction abend code, and this dump table entry is
referred to every time a transaction abends ASRA. The entry shows that a
system dump only is to be taken for an ASRA abend, and that CICS is not to be
terminated. System dumps on related systems are to be taken. It also shows that
the action for this abend code has already been taken the maximum number of
times, so no action is taken when another ASRA abend occurs. However, the
current count could be reset to 0 dynamically using either a CEMT transaction or
a system programming command (SET TRANDUMPCODE or SET
SYSDUMPCODE). More system dumps would then be taken for subsequent
ASRA abends.
v Example 3 shows a transaction dump table entry for transaction dump code
AKC3. This is a CICS transaction abend, and this dump table entry is referenced
every time a transaction abends AKC3 - that is, whenever the master terminal
operator purges a task.
The entry shows that no action at all is to be taken in the event of such an
abend. System dumps on related systems are not to be taken. The maximum
number of times the dump code action can be taken is given as 999, meaning
that there is no limit to the number of times the specified action is taken. The
dump code action has been taken 37 times, but each time both the transaction
dump and the system dump were suppressed.
Table 32 shows how the transaction dump table entry for transaction dump code
MYAB would be updated with and without global suppression of system dumping.
Only the updated fields are shown.
Table 32. Effect of global suppression of system dumping on transaction dump table update

Type of information         Before update   System dumping   System dumping
                                            enabled          suppressed
Transaction dump code       MYAB
Take a transaction dump?    YES
Take a system dump?         YES
The statistics show that a system dump was taken when system dumping was
enabled, but not when system dumping was suppressed.
There is a further effect. CICS maintains a record of the current dump ID, which is
the number of the most recent dump to be taken. This is printed at the start of the
dump, together with the appropriate dump code. It is concatenated with the CICS
run number, which indicates the number of times that CICS has been brought up
since the global catalog was created, to provide a unique ID for the dump.
Note: This does not apply to SDUMPs taken by the kernel; these always have a
dump ID of 0/0000.
For example, for the ninth dump to be taken during the eleventh run of CICS, if the
dump code were TD01, this is what you would see:
CODE=TD01 ID=11/0009
If system dumping is enabled for the dump code, the same dump ID is given to
both the transaction dump and the system dump.
The sort of information kept in the system dump table is similar to that kept in the
transaction dump table (see Table 31 on page 271).
v Example 1 shows a system dump table entry for system dump code SYDMP001,
a user-supplied system dump code, specified using EXEC CICS PERFORM
DUMP. System dumps on related systems are to be taken. Duplicates of this
dump are to be suppressed by DAE. The table entry shows that no dumps
have yet been taken. However, if one were taken, CICS would be shut down. If
global suppression of system dumping was in effect, no dump would be taken
but CICS would be shut down if this dump code were referenced.
v Example 2 shows the system dump table entry for system dump code MT0001,
the CICS-supplied dump code for system dumps requested from the master
terminal, with CEMT PERFORM DUMP or CEMT PERFORM SNAP. CICS is not
shut down when a dump is taken for this dump code. Also, the value of 999 for
"maximum times action can be taken" shows that an unlimited number of dumps
can be taken for this dump code. The current count ("times action already taken")
shows that, to date, 79 dumps have been requested using CEMT.
To obtain a dump of a coupling facility list structure, you must specify a
value for the DUMPSPACE parameter in the CFRM policy for the coupling facility. The
recommended value is 5% of the space in the coupling facility. For more
information, see OS/390 MVS Setting Up a Sysplex, GC28-1779.
1. Enter the following command at the console:
DUMP COMM=(cfdt_poolname)
In response to the DUMP command, the system prompts you with a reply
number for the dump options you want to specify.
2. Enter the reply:
REPLY nn,STRLIST=(STRNAME=DFHCFLS_poolname,ACCESSTIME=NOLIMIT,
(LISTNUM=ALL,ADJUNCT=DIRECTIO,ENTRYDATA=UNSERIALIZE)),END
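For example, for a hypothetical CFDT pool named PRODCFT1, the console dialog would look like this (nn is the reply number with which the system prompts you):
DUMP COMM=(PRODCFT1)
R nn,STRLIST=(STRNAME=DFHCFLS_PRODCFT1,ACCESSTIME=NOLIMIT,
(LISTNUM=ALL,ADJUNCT=DIRECTIO,ENTRYDATA=UNSERIALIZE)),END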
For more information about the MVS DUMP command, see OS/390 MVS System
Commands, GC28-1781.
To obtain a dump of a coupling facility list structure, you must specify a
value for the DUMPSPACE parameter in the CFRM policy for the coupling facility. The
recommended value is 5% of the space in the coupling facility. For more
information, see OS/390 MVS Setting Up a Sysplex, GC28-1779.
1. Enter the following command at the console:
DUMP COMM=(named_counter_poolname)
2. In response to the DUMP command, the system prompts you with a reply
number for the dump options you want to specify. Enter the reply:
REPLY nn,STRLIST=(STRNAME=DFHNCLS_poolname,ACCESSTIME=NOLIMIT,
(LISTNUM=ALL,ADJUNCT=DIRECTIO)),END
Using abbreviations for the keywords, this reply can be entered as:
R nn,STL=(STRNAME=DFHNCLS_poolname,ACC=NOLIM,(LNUM=ALL,ADJ=DIO)),END
For more information about the MVS DUMP command, see OS/390 MVS System
Commands, GC28-1781.
To obtain a dump of a shared temporary storage pool list structure, use the same
procedure, specifying structure name DFHXQLS_poolname on the reply. Using
abbreviations for the keywords, the reply can be entered as:
R nn,STL=(STRNAME=DFHXQLS_poolname,ACC=NOLIM,
(LNUM=ALL,ADJ=DIO,EDATA=UNSER)),END
For more information about the MVS DUMP command, see OS/390 MVS System
Commands, GC28-1781.
When CSFE ZCQTRACE is enabled, a dump of the builder parameter set and the
appropriate TCTTE is written to the transaction dump data set at specific points in
the processing. Table 34 shows the circumstances in which dumps are invoked, the
modules that invoke them, and the corresponding dump codes.
Table 34. ZCQTRACE dump codes

Module     Dump code   When invoked
DFHTRZCP   AZCQ        Installing terminal when termid = terminal ID
DFHTRZZP   AZQZ        Merging terminal with TYPETERM when termid = terminal ID
DFHTRZXP   AZQX        Installing connection when termid = connection ID
DFHTRZIP   AZQI        Installing sessions when termid = connection ID
DFHTRZPP   AZQP        When termid = pool terminal ID
DFHZCQIQ   AZQQ        Inquiring on a resource when termid = resource ID (resource = terminal or connection)
DFHZCQIS   AZQS        Installing a resource when termid = resource ID (resource = terminal or connection), or when ZCQTRACE,AUTOINSTALL is specified
Unformatted dumps are not easy to interpret, and we recommend that you do not
use them for debugging. CICS provides utilities for formatting transaction dumps and
CICS system dumps, and you should always use them before you attempt to read
any dump. You can quickly locate areas of storage that interest you in a formatted
dump, either by browsing it online, or by printing it and looking at the hard copy.
The formatting options that are available for transaction dumps and system dumps
are described in “Formatting transaction dumps” and “Formatting system dumps,”
respectively.
You can also use the SCAN option with the dump utility program, to get a list of the
transaction dumps recorded on the specified dump data set.
For information about using DFHDU650 to format transaction dumps, see the CICS
Operations and Utilities Guide.
In prior releases, the CICS formatting routine for use under the MVS interactive
problem control system (IPCS) was supplied as DFHPDX. This standard name is not
suitable for users running more than one release of CICS, because the dump
formatting process is release-specific, and you must use the correct version for the
system dump you are formatting. The module is therefore named with the release
identifier as part of the name: DFHPD650 is the formatting routine you must define
to IPCS when formatting CICS TS 3.2 system dumps.
The IPCS default table, BLSCECT, normally in SYS1.PARMLIB, has the following
entry for CICS (shown here in its usual form):
IMBED MEMBER(DFHIPCSP) ENVIRONMENT(ALL) /* CICS */
Ensure that your IPCS job can find the CICS-supplied DFHIPCSP member. You can
either copy DFHIPCSP into SYS1.PARMLIB (so that it is in the same default library
as BLSCECT), or provide an IPCSPARM DD statement to specify the library that
contains the IPCS control tables. For example:
//IPCSPARM DD DSN=SYS1.PARMLIB,DISP=SHR For BLSCECT
// DD DSN=CICSTS32.CICS.SDFHPARM,DISP=SHR For DFHIPCSP
The CICS-supplied DFHIPCSP member contains entries like the following:
/* ================================================================ */
EXIT EP(DFHPD212) VERB(CICS212) ABSTRACT(+
'CICS Version 2 Release 1.2 analysis')
EXIT EP(DFHPD321) VERB(CICS321) ABSTRACT(+
'CICS Version 3 Release 2.1 analysis')
EXIT EP(DFHPD330) VERB(CICS330) ABSTRACT(+
'CICS Version 3 Release 3 analysis')
EXIT EP(DFHPD410) VERB(CICS410) ABSTRACT(+
'CICS Version 4 Release 1 analysis')
EXIT EP(DFHPD510) VERB(CICS510) ABSTRACT(+
'CICS Transaction Server for OS/390 Release 1 analysis')
EXIT EP(DFHPD520) VERB(CICS520) ABSTRACT(+
'CICS Transaction Server for OS/390 Release 2 analysis')
EXIT EP(DFHPD530) VERB(CICS530) ABSTRACT(+
'CICS Transaction Server for OS/390 Release 3 analysis')
EXIT EP(DFHPD610) VERB(CICS610) ABSTRACT(+
'CICS Transaction Server for z/OS V2 R1 analysis')
EXIT EP(DFHPD620) VERB(CICS620) ABSTRACT(+
'CICS Transaction Server for z/OS V2 R2 analysis')
EXIT EP(DFHPD630) VERB(CICS630) ABSTRACT(+
'CICS Transaction Server for z/OS V2 R3 analysis')
EXIT EP(DFHPD640) VERB(CICS640) ABSTRACT(+
'CICS Transaction Server for z/OS V3 R1 analysis')
EXIT EP(DFHPD650) VERB(CICS650) ABSTRACT(+
'CICS Transaction Server for z/OS V3 R2 analysis')
/* ================================================================ */
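When DFHPD650 has been defined to IPCS in this way, you can invoke it against a system dump with the VERBEXIT subcommand. For example (the component keywords and levels shown here are illustrative):
VERBEXIT CICS650 'DEF=0,KE=1,TR=1'
You can equally invoke the exit by module name, as VERBX DFHPD650 followed by the same parameter string.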
You can use formatting keywords to format those parts of the dump that interest
you at any particular time, at specific levels of detail. You have the option of
formatting other parts later for further investigation by you or by the IBM service
organizations. It is advisable to copy your dumps so that you can save the copy
and free the dump data set for subsequent use.
If you omit all of the component keywords, and you do not specify DEF=0, the CICS
dump exit formats dump data for all components.
The CICS dump component keywords, and the levels you can specify for each of
them, are as follows:
AI [={0|2}]
Autoinstall model manager.
AI=0 Suppress formatting of AI control blocks.
AI=2 Format AI control blocks.
AP [={0|1|2|3}]
Application domain.
AP=0 Suppress formatting of AP control blocks.
AP=1 Format a summary of addresses of the AP control blocks for each
active transaction.
AP=2 Format the contents of the AP control blocks for each active
transaction.
AP=3 Format level-1 and level-2 data.
APS=<TASKID=Task identifier>
Application selection. The APS component keyword allows you to limit
the formatting of system dumps to only those storage areas relating to
the task identifier specified. Contents of the application domain control
blocks for the specified transaction will be listed along with language
environment storage areas for the same transaction.
Note: You must use angled brackets around the specified parameter.
BA [={0|1|2|3}]
Business application manager domain.
BA=0 Suppress formatting of business application manager domain control
blocks.
Note: IPCS does not produce page numbers if formatting directly to the
terminal.
IS[={0|1|2|3}]
The IP interconnectivity domain.
IS=0 Suppress formatting of IS domain information.
IS=1 Format the summary of IPCONN definitions and their sessions.
IS=2 Format the IS domain control blocks.
IS=3 Format level-1 and level-2 data.
JCP [={0|2}]
The journal control area.
JCP=0
Suppress formatting of the JCA.
JCP=2
Format the JCA.
KE[={0|1|2|3}]
The CICS kernel.
KE=0 Suppress formatting of the kernel control blocks.
KE=1 Format the stack and a summary of tasks.
KE=2 Format the anchor block.
KE=3 Format level-1 and level-2 data.
LD[={0|1|2|3}]
The loader domain.
LD=0 Suppress formatting of loader domain control blocks.
LD=1 Format the loader domain summary.
LD=2 Format the loader domain control blocks.
LD=3 Format level-1 and level-2 data.
You can also specify the options for TS, the temporary storage domain, with angled brackets, as follows:
TS=<1>
Summary
TS=<2>
Format control blocks
TS=<3>
Consistency checking of the TS buffers with the TS control blocks
You can specify more than one of these values between angled brackets. For
example, TS=<1,2> gives summary and formatting of control blocks without
consistency checking.
UEH[={0|2}]
The user exit handler.
UEH=0
Suppress formatting of control blocks.
UEH=2
Format control blocks.
US[={0|1|2|3}]
The user domain.
US=0 Suppress formatting of user domain control blocks.
US=1 Format the user domain summary.
US=2 Format the control blocks.
US=3 Format level-1 and level-2 data.
WB[={0|1|2}]
The web interface.
WB=0 Suppress formatting of web interface control blocks.
WB=1 Format the web interface summary. This displays the current state of
the CICS web interface, followed by a summary of the state blocks
controlled by the state manager.
WB=2 Format the control blocks. This displays the current state of the CICS
web interface, followed by the web anchor block, the global work area
and associated control blocks, and the web state manager control
blocks.
XM[={0|1|2|3}]
The transaction manager.
XM=0 Suppress formatting of transaction manager control blocks.
For a more detailed list of the contents of SDUMPs for each of the VERBEXIT
keywords, see Appendix A, “SDUMP contents and IPCS CICS VERBEXIT
keywords,” on page 315.
The different parts of the transaction dump are dealt with in the order in which they
appear, but be aware that only those parts that users need for problem
determination are described. Some control blocks that do appear in the
transaction dump are intended for the problem determination purposes of IBM
Service, and are not described in this section.
Transaction storage
“Transaction storage” is storage that might have been obtained by CICS to store
information about a transaction, or it might have been explicitly GETMAINed by the
transaction for its own purposes.
You are likely to find several such areas in the dump, each introduced by a header
describing it as transaction storage of a particular class, for example:
USER24
USER31
CICS24
CICS31
Transaction storage class CICS31 contains, among other things, the transaction
abend control block (TACB). To find it, look for the eye-catcher DFHTACB.
DFHTACB contains valuable information relating to the abend. It contains:
v The PSW and general purpose registers of the program executing at the time of
the abend (for local AICA, ASRA, ASRB, and ASRD abends only; for some AICA
abends, only the "next sequential instruction" part of the PSW and the registers
are present). Registers 12 and 13 contain the addresses of the TCA and CSA
respectively in all abends except ASRA and ASRB abends, for which these
registers contain the data current at the time of the abend.
v The name of the failing program.
v The offset within the failing program at which the abend occurred (for local
ASRA, ASRB and ASRD abends only).
v The execution key at the time of the abend (for local ASRA and ASRB abends
only).
v Whether the abend was caused by an attempt to overwrite a protected CICS
DSA (local ASRA abends only).
v Whether the program is executing in a subspace or the basespace.
v The subspace STOKEN.
v The subspace's associated ALET.
Note that if the abend originally occurred in a remote DPL server program, an
eye-catcher *REMOTE* is present. If this is the case, the registers and PSW are
not present.
If you did not have trace running, you need to rely on values in registers belonging
to your programs, and try to relate these to your source statements. That might lead
you to an EXEC CICS command, or to another type of statement in your program.
The procedure is outlined in “Last statement identification.”
You need to look in the appropriate programming language manual for details of the
structure of the program's acquired storage.
CICS does not support 64-bit addressing for applications, but programs can use
storage at addresses which are only available when CICS is running on 64-bit
architecture machines. The CICS dump formatter displays the contents of the 64-bit
General Purpose Registers captured when an abend occurs.
v For PL/I programs, TCAPCDSA addresses the chain of PL/I DSAs.
v For COBOL programs, TCAPCDSA addresses the task global table (TGT) and
working storage.
v For assembler programs, TCAPCDSA addresses the DFHEISTG storage.
For C/370 programs, the registers are used as follows:

Register   Use
3          In most circumstances, the base register
12         Holds the address of the CICS TCA for the C/370 program
13         Holds the address of the register save area
The register save area at INIT1+X'48' (covering registers 0 through 14) should show
register 12 pointing to the program global table (PGT), register 13 pointing to the
task global table (TGT), and other registers pointing to locations in the data area
and compiled code of the program storage. If not, a CICS error is indicated.
For each invocation of the COBOL program, CICS copies the static TGT from
program storage into CICS dynamic storage (the COBOL area) and uses the
dynamic copy instead of the static one. CICS also copies working storage from
program storage to the COBOL area, above the TGT. Task-related COBOL areas
thus enable the single copy of a COBOL program to multithread as if it were truly
reentrant.
The TGT is addressed by TCAPCHS in the system part of the TCA. The TGT is
used to hold intermediate values set at various stages during program execution.
The first 18 words of the TGT constitute a standard save area, in which the
program's current registers are stored on any request for CICS service.
Storage freeze
Certain classes of CICS storage that are normally freed during the processing of a
transaction can, optionally, be kept intact and freed only at the end of the
transaction.
Then, in the event of an abend, the dump contains a record of the storage that
would otherwise have been lost, including the storage used by CICS service
modules. The classes of storage that can be frozen in this way are those in the
teleprocessing and task subpools, and in terminal-related storage (TIOAs).
The storage freeze function is invoked by the CSFE transaction. For information
about using CSFE, see CICS Supplied Transactions.
The key of each entry in this list is the table name, and the first word of the
adjunct area is the corresponding data list number. If the table is open, entry
data is present containing a list of information about the current table users
(regions that have the table open). Each one is identified by its MVS system
name and CICS APPLID. The number of users is at +X'14' in the adjunct area.
After any valid table user elements, the rest of the data area is uninitialized and
can contain residual data up to the next 256-byte boundary.
3. To display the table data, convert the data list number to decimal and specify
it on another STRDATA subcommand. For example, if the first word of the
adjunct area is X'00000027', the command to display the table data is as
follows:
STRDATA DETAIL LISTNUM(39) ENTRYPOS(ALL)
In the data list, the key of each entry is the record key, and the data portion
contains the user data with a 2-byte length prefix (or a 1-byte X'80' prefix if the
data length is 32767 bytes). The rest of any data area is uninitialized and can
contain residual data up to the next 256-byte boundary. The entry version
contains the time stamp at the time the record was last modified or created.
The adjunct area contains information for locking and recovery. It contains null
values (binary zeros) if the record is not locked. When the record is locked, the
lock owner APPLID and UOWID appear in this area.
If a record has a recoverable rewrite pending, there are two entries with the
same key, where the second entry is the before-image.
For information about the STRDATA subcommand and its options, see OS/390 MVS
IPCS Commands, GC28-1754.
The key of each entry in this list is the counter name. The version field contains the
counter value minus its maximum value minus 3 (in twos-complement form), which
has the effect that all counters have a value of -2 when they have reached their
limit (the value of -1, equal to all high values, never occurs). The start of the
adjunct area contains the 8-byte minimum and maximum limit values that apply to
the counter.
For information about the STRDATA subcommand and its options, see OS/390 MVS
IPCS Commands, GC28-1754.
The key of each entry in this list is the queue name. For a small queue, the first
word of the adjunct area is the total length of queue data which is included in
the queue index entry; for a queue that has been converted to the large format,
the first word contains zero. The second word of the adjunct area is the number
of the corresponding data list if the queue has been converted to the large
format, or zero otherwise.
For a small queue, the queue data is stored as the data portion of the queue
index entry, with a 2-byte length prefix preceding each item.
In the data list, the key of each entry is the item number, and the data portion
contains the item data with a 2-byte length prefix. The rest of any data area is
uninitialized and may contain residual data up to the next 256-byte boundary.
For information about the STRDATA subcommand and its options, see OS/390 MVS
IPCS Commands, GC28-1754.
Typically, the global trap/trace exit is used to detect errors that cannot be diagnosed
by other methods. These might cause intermittent problems that are difficult to
reproduce, the error perhaps occurring some time before the effects are noticed.
For example, a field might be changed to a bad value, or some structure in storage
might be overlaid at a specific offset.
The code in DFHTRAP must not make use of any CICS services, cause the current
task to lose control, or change the status of the CICS system.
A DSECT (DFHTRADS) is supplied to map the parameter list passed to DFHTRAP; it contains the addresses of:
v The return-action flag byte
v The trace entry that has just been added to the table
The DSECT also contains EQU statements for use in setting the return-action flag
byte.
The exit can look at data from the current trace entry to determine whether or not
the problem under investigation has appeared. It can also look at the TCA of the
current task, and the CSA. The DSECTs for these areas are included in the
skeleton source.
The CSA address is zero for invocations of DFHTRAP early in initialization, before
the CSA is acquired.
To reactivate the trap exit when it has been disabled, use CSFE
DEBUG,TRAP=ON, unless the exit routine is to be replaced; in that case, use the
replacement sequence shown below.
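In outline, and assuming the standard CSFE and CEMT commands (a sketch of the usual sequence, not a definitive procedure), replacing the current version of the exit routine looks like this:
CSFE DEBUG,TRAP=OFF
CEMT SET PROGRAM(DFHTRAP) NEWCOPY
CSFE DEBUG,TRAP=ON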
The skeleton program shows how to make a further trace entry. When DFHTRAP
detects a TS GET request, it asks for a further trace entry to be made by entering
the data required in the area supplied for this purpose, and by setting the
appropriate bit in the return-action flag byte.
The trace domain then makes a trace entry with trace point ID TR 0103,
incorporating the information supplied by the exit.
Trace entries created in this way are written to any currently active trace
destination. This could be the internal trace table, the auxiliary trace data set, or the
GTF trace data set.
The skeleton DFHTRAP also shows how to detect the trace entry made by the
storage manager (SM) domain for a GETMAIN request for a particular subpool.
This is provided as an example of how to look at the data fields within the entry.
If a program check occurs in DFHTRAP, the trace domain then:
1. Marks the exit as unusable
2. Issues message DFHTR1001 to the system console
3. Takes a CICS system dump with dump code TR1001, showing the PSW and
registers at the time of the interrupt
4. Continues (ignoring the exit on future invocations of the trace domain).
To recover from this situation, use the replacement sequence given earlier to install
a corrected version of the exit routine.
This section helps you decide when to contact the Support Center, and what
information you need to collect before contacting the Center. The section also gives
you an understanding of the way in which IBM Program Support works.
In practice, many errors reported to Program Support turn out to be user errors, or
they cannot be reproduced, or they need to be dealt with by other parts of IBM
Service. This indicates just how difficult it can be to determine the precise cause of
a problem. User errors are mainly caused by faults in application programs and
errors in setting up systems. TCT parameters, in particular, have been found to
cause difficulty in this respect.
The Support Center needs to know as much as possible about your problem, and
you should have the information ready before making your first call. It is a good
idea to record the information on a problem reporting sheet.
When you call, the operator uses your customer information to access your
customer profile, which contains details of your address, relevant contact names,
telephone numbers, and details of the IBM products at your installation.
The Support Center operator asks you if this is a new problem, or a further call on
an existing one. If it is new, you are assigned a unique incident number. A problem
management record (PMR) is opened on the RETAIN system, where all activity
associated with your problem is recorded. The problem remains “open” until it is
solved.
Make a note of the incident number on your own problem reporting sheet. The
Center expects you to quote the incident number in all future calls connected with
this problem.
If the problem is new to you, the operator asks you for the source of the problem
within your system software—that is, the program that seems to be the cause of the
problem. As you are reading this book, it is likely that you have already identified
CICS as the problem source. You also need to give the version and release
number, for example Version 4 Release 1.
You need to give a severity level for the problem. Severity levels can be 1, 2, or 3.
They have the following meanings:
v Severity level 1 indicates that you are unable to use a program, resulting in a
critical condition that needs immediate attention.
v Severity level 2 indicates that you are able to use the program, but that operation
is severely restricted.
v Severity level 3 indicates that you are able to use the program, with limited
functions, but the problem is not critical to your overall operation.
Finally, the call receipt operator offers you a selection of specific component areas
within CICS (for example, terminal control, file control) and asks you to choose the
area where your problem appears to lie. Based on this selection, the operator can
route your call to a specialist in the chosen area.
You are not asked for any more information at this stage. However, you need to
keep all the information relevant to the problem, and any available documentation
such as dumps, traces, and translator, compiler, and program output.
At first, a support center representative uses the keywords that you have provided
to search the RETAIN database. If your problem is found to be one already known
to IBM, and a fix has been devised for it, a Program Temporary Fix (PTF) can
quickly be dispatched to you.
Let the representative know if any of the following events occurred before the
problem appeared:
v Changes in level of MVS or licensed programs
v Regeneration of any product
v PTFs applied
v Additional features used
v Application programs changed
v Unusual operator action.
You might be asked to give values from a formatted dump or trace table. You might
also be asked to carry out some special activity, for example to set a trap, or to use
trace with a certain type of selectivity, and then to report on the results.
If the problem is new, an APAR may be submitted. This is dealt with by the CICS
change team. See Chapter 21, “APARs, fixes, and PTFs,” on page 309.
When the change team solves the problem, they produce a fix enabling you to get
your system running properly again. Finally, a PTF is produced to replace the
module in error, and the APAR is closed.
When the APAR is entered, you are given an APAR number. You must write this
number on all the documentation you submit to the change team. This number is
always associated with the APAR and its resolution and, if a code change is
required, it is associated with the fix as well.
The next stage in the APAR process, getting relevant documentation to the change
team, is up to you.
Make sure the problem you have described can be seen in the documentation you
send. If the problem has ambiguous symptoms, you need to reveal the sequence of
events leading up to the failure. Tracing is valuable in this respect, but you might be
able to provide details that trace cannot give. You are encouraged to annotate your
documentation, if your annotation is legible and if it does not cover up vital
information. You can highlight data in any hard copy you send, using transparent
highlighting markers. You can also write notes in the margins, preferably using a red
pen so that the notes are not overlooked.
If you include any magnetic tapes, ensure that this is clearly indicated on the
outside of the box. This lessens the chance of their being stored in magnetic fields
strong enough to damage the data.
To make sure the documentation reaches the correct destination, that is, the CICS
change team, the box should be marked:
SHIP TO CODE 5U6
You also need a mailing label with the address of the CICS change team on it.
When the change team receives the package, this is noted in your APAR record on
the RETAIN system. The team then investigates the problem. Occasionally, they
need to ask the Support Center to contact you for more documentation, perhaps
specifying some trap you must apply before getting it.
When the problem is solved, a code is entered on RETAIN to close the APAR, and
you are provided with a fix.
You can enquire any time at your Support Center on how your APAR is progressing,
particularly if it is a problem of high severity.
When the team is confident that the fix is satisfactory, the APAR is certified by the
CICS development team and the APAR is closed. You receive notification when this
happens.
If you cannot assemble the module yourself, because it involves a part of CICS that
is object serviced, you might be supplied with a ZAP or a TOTEST PTF.
If you want a PTF to resolve a specific problem, you can order it explicitly by its
PTF number through the IBM Support Center. Otherwise, you can wait for the PTF
to be sent out on the standard distribution tape.
The first table provides a list of IPCS CICS VERBEXIT keywords and the CICS
control blocks that they display.
The second table provides a list of all CICS control blocks in an SDUMP,
alphabetically, with their associated IPCS CICS VERBEXIT keyword.
PG keyword
The summaries appear below in the sequence in which they appear in a dump.
This is broadly the sequence in which the control blocks are listed in Appendix A,
"SDUMP contents and IPCS CICS VERBEXIT keywords," on page 315, but note:
v The system LLE summary, if present, follows the PGA summary, but the task LLE
summary, if present, follows the PTA summary.
v The HTB does not appear in a summary.
PGWE Summary
PGWE-ADD
Address of suspended program.
PROGRAM
Name of suspended program.
SUS-TOKN
Suspend token.
PPTE-ADD
Program PPTE address.
PPTE Summary
PPTE ADDRESS
Address of PPTE block.
PROGRAM NAME
The tables are indexed using the program name.
MOD TYPE
Module type, one of the following:
v PG - Program
v MP - Mapset
v PT - Partitionset.
LANG DEF
Language defined, one of the following:
v NDF - Not defined
v ASS - Assembler
v C - C
v COB - OS/VS COBOL
v CO2 - Enterprise COBOL or VS COBOL II
v LE3 - Le370
v PLI - PL/I
LANG DED
Language deduced, one of the following:
v NDD - Not deduced
v ASS - Assembler
v C - C
v COB - OS/VS COBOL
v CO2 - Enterprise COBOL or VS COBOL II
v LE3 - Le370
v PLI - PL/I
PTA Summary
TRAN NUM
Transaction number.
PTA ADDRESS
Address of PTA.
LOG-LVL
Logical level count in decimal.
SYS-LVL
System level count in decimal.
TASK-LLE
Address of task LLE head, zero if no task LLE exists.
PLCB Address of PLCB head, or zero if no PLCB exists.
CHCB Summary
CHANNEL
Channel name (followed by *CURRENT* if it is the program's current
channel).
CHCB CHCB address.
LEN Total length of all containers in the channel.
CCSID
Default coded character set ID for the channel.
GN Generation number.
CPCB Address of container pool control block.
CRCB Summary
CONTAINER
Container name.
TYPE Container type. The type is one of the following:
CICS An internal system container.
R/O A read-only container.
USER A user-data container.
CRCB CRCB address.
LEN Length of data in the container.
CCSID
The default coded character set ID for the container or, if the container was
created with the BIT option, DTYPE(BIT).
GN Generation number.
CSCB
CSCB anchor address.
US keyword
A level-1 dump summarizes only the user domain data (USUD). The fields
displayed are the same for each type of USUD (principal, session, or EDF).
USXD summary
TRAN NUM
Transaction number.
PRINCIPAL TOKEN
Principal token, if any.
SESSION TOKEN
Session token, if any.
EDF TOKEN
EDF token, if any.
USUD summary
TOKEN
User token.
USERID
User identifier.
GROUPID
Group identifier.
ADDCOUNT
Adduser use count.
TRNCOUNT
Transaction use count.
OPID Operator identifier.
CLASSES
A bitmap expressing the operator classes in order 24 to 1.
PRTY Operator priority.
TIMEOUT
Timeout interval in hours and minutes (hh:mm).
ACEE Address of ACEE.
XRFSOFF
XRF user signon. Can be NOFORCE or FORCE.
USERNAME
User name.
PDF-only books
The following books are available in the CICS Information Center as Adobe
Portable Document Format (PDF) files:
Licensed publications
The following licensed publications are not included in the unlicensed version of the
Information Center:
CICS Diagnosis Reference, GC34-6862
CICS Data Areas, GC34-6863-00
CICS Supplementary Data Areas, GC34-6864-00
CICS Debugging Tools Interfaces Reference, GC34-6865
Subsequent updates are usually available in softcopy before they are available in
hardcopy, so at any time from the availability of a release, the softcopy versions
should be regarded as the most up-to-date.
For CICS Transaction Server books, these softcopy updates appear regularly on the
Transaction Processing and Data Collection Kit CD-ROM, SK2T-0730-xx. Each
reissue of the collection kit is indicated by an updated order number suffix (the -xx
part). For example, collection kit SK2T-0730-06 is more up-to-date than
SK2T-0730-05. The collection kit is also clearly dated on the cover.
Accessibility
Accessibility features help a user who has a physical disability, such as restricted
mobility or limited vision, to use software products successfully.
You can perform most tasks required to set up, run, and maintain your CICS system
in one of these ways:
v using a 3270 emulator logged on to CICS
v using a 3270 emulator logged on to TSO
v using a 3270 emulator as an MVS system console
Index 347
D DFHKC TYPE=WAIT macro (continued)
DCI=LIST option 129
data corruption
DCI=SINGLE option 129
bad programming logic 188
DCI=TERMINAL option 129
incorrect mapping to program 188
DFHPRIN resource type 112
incorrect mapping to terminal 189
DFHSIPLT resource name 117
attributes of fields 189, 190
DFHSIPLT resource type 112
DARK field attribute 190
DFHTACB 293, 294
MDT 190
PSW 293
modified data tag 190
registers 293
symbolic map 189
DFHTEMP resource name 120
incorrect records in file 188
DFHTRADS DSECT 299
missing records in file 188
DFHZARER resource name 122
possible causes 188
DFHZARL1 resource name 122
data exception 29
DFHZARL2 resource name 122
DATABUFFERS parameter of FILE resource
DFHZARL3 resource name 122
definition 83
DFHZARL4 resource name 122
DB2 migration considerations
DFHZARQ1 resource name 122
DSNTIAR 33
DFHZARR1 resource name 122
DB2 resource type 112
DFHZCRQ1 resource name 121
DB2_INIT resource type 112
DFHZDSP resource name 119
DB2CDISC resource type 112
DFHZEMW1 resource name 121
DB2EDISA resource type 112
DFHZERH1 resource name 122
DB2START, resource name 121
DFHZERH2 resource name 122
DBCTL (database control)
DFHZERH3 resource name 122
abends 35
DFHZERH4 resource name 122
connection fails 127
DFHZIS11 resource name 121
disconnection fails 127
DFHZRAQ1 resource name 121
immediate disconnection 127
DFHZRAR1 resource name 121
orderly disconnection 127
diagnostic run, of CICS 218
waits 126
DISOSS, communication with CICS 57
DBCTL resource type 112, 127
DISPATCH resource type 112
DBDXEOT resource type 112
dispatcher
DBDXINT resource type 112
dispatch, suspend and resume cycle 160, 164
DBUGUSER resource name 112
failure of tasks to get attached 160, 161
DCT resource name 119
failure of tasks to get initial dispatch 160, 163
deadlock time-out interval
functions of gate DSSR 109
EXEC CICS WRITEQ TS command 69
suspension and resumption of tasks 109
interval control waits 75
tracing the suspension and resumption of tasks 51
task storage waits 67
dispatcher wait
deadlocks
JVM_POOL 124
resolving 100
OPEN_DEL 125
resolving in a sysplex 103
OPENPOOL 124
debugging
distributed transaction processing (DTP) 66
IRC problems 203
DLCNTRL resource name 110
multiregion operation problems 203
DLCONECT resource name 110
DEF parameter of CICS dump exit 281
DLSUSPND resource name 112
destination control table (DCT)
DMATTACH resource type 112
extrapartition transient data destination 134
DMB (data management block)
logically recoverable queues 134
load I/O 96
DFHAIIN resource type 112
DMWTQUEU resource name 110
DFHAUXT 241
domain identifying codes 234
DFHBUXT 241
DS_NUDGE resource name 120
DFHCPIN resource type 112
DSA (dynamic storage area)
DFHDMPA dump data set 256
current free space 68
DFHDMPB dump data set 256
storage fragmentation 68
DFHEIB 292
DSNTIAR 33
EIBFN 292
DSSR gate of dispatcher domain
DFHKC TYPE=DEQ macro 131
tracing the functions 51
DFHKC TYPE=ENQ macro 129
interpreting the trace table 51
DFHKC TYPE=WAIT macro 129
tracing the input and output parameters 52
DCI=CICS option 129
Index 349
EDSA (extended dynamic storage area) (continued) execution diagnostic facility (EDF)
storage fragmentation 68 investigating loops 157
EIBFN 292 execution exception 29
in last command identification 294 Execution key 291
EKCWAIT resource type 113 exit programming interface (XPI)
EMP (event monitoring point) 19 correctness of input parameters 4
ENF resource type 113 need to observe protocols and restrictions 4
ENQUEUE on single server resource 130 problems using 4
ENQUEUE resource type 85, 113, 114, 129, 130 restrictions in user exits 4
BDAM record locking 92 suspension and resumption of tasks 109
ESDS write lock 93 SYSTEM_DUMP call 257
KSDS range lock 92 TRANSACTION_DUMP call 257
VSAM load mode lock 93 extended-format trace 248
VSAM record locking 91 EXTENDEDDS attribute, TYPETERM 176, 180
enqueue waits 71 extrapartition transient data waits 134
ERDSA resource type 114
error code 42
error data 42 F
error number 41 FCACWAIT resource type 114
error type 42 FCBFSUSP resource type 83, 114
ESDSA resource type 114 FCCAWAIT resource type 83, 114
EUDSA resource type 114 FCCFQR resource type 84, 114
event monitoring point (EMP) 19 FCCFQS resource type 84, 114
exceeding the capacity of a log stream 206 FCCRSUSP resource type 114
exception trace FCDSESWR resource name 113
characteristics 226 FCDSLDMD resource name 113
CICS system abends 38 FCDSRECD resource name 113
destination 226 FCDSRNGE resource name 113
format 227 FCDWSUSP resource type 84, 114
missing trace entries 171 FCFLRECD resource name 113
purpose 226 FCFLUMTL resource name 113
storage violation 193, 194, 196 FCFRWAIT resource type 114
user 227 FCFSWAIT resource type 85, 114
EXCLOGER resource name 111 FCINWAIT resource type 114
exclusive control of volume conflict 107 FCIOWAIT resource type 85, 114
EXEC CICS ABEND command 258 FCIRWAIT resource type 85, 114
EXEC CICS DELAY command 75 FCPSSUSP resource type 86, 115
EXEC CICS DUMP TRANSACTION command 257, FCQUIES resource type 86, 115
269 FCRAWAIT resource type 86, 115
EXEC CICS ENTER TRACENUM command 227 FCRBWAIT resource type 87, 115
EXEC CICS INQUIRE TASK FCRDWAIT resource type 87, 115
SUSPENDTYPE field 50 FCRPWAIT resource type 87, 115
SUSPENDVALUE field 50 FCRRWAIT resource type 87, 115
EXEC CICS PERFORM DUMP command 257 FCRVWAIT resource type 88, 115
EXEC CICS POST 75 FCSRSUSP resource type 86, 115
EXEC CICS READ UPDATE command 90 FCTISUSP resource type 88, 115
EXEC CICS RETRIEVE WAIT command 75 FCXCPROP resource type 115
EXEC CICS REWRITE command 90 FCXCPROT resource type 89
EXEC CICS START command 75, 160, 161 FCXCSUSP resource type 89, 115
EXEC CICS STARTBR command 90 FCXDPROP resource type 115
EXEC CICS WAIT EVENT command 75 FCXDPROT resource type 89
EXEC CICS WRITE command 90 FCXDSUSP resource type 89, 115
EXEC CICS WRITE MASSINSERT command 90 file accesses, excessive 14
EXEC CICS WRITEQ TS command 69 file control waits 80
NOSUSPEND 69 BDAM record locking 92
REWRITE option 69 drain of RLS control ACB 87
EXEC interface block (EIB) ESDS write lock 93
EIBFN 292 exclusive control conflict 89
EXECADDR resource name 113 exclusive control deadlock 89, 90
EXECSTRN resource name 113 FC environment rebuild 85
file state changes 85
Index 351
incorrect output (continued) interval control (continued)
no output obtained (continued) performance considerations 160
using execution diagnostic facility 184 waits 74
using statistics 184, 185 deadlock time-out interval 75
using trace 183 systematic investigation 76
passed information 21 interval control element (ICE) 186
printed output wrong 174 INTTR, system initialization parameter 243
source listings 18 IRC (interregion communication) 203
statistics 19 poor performance 161
symptom keyword 7 waits 66, 133
symptoms 11 IRLINK resource type 57, 116
temporary storage 20 IS_ALLOC resource type 116
terminal data 20 IS_ERROR resource type 116
terminal output wrong 174 IS_INPUT resource type 116
trace 21 IS_PACE resource type 116
trace data wrong 168 IS_RECV resource type 116
trace destination wrong 167 IS_SESS resource type 116
trace entries missing 169 ISMM access method 59
trace output wrong 167
transaction inputs and outputs 20
transient data 20 J
unexpected messages 11 job log 291
user documentation 17 JOB parameter of CICS dump exit 281
wrong CICS components being traced 169 JOURNALS resource name 113
wrong output obtained 187 JVM tracing 237
possible causes 188 activating 237
wrong tasks being traced 169 defining 237
INCORROUT symptom keyword 7 JVM_POOL resource name 112
INDEXBUFFERS parameter of FILE resource JVM_POOL wait 124
definition 83 JVMxxxxTRACE system initialization parameters 237
information sources 17
INFORMATION/ACCESS licensed program 7
INITIAL resource name 111 K
initialization stall 104 Katakana terminals 179
INQUIRE UOWENQ command mixed English and Katakana characters 179
deadlock diagnosis 103 KCADDR resource name 113
INQUIRE_ resource name 121 KCCOMPAT resource type 117, 129
interactive problem control system (IPCS) resource names 129
analyzing CICS system dumps 279 CICS 129
CICS dump exit 280 LIST 129
CICS system abends 37 SINGLE 129
internal trace 249 SUSPEND 129
abbreviated-format 250 TERMINAL 57, 129
characteristics 241 KCSTRNG resource name 113
controlling 243 kernel domain
destination 241 information given in dump 39
exception trace destination 241 error code 42
extended-format 248 error data 42
formatting 247 error table 41
interpreting 248, 250 error type 42
short-format 249 failing program 42, 46
status 241 kernel error number 41
trace entries missing 170 point of failure 42
trace table size 241 PSW at time of error 42
changing the size dynamically 241 registers at time of error 42
wrapping 241 storage addressed by PSW 43
intersystem communication (ISC) storage addressed by registers 43
poor performance 161 task error information 41
waits 66, 133 task summary 39
interval control tasks in error 40, 41
element 76 linkage stack 44
Index 353
messages (continued)
   transaction abend 25
   transaction disabled 185
   unexpected 11, 176
MISCELANEOUS resource name 119
missing trace entries 169
module index
   in transaction dump 293
MONITOR POINT command 19
monitoring point 19
MQseries resource type 117
MRCB_xxx resource type 118
MRO waits 66, 133
MROQUEUE resource name 112
MSBRETRN resource name 113
multiregion operation using IRC 203
multiregion operation waits 66, 133
MVS ABEND macro 27
MVS console
   CICS termination message 8
MVS logger availability check 206
MVS RESERVE locking
   CEDA in a wait state 138
   CESN in a wait state 138
   CICS system stalls 107
   effect on CICS performance 165
   transient data extrapartition waits 134
   VSAM I/O waits 85
   waits during XRF takeover 138
   waits on resource type XRPUTMSG 138
MXT (maximum tasks value)
   effect on performance 161
   kernel task summary 40
   possible cause of CICS stall 106
   reason for task failing to start 10
   waits 96
      XM_HELD resource type 96
MXT resource type 118

N
NetView 108
networks
   messages 58
   preliminary checks 5
NOSYSDUMP, system dump code attribute 173
NOTRANDUMP, transaction dump code attribute 173

O
ONC/RPC 7
open transaction environment
   TCB stealing 125
OPEN_DEL resource name 112
OPEN_DEL wait 125
OPENPOOL resource name 112
OPENPOOL wait 124
operation exceptions 29
output
   absence when it is expected 12
   none obtained 182
      ATI tasks 183, 186
      disabling the transaction 185
      explanatory messages 182
      finding if the task ran 183
      looking at files 185
      looking at temporary storage queues 185
      looking at transient data queues 185
      possible causes 182
      START PRINTER bit on write control character 183
      task not in system 183
      task still in system 182
      testing the terminal status 182
      using CECI 185
      using execution diagnostic facility 184
      using statistics 184, 185
      using trace 183
   repetitive 13
   wrong 187
      possible causes 188

P
PAGESIZE attribute, TYPETERM 180
PC, communication with CICS 57
PERFM symptom keyword 7
performance
   bottlenecks 159, 165
      dispatch, suspend and resume cycle 160
      initial attach to the dispatcher 160
      initial attach to the transaction manager 159
      initial dispatch 160
   dispatch, suspend and resume cycle 164
   extrapartition transient data 165
   initial attach to the dispatcher 161
   initial attach to the transaction manager 160
   initial dispatch to the dispatcher 163
   interval control delays 160
   MXT limit 161
   performance class monitoring 163
   poor
      at peak system load times 9
      finding the bottleneck 159
      investigating 159
      lightly loaded system 9
      possible causes 9
      symptom keyword 7
      symptoms 9, 12, 14, 159
   remote system status 161
   system loading 165
   task control statistics 161
   task priority 163
   task time-out interval 165
   terminal status 160
   use of MVS RESERVE 165
   using trace 162
performance class monitoring 163
PIIS resource type 118
PL/I application programs
   locating the DSA chain 295
R
RCP_INIT resource type 118
RDSA resource type 118
RECEIVE resource name 119
record locking
   BDAM data sets 92
   VSAM data sets 91
registers
   at time of error 42
   CICS system abends 42
   data addressed at the time of error 43
   in transaction abend control block 293
   in transaction dump 291
registers at last EXEC command 291
RELATED dump code attribute 258, 259
Remote abend indicator 291
resource definition online (RDO)
   ALTER mapset 5
   ALTER program 5
   ALTER transaction 5
   DEFINE mapset 5
   DEFINE program 5
   DEFINE transaction 5
   INSTALL option 5
resource names
   *CTLACB* 115
   AITM 112
   ASYNRESP 111
   ATCHMSUB 113
   CDB2TIME 121
   CEX2TERM 113
   CHANGECB 121, 132
   CICS 117
   CPI 112
   CSASSI2 111
   DB2START 121
   DBUGUSER 112
   DCT 119, 134
   DFH_STATE_TOKEN 121
   DFHSIPLT 117
   DFHTEMP 120
   DFHXMTA 115
   DFHZARER 122
   DFHZARL1 122
   DFHZARL2 122, 132
   DFHZARL3 122, 132
   DFHZARL4 122
   DFHZARQ1 122
   DFHZARR1 122
   DFHZCRQ1 121, 131
   DFHZDSP 119
   DFHZEMW1 121, 131
   DFHZERH1 122
   DFHZERH2 122
   DFHZERH3 122
   DFHZERH4 122, 132
   DFHZIS11 121
   DFHZRAQ1 121, 131
   DFHZRAR1 121, 131
   DLCNTRL 110
   DLCONECT 110
   DLSUSPND 112
   DMWTQUEU 110
   DS_NUDGE 120
   DTCHMSUB 113
   EARLYPLT 112
   ECBTCP 110
   EXCLOGER 111
   EXECADDR 113
   EXECSTRN 113
   FCDSESWR 113
   FCDSLDMD 113
   FCDSRECD 113
   FCDSRNGE 113
   FCFLRECD 113
   FCFLUMTL 113
   file ID 113, 114, 115
   GETWAIT 117
   HVALUE 51
   INITIAL 111
   INQ_ECB 121
   INQUIRE 132
   inquiring during task waits 50
   JOURNALS 113
   JVM_POOL 112
   KCADDR 113
   KCSTRNG 113
   LATE_PLT 112
   LG_MGRST 117
   LIST 117, 122
   LMQUEUE 94, 110
   LOGSTRMS 113
   LOT_ECB 112
   MAXSOCKETS 119
   message queue ID 121
   MISCELANEOUS 119
   module name 115, 116
   MROQUEUE 112
   MSBRETRN 113
   OPEN_DEL 112
   OPENPOOL 112
   PRM 112
   program ID 118
   PSINQECB 121, 132
   PSOP1ECB 121
   PSOP2ECB 122, 132
   PSUNBECB 122, 132
   QUIESCE 112
   RECEIVE 119
   RZCBNOTI 118
   SEND 119
   SHUTECB 111
   SINGLE 113, 117
   SIPDMTEC 110
   SMSYRE 119
   SMSYSTEM 119
   SO_LISTN 119
   SO_LTEPTY 119
   SO_LTERDC 119
   SO_NOWORK 119
   SOCLOSE 119
resource types (continued)
   MBCB_xxx 117, 136
   MQseries 117
   MRCB_xxx 118, 136
   MXT 118
   PIIS 118
   PROGRAM 118
   RCP_INIT 118
   RDSA 118
   RMCLIENT 118
   RMI 118
   RMUOWOBJ 118
   RZRSTRIG 118
   SDSA 118
   SMPRESOS 119
   SOCKET 118, 119
   SODOMAIN 119
   SOSMVS 125
   STP_TERM 119
   SUCNSOLE 119
   summary of possible values 109
   SUSPENDTYPE 51
   TCP_NORM 119
   TCP_SHUT 119
   TCTVCECB 119
   TD_INIT 119, 134
   TD_READ 120, 135
   TDEPLOCK 119, 134
   TDIPLOCK 119, 134
   TIEXPIRY 120
   TRANDEF 120
   TSAUX 69, 120
   TSBUFFER 69
   TSEXTEND 69
   TSIO 69
   TSIOWAIT 120
   TSPOOL 69, 120
   TSQUEUE 70
   TSSHARED 70, 120
   TSSTRING 70
   TSWBUFFR 70
   UDSA 67, 120
   USERWAIT 121, 129
   WBALIAS 121
   WEB_ECB 121
   XRGETMSG 121
   XRPUTMSG 138
   ZC 121, 131
   ZC_ZCGRP 121, 131
   ZC_ZGCH 121, 132
   ZC_ZGIN 121, 132
   ZC_ZGRP 121, 122, 132
   ZC_ZGUB 122, 132
   ZCIOWAIT 122, 132
   ZCZGET 122, 132
   ZCZNAC 122, 132
   ZXQOWAIT 122, 132
   ZXSTWAIT 122
resources
   DBCTL 126
   definition errors 5
   inquiring during task waits 50
   locks 94
      investigating waits 94
   log manager 127
   names 109
   storage manager 67
   task control 129
   temporary storage 69
   types 109
RETAIN problem management system
   APARs 309
   data base 7, 307
   problem management record 306
   symptom keywords 7
   using INFORMATION/ACCESS 7
RLS (record-level sharing)
   taking SMSVSAM dumps 261
RMCLIENT resource type 118
RMI resource type 118
RMUOWOBJ resource type 118
RZRSTRIG resource type 118
RZRTRAN resource type 118

S
SAA (storage accounting area)
   chains 192
   overlays 192, 194, 197
SCRNSIZE attribute, PROFILE 176, 180
SDSA resource type 118
SDUMP macro 257
   failure 257
   retry on failure 257
SEND resource name 119
SENDSIZE attribute, TYPETERM 180
short-format trace 249
SHUTECB resource name 111
SINGLE resource name 113, 117
SIPDMTEC resource name 110
SLIP trap, MVS 216
SMPRESOS resource type 119
SMSVSAM problems
   taking RLS-related dumps 261
SMSYRE resource name 119
SMSYSTEM resource name 119
SO_LISTN resource name 119
SO_LTEPTY resource name 119
SO_LTERDC resource name 119
SO_NOWORK resource name 119
SOCKET resource type 118, 119
SOCLOSE resource name 119
SODOMAIN resource type 119
SOS (short on storage)
   caused by looping code 14
   potential cause of waits 68
SOSMVS resource name 112
SYSTR, system initialization parameter 233

T
task control
   waits 129
      causes 129, 130
      failure of task to DEQUEUE on resource 131
      invalid ECB address 130
      resource type KCCOMPAT 129
      unconditional ENQUEUE on single server resource 130
      valid ECB address 130
task control area (TCA)
   in transaction dump 291
      system area 291
      user area 291
task termination
   abnormal 3
task tracing
   precautions when choosing options 169
   special 232
   standard 232
   suppressed 232
tasks
   abnormal termination 3
   ATI, no output produced 183, 186
      looking at the AID chain 186
      looking at the ICE chain 186
   conversation state with terminal 65
   dispatch, suspend and resume cycle 160, 164
   dispatching priority 163
   error data 42
   exclusive control deadlock 89, 90
   failure during MVS service call 42
   failure to complete 9, 10, 12
   failure to get attached to the dispatcher 160, 161
   failure to get attached to the transaction manager 159, 160
   failure to get initial dispatch 160, 163
   failure to start 9, 10, 12
   failure under the CICS RB 42
   identifying the AID 187
   identifying the ICE 186
   identifying, in remote region 66
   in a wait state 12
   in error 40
      identified in linkage stack 44
      information in kernel domain storage areas 41
   lock owning
      identifying a lock being waited on 94
   looping 141
      consumption of storage 68, 70
      identifying the limits 156
   MXT limit 161
   PSW at time of error 42
   reason for remaining on the AID chain 186
   registers at time of error 42
   runaway
      detection by MVS 42
      non-yielding loops 142
      storage report 43
      tight loops 142
   session state with VTAM 65
   slow running 10, 164
   subpool summary 68
   summary in kernel storage 39
   suspended 12
      inquiring on 12
      investigating 50
   task error information 41
   time-out interval 165
   tracing 169, 232
   transfer from ICE to AID chain 186
   waits 49
      CICS DB2 125
      DBCTL 126
      definition of wait state 49
      EDF 127
      log manager 127
      maximum task conditions 96
      on locked resources 94
      online investigation 50
      stages in resolving wait problems 49
      storage manager 67
      suspension and resumption of tasks 109
      system 110
      task control 129
      techniques for investigating 50
      temporary storage 69
      user 110
      using the formatted CICS system dump 50
      using trace 50
TCLASS resource type 119
TCP_NORM resource type 119
TCP_SHUT resource type 119
TCSESUSF 186
TCTTE (terminal control table terminal entry)
   in transaction dump 292
TCTTE chain, in terminal waits 63
TCTVCECB resource name 111
TCTVCECB resource type 119
TD_INIT resource type 119
TD_READ resource type 120
TDEPLOCK resource type 119
TDIPLOCK resource type 119
TDNQ resource name 114
temporary storage
   conditional requests for auxiliary storage 69
   consumption by looping tasks 70
   current free space 70
   no task output 185
   repetitive records 13
   summary 71
   unconditional requests for auxiliary storage 69
   waits 69
      unallocated space close to exhaustion 70
terminal control program (TCP) 63
TERMINAL resource name 117
trace (continued)
   formatted entry (continued)
      time of entry 248
   formatting 247
   from global trap/trace exit 300
   global trap/trace exit 299
   in problem determination
      loops 144, 155
      poor performance 162
      selecting destinations 240
      storage violations 195, 197
   incorrect output from
      investigating 167
      trace entries missing 169
      wrong CICS components being traced 169
      wrong data captured 168
      wrong destination 167
      wrong tasks being traced 169
   internal domain functions 224, 225
   interpreting 248, 250
      user entries 252
   interpreting user entries 252
   investigating waits 50, 51
      setting the tracing options 51
   last command identification 294
   last statement identification 294
   level-1 224, 225
   level-2 224, 226
   level-3 224, 226
   levels 224, 225, 226
   logic of selectivity 234
   master system trace flag 168, 233
   MVS GTF 242
      common destination 242
   overview of different types 223
   points 224, 225
      location 224, 225
   program check and abend 230
   repetitive output 13
   storage manager trace levels 225, 226
   suspension and resumption of tasks 51
      interpreting the trace table 51
   use in investigating no task output 183
   user 189
      checking programming logic 189
   user exception trace entries 227
   VTAM buffer
      description 231
      destination 231
      investigating logon rejection 176
      terminal waits 66
trace control transaction 243
TRANDEF resource type 120
TRANDUMP, transaction dump code attribute 173
transaction abends
   abend code 25
      documentation 26
      interpretation 25
   action of global trap/trace exit 300
   AICA 26, 142
   ASRA 26
   ASRB 27
   collecting the evidence 25
   CSMT log messages 8
   dump not made when expected 171
   getting a transaction dump 25
   investigating 25
   last command identification 294
   last statement identification 294
   message 25
   messages 3, 11
   storage violation 193
   system dump following 258
   transaction dump following 258
   worksheet 35
transaction deadlocks
   in a sysplex 103
   resolving 100
transaction dumps
   abbreviated-format trace table 292
   accompanying transaction abends 25
   common system area 292
   CSA 292
   CSAOPFL 292
   CWA 292
   destination 256
   DFHTACB 293, 294
   dump not made on transaction abend 171
   exec interface structure 291
   EXEC interface user structure 292
   execution key 291
   extended-format trace table 292
   following transaction abend 258
   formatting 279
      selectivity 279
   in problem determination 255
   interpretation 290
   job log for 291
   kernel stack entries 292
   last command identification 294
   last statement identification 294
   locating program data 295
   module index 293
   optional features list 292
   program information 292
   program storage 292
   PSW 291
   registers 291
   registers at last EXEC command 291
   remote abend indicator 291
   selective dumping of storage 257, 269
   statistics 269
   storage violation 193, 194
   suppression for individual transactions 173, 255
   symptom string 291
   system EXEC interface block 292
   task control area, system area 291
   task control area, user area 291
   TCTTE 292
   transaction abend control block 293, 294
   transaction dump code options 268
VSAM (continued)
   waits
      exclusive control deadlock 89, 90
      file state changes 85
      for exclusive control of control interval 89
      for VSAM transaction ID 88
      I/O 85
      record locking by CICS 91
      VSAM buffer unavailable 83
      VSAM string unavailable 86
VSAM READ SEQUENTIAL 90
VSAM READ UPDATE 90
VSAM WRITE DIRECT 90
VSAM WRITE SEQUENTIAL 90
VSMSTRNG resource name 111
VTAM
   process status 60
   session state with task 65

W
WAIT symptom keyword 7
waits
   alternate system 137
   CICS DB2 125
   DBCTL 126
   deadlock time-out interval 75
   definition 12, 49
   EDF 127
   enqueue 71
   FEPI 139
   file control 80
   IIOP 133
   interregion communication 66, 133
   intersystem communication 66, 133
   interval control 74
   investigating 49
   lock manager 94
   log manager 127
   maximum task conditions 96
   online investigation 50
      finding the resource 50
   program control 93
   stages in resolving 49
   storage manager 67
   suspension and resumption of tasks 109
   symptom keyword 7
   symptoms 12
   task control 129
   techniques for investigating 50
   temporary storage 69
   terminal 57
   transient data 133
      during initialization 134
      extrapartition 134
      I/O buffer contention 136
      I/O buffers all in use 136
      intrapartition 134
      VSAM I/O 136
      VSAM strings all in use 136
   using the formatted CICS system dump 50, 52
   using trace 50, 51
      setting the tracing options 51
   VTAM terminal control 131
WBALIAS resource type 121
WCC (write control character) 183
WEB_ECB resource type 121
working storage, COBOL programs 295
write control character (WCC) 183
WTO resource name 119

X
XDUREQ, dump domain global user exit 172
XLT (transaction list table) 107
XM_HELD resource type 96
XRF errors
   failure of CAVM 138
XRF takeover
   CEDA in a wait state 138
   CESN in a wait state 138
   wait on resource type XRPUTMSG 138
XRGETMSG resource type 121
XRPUTMSG resource type 121, 138

Z
ZC resource type 121
ZC_ZCGRP resource type 121
ZC_ZGCH resource type 121
ZC_ZGIN resource type 121
ZC_ZGRP resource name 119
ZC_ZGRP resource type 121, 122
ZC_ZGUB resource type 122
ZCIOWAIT 122
ZCZGET resource type 122
ZCZNAC resource type 122
ZGRPECB resource name 111
ZSLSECB resource name 121
ZXQOWAIT resource type 122
ZXSTWAIT resource type 122
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not give you any
license to these patents. You can send license inquiries, in writing, to:
For license inquiries regarding double-byte (DBCS) information, contact the IBM
Intellectual Property Department in your country or send inquiries, in writing, to:
The following paragraph does not apply in the United Kingdom or any other
country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A
PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions, therefore this statement may not apply to
you.
Licensees of this program who wish to have information about it for the purpose of
enabling: (i) the exchange of information between independently created programs
and other programs (including this one) and (ii) the mutual use of the information
which has been exchanged, should contact IBM United Kingdom Laboratories,
MP151, Hursley Park, Winchester, Hampshire, England, SO21 2JN. Such
information may be available, subject to appropriate terms and conditions, including
in some cases, payment of a fee.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.
Other product and service names might be trademarks of IBM or other companies.
We appreciate your comments about this publication. Please comment on specific errors or omissions, accuracy, organization, subject matter, or completeness of this book. The comments you send should pertain only to the information in this manual or product and to the way in which the information is presented.
For technical questions and information about products and prices, please contact your IBM branch office, your IBM
business partner, or your authorized remarketer.
When you send comments to IBM, you grant IBM a nonexclusive right to use or distribute your comments in any way it believes appropriate without incurring any obligation to you. IBM or any other organization will use the personal information that you supply only to contact you about the issues that you state on this form.
Spine information: CICS Transaction Server for z/OS Problem Determination Guide, Version 3 Release 2