Impact2012 - DataPower Troubleshooting PDF
Agenda
Introduction
Must-Gather & Error Reports
Packet Captures
Status Providers
Out-of-Memory (OoM)
Large Debug Logs
Advanced Techniques
Q&A
Summary
Introduction
As a closed system, the primary responsibility for DataPower problem determination lies with the
IBM support team.
Historically, the DataPower architecture has included little support for client self-diagnosis
and repair.
building our software to provide the data and information required to resolve problems when they occur: FFDC (First Failure Data Capture)
defining tools, best practices, and standards that allow the efficient analysis of problems within a product
or solution by the system, customer, or IBM Support
analyzing problems to continually modify our processes and procedures to improve software quality and
prevent problems from occurring.
Must Gather
http://www-01.ibm.com/support/docview.wss?uid=swg21515489
Error reports contain most status providers
Some cannot fit into the error report due to size or time constraints
Error report content is continually being updated & improved
Reports can be useful even some time after the event
A status snapshot taken after the fact should be augmented by historical trends
from before the event
What Else?
All log files:
Additional logs not put into the error report will be in 'logtemp'
At the top level and for the specific domain(s) involved, unless there are too many domains
It helps to build automated scripts in advance to get files via the CLI or SOMA
Log files can be under 'logstore' if using the log to RAID option
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <dp:request xmlns:dp="http://www.datapower.com/schemas/management">
      <dp:get-file name="logtemp:default-log"/>
    </dp:request>
  </env:Body>
</env:Envelope>
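The request above can be scripted ahead of time. The following is a minimal Python sketch, not a definitive implementation. It assumes the XML management interface is enabled on port 5550 at /service/mgmt/current, that a domain attribute on dp:request selects the domain, and that the response carries the file base64-encoded in a dp:file element. The host, credentials, and helper names are hypothetical.

```python
import base64
import ssl
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
DP_NS = "http://www.datapower.com/schemas/management"


def build_get_file_request(name, domain="default"):
    """Return the SOMA get-file envelope as bytes."""
    ET.register_namespace("env", SOAP_NS)
    ET.register_namespace("dp", DP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The domain attribute is an assumption; the slide's request omits it.
    request = ET.SubElement(body, f"{{{DP_NS}}}request", {"domain": domain})
    ET.SubElement(request, f"{{{DP_NS}}}get-file", {"name": name})
    return ET.tostring(envelope, encoding="utf-8")


def extract_file(response_xml):
    """Decode the base64 payload of the first dp:file element (assumed response shape)."""
    root = ET.fromstring(response_xml)
    node = root.find(f".//{{{DP_NS}}}file")
    if node is None or node.text is None:
        raise ValueError("no dp:file element in response")
    return base64.b64decode(node.text)


def fetch_file(host, user, password, name, domain="default", port=5550):
    """POST the request to the (assumed) management endpoint and return the file bytes."""
    url = f"https://{host}:{port}/service/mgmt/current"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=build_get_file_request(name, domain),
        headers={"Authorization": f"Basic {token}", "Content-Type": "text/xml"},
    )
    # Device certificates are often self-signed; a custom CA bundle may be needed.
    context = ssl.create_default_context()
    with urllib.request.urlopen(req, context=context, timeout=30) as resp:
        return extract_file(resp.read())
```

A script like this could loop over the log files in 'logtemp' for each domain of interest, so the collection step is ready before a problem occurs.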
Status Provider
These provide information about the system
E.g. filesystem, environment sensor, domain status.
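Memory statistics
'show load' and 'show memory'
[Screenshot of 'show load' and 'show memory' output. Only fragments survive in the extracted text: the labels Load, Memory and File Count, and the entries wtx (20) and ssh (29). The remaining values are not recoverable.]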
Usage log:
20120314T102536Z [slm][debug] throttle(Throttler): tid(943):
Memory(3570524/4098982kB 87.107579 free) Pool(1041874)
Ports(31756/31850) Temporary-FS(224/242MB 92.561983 free) File(OK)
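To trend these values over time, the throttle line above can be parsed into numbers. This is a minimal Python sketch. It assumes each pair is free/total, which the logged memory and temporary filesystem percentages support but which is only a guess for Ports. The field names in the returned dictionary are hypothetical.

```python
import re

# Matches the resource portion of a throttle status message, even if wrapped.
THROTTLE_RE = re.compile(
    r"Memory\((?P<mem_free>\d+)/(?P<mem_total>\d+)kB\s+(?P<mem_pct>[\d.]+) free\)\s+"
    r"Pool\((?P<pool>\d+)\)\s+"
    r"Ports\((?P<ports_free>\d+)/(?P<ports_total>\d+)\)\s+"
    r"Temporary-FS\((?P<tmp_free>\d+)/(?P<tmp_total>\d+)MB\s+(?P<tmp_pct>[\d.]+) free\)\s+"
    r"File\((?P<file>\w+)\)"
)


def parse_throttle_line(text):
    """Return a dict of throttle values, or None if the text does not match."""
    match = THROTTLE_RE.search(text)
    if match is None:
        return None
    values = {}
    for key, raw in match.groupdict().items():
        if key == "file":
            values[key] = raw               # status word such as OK
        elif key.endswith("_pct"):
            values[key] = float(raw)        # percentage as logged
        else:
            values[key] = int(raw)          # raw counters
    return values


sample = (
    "20120314T102536Z [slm][debug] throttle(Throttler): tid(943): "
    "Memory(3570524/4098982kB 87.107579 free) Pool(1041874) "
    "Ports(31756/31850) Temporary-FS(224/242MB 92.561983 free) File(OK)"
)
parsed = parse_throttle_line(sample)
```

Recomputing 100 * mem_free / mem_total from the parsed values reproduces the logged percentage, which is a quick check that the fields were read in the right order.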
Discrepancies
What to look for in the status providers:
'show tcp'; 'show connections' & 'show handles'
All give slightly different results, but they roughly map one-to-one
If one is out of range by an order of magnitude, that could indicate an issue
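The order-of-magnitude check can be automated once the counts have been collected. The sketch below is illustrative only: how the counts are obtained from each status provider is not shown, the function name is hypothetical, and the factor of 10 simply encodes "an order of magnitude".

```python
from statistics import median


def flag_discrepancies(counts, factor=10.0):
    """Return the providers whose count is at least `factor` away from the median."""
    middle = max(median(counts.values()), 1)   # avoid division by zero
    flagged = []
    for name, value in counts.items():
        ratio = max(value, 1) / middle
        if ratio >= factor or ratio <= 1.0 / factor:
            flagged.append(name)
    return flagged


# Hypothetical counts taken from the three status providers at the same moment.
observed = {"show tcp": 1200, "show connections": 1150, "show handles": 15000}
suspects = flag_discrepancies(observed)
```

Comparing against the median rather than the mean keeps a single runaway value from hiding itself by inflating the reference point.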
Packet capture
Why packet captures?
Packet Captures
IP address
Port
MAC address
and many other qualifiers
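The qualifiers listed above can be combined into one capture filter. The sketch below builds an expression in the pcap filter syntax used by tcpdump (host, port, ether host). Whether a given firmware level accepts exactly this expression is an assumption, and the function name is hypothetical.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def build_capture_filter(ip=None, port=None, mac=None):
    """Combine the given qualifiers into a pcap-style filter expression."""
    parts = []
    if ip is not None:
        ipaddress.ip_address(ip)            # raises ValueError on a bad address
        parts.append(f"host {ip}")
    if port is not None:
        if not 0 < int(port) < 65536:
            raise ValueError("port out of range")
        parts.append(f"port {int(port)}")
    if mac is not None:
        if not MAC_RE.match(mac):
            raise ValueError("malformed MAC address")
        parts.append(f"ether host {mac}")
    return " and ".join(parts)


# Hypothetical backend address and port.
expression = build_capture_filter(ip="10.0.0.5", port=443)
```

Narrowing the capture to one host and port keeps the capture file small, which matters on a busy device.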
Service Probe
The multistep probe shows the payload as it moves through the
processing policy; it is not meant to show what is on the wire
OoM
DataPower does not have virtual memory
Pro: performance
Con: 4GB is shared by all domains & transactions
Spikes:
An increase in traffic arriving at the device
An increase in delay at backends or in sidecalls
Can be detected if Throttle status log option is enabled
Memory
File handles/sockets/file descriptors
Ports (slightly different from sockets)
Inodes (very rare)
Memory logs
Each log message captures a snapshot at that time
Not cumulative; can go up & down
Not exhaustive; some actions or protocols can allocate memory outside what these logs track
Leak reports
Always best to have a baseline
Tracing must always be on!
Active transactions can cause noise in the data capture; best to turn off
traffic if at all possible
Scalability concerns
SLM
Shaping can be used to smooth traffic
Should not be used to hide a broken backend
Plan on shaping for a few seconds; not minutes
Audit log
Polling for uptime is the best practice for monitoring restarts,
except when the 32-bit counter wraps
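A poller can tell a counter wrap from a real restart by checking whether the previous value plus the elapsed time would have overflowed 32 bits. The sketch below assumes the uptime is reported in hundredths of a second, as SNMP sysUpTime is. The source does not say how uptime is polled, so the unit, the slack window, and the function name are all assumptions.

```python
COUNTER_MODULUS = 2 ** 32


def classify_uptime_sample(prev_ticks, curr_ticks, elapsed_seconds,
                           ticks_per_second=100, slack_seconds=60):
    """Classify a poll as 'running', 'wrapped' or 'restarted'."""
    if curr_ticks >= prev_ticks:
        # A restart between widely spaced polls can still land here undetected.
        return "running"
    expected = prev_ticks + int(elapsed_seconds * ticks_per_second)
    if expected >= COUNTER_MODULUS:
        # The counter should have wrapped; accept it if the value is close
        # to where an uninterrupted counter would be after wrapping.
        expected_wrapped = expected % COUNTER_MODULUS
        if abs(curr_ticks - expected_wrapped) <= slack_seconds * ticks_per_second:
            return "wrapped"
    return "restarted"


# Hypothetical polls taken 60 seconds apart.
near_wrap = classify_uptime_sample(2 ** 32 - 3000, 3000, 60)
after_reboot = classify_uptime_sample(5_000_000, 2000, 60)
```

A restart that happens within the slack window of the wrap point remains ambiguous with this approach, so the audit log is still the place to confirm it.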
RBM (optional)
webGUI (optional)
type file
format text
timestamp numeric
archive rotate
Dropped messages are also reported in the log file: "Buffer Overflow: X event(s) lost"
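The number of dropped events can be totalled by scanning a log file for that message. In this Python sketch only the phrase "Buffer Overflow: X event(s) lost" comes from the slide; the surrounding log line layout in the sample is hypothetical.

```python
import re

LOST_RE = re.compile(r"Buffer Overflow: (\d+) event\(s\) lost")


def count_lost_events(lines):
    """Sum the event counts from every buffer overflow message in the lines."""
    total = 0
    for line in lines:
        for match in LOST_RE.finditer(line):
            total += int(match.group(1))
    return total


# Hypothetical log excerpt; real lines carry their own timestamps and categories.
sample_log = [
    "20120314T102536Z [system][error] Buffer Overflow: 12 event(s) lost",
    "20120314T102537Z [mgmt][info] configuration saved",
    "20120314T102540Z [system][error] Buffer Overflow: 3 event(s) lost",
]
lost = count_lost_events(sample_log)
```

A non-zero total means the log is incomplete for that period, which is worth knowing before drawing conclusions from it.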
Always set a static route to the syslog servers to force outbound traffic
over the correct interface
Adding a syslog log target is a lightweight addition to a busy box
Note: UDP syslog may truncate some longer messages
Support Resources
developerWorks articles
WebCasts
Forum: https://www.ibm.com/developerworks/forums/forum.jspa?forumID=1198
User Groups: http://www.websphere.org/websphere/Site?page=ugdetail&groupId=165
Feedback?
Comments?