0% found this document useful (0 votes)
32 views7 pages

Thread Monitor Yaml Config File

The document discusses the Thread Monitor tool which automatically collects thread dumps and javacores at configured intervals. It can be triggered by high thread usage and other events. The tool is configured via a YAML file and collects diagnostic information to help troubleshoot performance issues.

Uploaded by

raghunandhan.cv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
32 views7 pages

Thread Monitor Yaml Config File

The document discusses the Thread Monitor tool which automatically collects thread dumps and javacores at configured intervals. It can be triggered by high thread usage and other events. The tool is configured via a YAML file and collects diagnostic information to help troubleshoot performance issues.

Uploaded by

raghunandhan.cv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 7

Thread Monitor - Yaml configuration file

The Thead Monitor tool gathers thread dumps and Javacores at a configured interval, and during
events such as high WebContainer/Default_Executor pool thread usage. Yaml configuration file
must exist with name /SETUP/support/thread_monitor.yaml, or under the location specified with
the THREAD_MONITOR_CFG environment variable.

The Yaml file is loaded during startup, and scanned for changes every minute. If the Yaml file is
not readable during startup, no further attempts are made to load it during runtime.

By default, Thread Monitor is enabled.

Yaml file configurations:

The following is a list of various Yaml file configurations:

 enabled: main switch. Uses THREAD_MONITOR_ENABLED environment variable. Defaults to


false enabled: "${THREAD_MONITOR_ENABLED:-false}"
 format: threaddump or javacore. Threaddumps are less costly format: threaddump.
 directory: directory to which the threaddumps are written, and where archiving occurs.
 It needs to match the directory used with -Xdump that configures where Javacores are
written directory: "${DUMP_DIR}".
 minCollectionInterval: To ensure threaddumps/javacores are not collected too frequently.
Useful when triggers are enabled. Value is in seconds. It is not recommended to allow
intervals shorter than 15 seconds. Use 0 to disable. minCollectionInterval: 15

 archiving:
o delayedZipNum: when doing realtime debugging, it is useful to read dumps as

they get created.


o This setting delays zipping the most recent N dumps so they remain available to

review with 'vi'. delayedZipNum: 10


o zipSize: number of dumps to include per zip file zipSize: 30
o zipMaxAgeMins: deletes zips whose lastmodified time is older than the

configured time in minutes. 0 to disable zipMaxAgeMins: 2880


o zipMaxCount: only keep the last N zip files. 0 to disable zipMaxCount: 96

 threaddumps:
include: used to filter the threads that are saved to the threaddump file. Only thread
names that start with the specified strings will be logged include:

o WebContainer
o Default Executor
o main
o pool
o redisson

 triggers:
o interval: enables periodic collection of dumps. Value is in seconds. Interval:
30

o webContainer: triggers dumps based on WebContainer/Default_Executor thread

pool usage. The pool usage is evaluated when the monitor API triggers. When
using prometheus, the default scrape interval is 15 seconds.
o inUseLargerThan: trigger a dump when the number of active (not idle) threads is

larger than the configured value inUseLargerThan: 21


o format: use a format different than the default. When the number of threads

available in the thread pool is low, it can be preferred to collect Javacores instead
of Threaddumps as they have more information format: javacore.

 Thread monitoring tools


Javacores, and thread stacks in particular, are key debugging tools. Comparing stacks
across threads, and for different points in time, can help narrow down the cause of
performance slowdowns.

Thread monitoring tools


Javacores, and thread stacks in particular, are key debugging tools. Comparing stacks across
threads, and for different points in time, can help narrow down the cause of performance
slowdowns.

Timing is a major factor when diagnosing issues using Javacores and thread dumps. Thread
dumps must be collected at the exact time when the server is experiencing performance issues.
Some events are sporadic and short in duration, which makes the collection difficult. Often, by
the time a problem is detected and a manual dump is collected, the problem has passed.

Automatic thread dump collection

The Thread Monitor tool automates thread dump collection. It offers options for the format,
frequency and triggering of dumps. It integrates with the metrics framework to enable automatic
dump generation when certain thresholds, such as number of threads in use, are reached. With
this configuration, the Thread Monitor can automatically trigger the collection of dumps during
spikes or slowdowns.
The trigger, which is based on the number of threads in use, requires metrics to be enabled
in values.yaml.

The Thread Monitor tool is configured with the following YAML file which is available in the
Transaction Server, QueryApp, Solr Search and CRS containers. The configurations include
format (i.e. Thread dump or Javacore), archiving, and threshold triggers.

For details of each configuration option, see the YAML file:

/SETUP/support/thread_monitor.yaml

The Thread Monitor is enabled by default since HCL Commerce 9.1.9.

It is configured to collect dumps in "thread dump" format (a HCL custom format), when the
number of threads in use in the Default Executor/ Web Container pool exceeds a threshold value
of 20. It will collect a maximum of 10 dumps per hour. These configurations can be updated to
better match the scenario you are debugging.

Changes to the configuration in thread_monitor.yaml are automatically loaded at runtime. A


container restart is not required.

When a dump is collected due to the webContainer.inUseLargerThan threshold trigger, the


following warning will be printed to the logs:

[4/12/22 16:20:43:937 GMT] 00000124 WebContainerI W


com.hcl.commerce.monitor.thread.triggers.WebContainerInUseTrigger
evaluateWebContainerUsage( final int inUse ) WebContainer threads in use 21
is higher than threshold of 20. Generating threaddump...

The Thread Monitor can also be enabled and disabled from vault by configuring the environment
variable as follows:

${TENANT}/${ENVIRONMENT}/${ENVTYPE}/threadMonitorEnable/${component_name}:
value=true

ThreadAnalyzer utility

The Thread Monitor tool generates Javacores or thread dumps in the directory specified by
the DUMP_DIR environment variable. The ThreadAnalyzer utility can be used to facilitate the
analysis. This is particularly helpful when dozens or hundreds of dumps are created. Manual
review can be time consuming.

The tool is available in the containers that enable the Thread Monitor. It can be found at the
following path: /SETUP/support/thread-analyzer.jar.

The usage is as follows:

java -jar thread-analyzer.jar


Use -y when running inside a container
usage: java -jar thread-analyzer.jar [options]
-c,--config <arg> Optional path to a Yaml configuration file
-d,--directory <arg> Directory or file where the dumps are found. If a
file is specified, only that dump is analyzed. If
the file is a Zip, all the dumps within the zips
are parsed. If the location is a directory, all
dumps and zips within the directory will be parsed
(this does not include subfolders). When not set,
the tool attempts to determine the directory to
use automatically. If a directory is not
discovered, it defaults to the current directory.
-f,--follow If enabled, the tool will continue to monitor for
new threaddumps. This option is ignored if the
parameter is a file, not a directory. When -f is
set, -t is ignored.
-h,--help Prints this help.
-r,--report <arg> Comma separated list of reports. Defaults to 'Base
Web Activity'.
-s,--since <arg> Optional flag to limit the number of threadumps
Defaults to 3 hours. See 'time format'.
-t,--to <arg> Optional flag to limit the number of threadumps.
Defaults to now. See 'time format'.
-x,--extract-config Prints default config in stdout. This option is
used to customize the config, that can be then
specified with the -c option.
-y,--yes Required to acknowledge the utility can have an
impact when used within the container.

Time format: N[s=seconds, m=minutes, h=hours, d=days] OR yyyy-MM-dd HH:mm


OR epoch (e.g. 1650375222829) OR yyyyMMdd.HHmm[ss]

Available reports:
- Top Web Frames (tw): Reports the stacks most frequently found in threads
- Base Web Activity (ba): Summary of total base activities
- Web Activity (wa): Lists the discovered activity for each tread
Note: WARNING: When executed within the container, this utility takes CPU and memory
resources. The amount will vary depending on the number of dumps processed. For production
environments, it is recommended to use this utility outside the container. Using -Xmx to limit
memory usage is also highly recommended.

ThreadAnalyzer activity logic

The ThreadAnalyzer tool generates reports such as "Base Web Activity" and "Web Activity " that
rely on logic to "discover" key activites from a stack. The configuration is generally based on
applying RegEx patterns to stack frames, and to look for certain known classes and methods.

The following example attaches DB2 activity to a stack with frames that include the
package com.ibm.db2.jcc:
- activity: DB2
startsWith: com.ibm.db2.jcc
consumePrevious: true

The default configuration can be reviewed and extended.

 Use the -x argument to print the default configuration to the screen. The YAML output can
be redirected to a file for editing:

java -jar thread-analyzer.jar -x -y

 The customized configuration can be applied using the -c argument:

java -jar thread-analyzer.jar -c custom_config.yaml

ThreadAnalyzer reports

Base Web Activity report

The "Base Web Activity" report summarizes activity for each thread dump or Javacore. It
includes the time of the dump, the file name, the total number of non-idle threads, and the base
activities found. This report is useful to get a high level perspective of problem areas and to
identify timeframes, which typically show the most busy threads.

An example of the Base Web Activity report is as follows:


2021/10/26 13:10:12 - threaddump.20211026.131012.2913.0007.txt: 3 -
SocketRead[DB2]:3
2021/10/26 13:10:42 - threaddump.20211026.131042.2913.0009.txt: 10 -
SocketRead[DB2]:10
2021/10/26 13:11:12 - threaddump.20211026.131112.2913.0011.txt: 12 -
SocketRead[DB2]:7 CMD(RetrieveCatalogEntryTask):3 DB2[DBQuery]:1
CMD(CheckCatalogEntryEntitlementBySearch):1
2021/10/26 13:11:27 - threaddump.20211026.131127.2913.0012.txt: 15 -
CMD(RetrieveCatalogEntryTask):5 CMD(RetrieveContentTask):4 Web:4
CMD(InsertMoreMarketingContentAttachmentReferenceData):2
2021/10/26 13:11:42 - threaddump.20211026.131142.2913.0013.txt: 5 - Web:2
Commit:1 CMD(RetrieveContentTask):1 SocketRead[DB2]:1
2021/10/26 13:12:12 - threaddump.20211026.131212.2913.0015.txt: 2 -
DB2[DBQuery]:3
2021/10/26 13:13:27 - threaddump.20211026.131327.2913.0020.txt: 2 - Web:1
SocketRead[DB2]:1
2021/10/26 13:13:42 - threaddump.20211026.131342.2913.0021.txt: 1 -
SocketRead[DB2]:1
2021/10/26 13:14:12 - threaddump.20211026.131412.2913.0023.txt: 1 -
CMD(RetrieveCatalogEntryTask):1

Web Activity report


The "Web Activity" report shows details at the thread level for each dump found, including the
thread state and its activities.
2021/10/26 13:11:12 - threaddump.20211026.131112.2913.0011.txt:
WebContainer : 0 RUNNABLE [SocketRead[DB2], DBUpdate, Web]
WebContainer : 3 RUNNABLE [SocketRead[DB2], DBUpdate, Web]
WebContainer : 9 RUNNABLE [CMD(RetrieveCatalogEntryTask),
CMD(FilterCatalogEntryTask), Web]
WebContainer : 10 RUNNABLE [SocketRead[DB2], DBUpdate, Web]
WebContainer : 11 RUNNABLE [SocketRead[DB2], DBQuery,
CMD(RetrieveContentTask), CMD(FilterContentTask), Web]
WebContainer : 13 RUNNABLE [SocketRead[DB2], DBUpdate, Web]
WebContainer : 16 RUNNABLE [CMD(RetrieveCatalogEntryTask),
CMD(FilterCatalogEntryTask), Web]
WebContainer : 18 RUNNABLE [DB2[DBQuery],
CMD(RetrieveContentTask), CMD(FilterContentTask), Web]
WebContainer : 19 RUNNABLE [CMD(RetrieveCatalogEntryTask),
CMD(FilterCatalogEntryTask), Web]
WebContainer : 20 RUNNABLE [SocketRead[DB2], DBQuery,
CMD(RetrieveContentTask), CMD(FilterContentTask), Web]
WebContainer : 21 RUNNABLE [SocketRead[DB2], DBQuery,
CMD(RetrieveCatalogEntryTask), CMD(FilterCatalogEntryTask), Web]
WebContainer : 22 RUNNABLE
[CMD(CheckCatalogEntryEntitlementBySearch),
CMD(CatalogFilterGetContractUnitPrice), CMD(ComposePriceForCatalogEntry),
CMD(RetrieveCatalogEntryTask), CMD(FilterCatalogEntryTask), Web]

Top Web Frames report

The "Top Web Frames" report lists the most frequent stack frames for all the dumps found.

87 threaddumps, 98 total threads analyzed


278 (284%)
com.ibm.websphere.command.CacheableCommandImpl.execute(CacheableCommandImpl.java
:167)
244 (249%)
com.ibm.commerce.command.MeasuredCacheableCommandImpl.execute(MeasuredCacheableC
ommandImpl.java:69)
228 (233%)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.jav
a:55)
128 (131%)
com.ibm.commerce.foundation.client.facade.bod.AbstractBusinessObjectDocumentFaca
deClient.internalSendBusinessObjectDocument(AbstractBusinessObjectDocumentFacade
Client.java:818)
128 (131%)
com.ibm.commerce.foundation.internal.client.services.invocation.InvocationServic
e.invoke(InvocationService.java:113)
128 (131%)
com.ibm.commerce.foundation.internal.client.services.invocation.impl.NoInterface
EJBInvocationBindingImpl.invoke(NoInterfaceEJBInvocationBindingImpl.java:225)
128 (131%)
com.ibm.commerce.foundation.client.facade.bod.AbstractBusinessObjectDocumentFaca
deClient.sendBusinessObjectDocument(AbstractBusinessObjectDocumentFacadeClient.j
ava:529)
125 (128%)
com.ibm.commerce.foundation.server.command.bod.BusinessObjectCommandTargetImpl.e
xecuteCommand(BusinessObjectCommandTargetImpl.java:116)
125 (128%)
com.ibm.commerce.foundation.server.command.bod.BusinessObjectDocumentProcessor.p
rocessBusinessObjectDocument(BusinessObjectDocumentProcessor.java:276)
117 (119%)
com.ibm.commerce.foundation.internal.server.command.impl.CommandTarget.executeCo
mmand(CommandTarget.java:65)
109 (111%)
com.ibm.commerce.foundation.server.command.bod.AbstractGetBusinessObjectDocument
CmdImpl.performExecute(AbstractGetBusinessObjectDocumentCmdImpl.java:158)
96 (98%)
com.ibm.commerce.marketing.commands.marketingspot.MarketingSpotCommandBaseTaskCm
dImpl.execute(MarketingSpotCommandBaseTaskCmdImpl.java:387)

Percentages in excess of 100% mean the stack was found multiple times on a single thread.

The report allows for configurations (using --extract-config and --config) as follows:

report:
...
topWebFrames:
minPercentage: 5
excluded:
- com.ibm.ws.
- org.springframework.
- javax.
- java.
- org.apache.wink.
- com.ibm.io.

You might also like