Troubleshoot Section
AI Operations Management - Containerized
Version: 24.4
Table of Contents
1. Troubleshoot
1.2.4. Fluent bit logs for postload-taskcontroller has many errors due to insufficient buffer size
1.2.5. Troubleshoot guided install
1.2.7. Collector customization settings were lost after installing 2021.11 Patch 1
1.2.9. itom-monitoring-service-data-broker remains in Init mode when OBM capability is not selected
1.2.10. AMC configuration files aren't deployed after installation
1.2.13. Error: Target group name 'xxx-tg-TLS-5443' cannot be longer than '32' characters
1.2.14. itom-monitoring-admin pod not starting after installation
1.2.18. Failed to parse values.yaml
1.2.29. CAS content zip optic content schema installation gets stuck
1.3.4. Automatic upgrade of OBM Content Pack and UCMDB view fails while
1.3.8. gen_secret.sh failing due to cert issue during rerun on the same environment
1.3.9. Pre-upgrade hooks failed
1.3.11. OPTIC DL Vertica Plugin fails with an error during the upgrade
1.3.12. "UPGRADE FAILED" error occurs after updating certificates from OMT Management
1.3.13. Events sent from OBM are not stored in the opr_event Vertica table
1.3.15. Pulsar Push adapter doesn't work after upgrade
1.5.2. User does not have required permissions to modify OOTB content
1.5.3. Agent System Metric Push does not have a Store and Forward capability
1.8.3. Docker pull doesn't work: Error while pulling image
1.12.2. Sync Issue between APM and MCC Application Monitoring
1.13.4. Classic OBM UI remains inaccessible after logging into AI Operations Management
1.13.5. Event browser connection error
1.13.9. OBM UI fails to load the page due to lightweight single sign-on issue
1.13.16.1. Cannot transfer Data Flow Probe from one domain to another
1.13.16.3. BSM server and the Probe connection fails due to an HTTP exception
1.13.16.5. Data Flow Probe node name cannot be resolved to its IP address
1.13.16.8. Integration Probe not listed in Data Flow Probe Setup module tree
1.13.16.9.1. Unable to find the Data Flow Probe database scripts
1.13.17. Downtime notifications are not sent to DES unless OBM processes are restarted
1.13.18. File format and extension pop-up appear in the graph_type excel
1.13.22. Adding or deleting dashboard or favorite on one GW is not visible on other GWs
1.13.23. PD does not find any entry point to forward data to BVD
1.13.32. Creating a BVD Connected Server using the CLI doesn't work
1.13.33. Service Health data flow from OBM to RAW tables of OPTIC DL
1.13.34. Operations Agent Health dashboard is not displayed when selecting an Operations Agent CI
1.13.35. PD fails graphing metrics from OPTIC DL as content packs having PD artifacts from OPTIC DL fail to import
1.13.36. OMi Server self-monitoring content pack shows errors and unresolved content
1.13.37. OutOfMemoryError: GC overhead limit exceeded error
1.15.3. Localization isn't working while exporting the report to PDF
1.15.5. BVD CLI and Web to PDF CLI exit with error
1.15.7. Requesting PDF of a UIF page without specifying the file name throws an error
1.15.8. BVD pod is in CrashedLoopState as migration table is locked
1.15.10. BVD does not show the correct license information
1.15.14. Schedule jobs are deleted if the schedules are failed
1.15.15. Name of the report and exported csv file names are different during cross launch
1.15.16. Link provided in the mail along with scheduled report is not loading complete report
1.15.17. Popup notifications are not shown even though they are enabled
1.15.18. Number of bytes received from Vertica exceeded the configured maximum
1.15.19. Certain passwords provided during application configuration for Vertica do not work with BVD
1.15.20. No data in RUM BVD dashboards
1.15.24. BVD reports failed to load with a red banner without any error
1.15.26. BVD pods failing with error WRONGPASS invalid username-password pair
1.15.36. Processing request from server: Request failed with status code 500
1.15.37. Processing request from server: Request failed with status code 404
1.15.40. WebtoPDF not generating PDF when port number is not specified in the URL
1.15.41. WebtoPDF not generating PDF
1.15.42. Mail command failed: 501 Invalid MAIL FROM address provided
1.16.1.1. Metric data does not reach the Vertica database
1.16.1.6. Data is in OPTIC Data Lake Message Bus topic but not present in Vertica tables
1.16.1.7. Unable to create the same dataset again
1.16.1.8. Data sent to the OPTIC DL HTTP Receiver not available in Vertica database
1.16.1.9. Postload task flow not running
1.16.1.11. Single message pushed to a topic is not streaming into database
1.16.2.1. Certificate with the alias 'CA on abc.net' is already installed
1.16.2.3. ERROR: Unavailable: initiator locks for query - Locking failure: Timed out I locking
1.16.2.4. Can't forward events from OBM to OPTIC DL
1.16.2.6. Error while publishing data to the OPTIC DL Message Bus topics
1.16.2.7. After upgrade, the Vertica itom_di_metadata_ TABLE is not updated
1.16.3.5. Table does not get deleted after dataset is deleted
1.16.4.2. How to recover itomdipulsar-bookkeeper pods from read-only mode
1.16.4.6. Postload pods do not start and are stuck in 1/2 status
1.16.5.1. Guidelines for adding panels to the OPTIC Data Lake Health Insights dashboard
1.16.5.2. Vertica Streaming Loader dashboard panels have no data loaded
1.16.5.3. The DP worker memory usage meter displays increasing memory usage
1.16.5.4. Data Flow Overview dashboard displays some topics with message batch
1.16.5.5. Data not found in Vertica and Scheduler batch message count is zero
1.16.5.7. Postload Detail dashboard Taskflow drop-down does not list the configured task flows
1.16.5.8. Postload Overview dashboard
1.16.5.9. Request error rate in Data Flow Overview dashboard is greater than zero
1.16.5.10. Request error rate in Data Flow Overview dashboard is increasing over time
1.16.5.11. The dashboard loads slowly or the page is unresponsive
1.16.5.12. The Receiver dashboard, Average Message Outgoing Rate panel displays zero
1.16.5.13. The Receiver dashboard, Avg incoming requests rate - error (All) panel is greater than zero req/sec
1.16.5.14. The Receiver dashboard, Receiver running panel shows less than 100%
1.16.6. How to's
1.16.6.2. How to check if OPTIC DL Message Bus topics and data is created
1.16.6.3. How to check the OPTIC DL Message Bus pod communication
1.16.6.4. How to recover OPTIC DL Message Bus from a worker node failure
1.16.6.5. How to check connectivity between Vertica node and OPTIC DL Message Bus Proxy services
1.16.6.6. How to verify the OPTIC DL Vertica Plugin version after reinstall
1.17.2. Dashboard not found error on trying to redirect monitoring Service Overview dashboard
1.17.3. UCMDB views for Hyperscale Observability aren't available in Performance Dashboard
1.17.4. Performance Dashboard displays graphs with no data
1.17.5. Discovery is failing and not discovering any of the components for multi probe domain
1.17.6. Hyperscale Observability events not forwarding to OPTIC Data Lake
1.17.8.8. Multiple CIs with same name in uCMDB and PD Views
1.17.8.14. A few widgets in the Performance Dashboards don't have data
1.17.9.1. error loading graphQL mapping file for service: datafactoryservices
1.17.9.2. The resource type could not be found in the namespace
1.17.10.1. Events for Kubernetes infrastructure objects aren't displaying in OBM Event Browser
1.17.10.2. Kubernetes collector triggers false events with major severity
1.17.10.4. Kubernetes Summary page displays undefined value in MYSQL innodb graph
1.17.10.5. Kubernetes Summary page displays wrong data in the Total
1.17.11.2. Failed to activate zone. Error: [Action: activate zone, Resource: , Status Code: 500, Request Status: Server internal error
1.17.11.3. The ops-monitoring-ctl update command displays an error
1.18.6. Automatic Event Correlation Explained UI does not load the translation resources
1.18.7. Automatic Event Correlation Explained UI does not show complete data
1.18.8. Automatic Event Correlation Explained UI cannot be launched from OBM
1.18.12. AEC pods display error due to insufficient resources in Vertica's general resource pool
1.18.13. itom-analytics-opsbridge-notification pod fails with OOMKilled error
1.19.1. OBM agent proxy selection during Edge installation on K3s installs additional common components
1.20. Troubleshoot Reports
1.20.6. Business Process Monitoring reports are showing partial data or no data
1.20.8. System Infrastructure reports are showing no or partial data or updated data is not shown in the reports
1.20.9. System Infrastructure report widget displays partial data for metrics collected by Operations Agent
1.20.10. Troubleshoot System Infrastructure Reports collection issues with Agent Metric Collector
1.20.11. Troubleshoot System Infrastructure Reports collection issues with Metric Streaming policies
1.20.13. System Infrastructure Availability data is missing in reports
1.20.14. Event reports are showing no or partial data or updated data is not shown in the reports
1.20.15. System Infrastructure report widget displays partial data for metrics collected by SiteScope
1.20.16. Troubleshoot System Infrastructure Reports collection issues with SiteScope as collector
1.20.17. Task flows aren't listed on the OPTIC DL Health Insights dashboard
1.20.18. Aggregate tables are not updated, data in the system infrastructure or event reports aren't refreshed
1.20.19. Insufficient resources to execute plan on pool itom_di_postload_respool_provider_default
1.20.20. tenant_id is not configured - SiteScope
1.20.23. Agent Metric Collector is unable to collect metrics from the Operations Agents on the worker nodes
1.20.24. The content upload fails or if the tables in mf_shared_provider_default schema are not populated completely
1.20.25. From OPTIC Data Administration: Could not complete request successfully
1.20.26. SysInfra file system or node instanceType doesn't display the targets
1.20.32. Issue with Data Enrichment Service with Classic OBM Integration
1.22.1. The process can't access the file because it's being used by another process
1.22.2. Configuration not found in the configuration sheet
1.22.3. Field name has invalid input at row number in the sheet
1.22.4. [ERROR] : Column name has invalid input at row number in the configuration sheet
1.22.5. Type mismatch exception "Can't get a STRING value from a NUMERIC cell"
1.22.6. CMI tool fails to generate Excel files when ran directly on an OA machine
1.23.2. Data forwarding issues from classic OBM to OPTIC Data Lake
1.23.5. SiteScope (non-TLS) - OBM (TLS) integration fails while configuring the
1. Troubleshoot
You can use the information in this section to troubleshoot problems that you may encounter when installing and using AI Operations Management and the ITOM Container Deployment Foundation.
Related topics
To see the list of known issues, see Known issues .
To see the list of known issues, see Release Notes.
For more information on how to manage logs, see Logs.
To contact support, see Contact Support.
For more information on Troubleshooting Toolkit, see Deployment Toolkit.
For more information on OMT troubleshooting, see Troubleshoot OMT.
Application logs
Installation
/opt/kubernetes/install-.log
NFS share
Example: /var/vols/itom/log-volume/opsbridge-w87mk/opsbridge-w87mk__omi-0__omi__example.hostname.net__obm
<NFS_obm_directory>/omi/opt/HP/BSM/log/topaz_all.log
<NFS_obm_directory>/omi/opt/HP/BSM/log/jboss7_boot.log
<NFS_obm_directory>/omi/opt/HP/BSM/log/supervisor/nanny_all.log
Login
<NFS_obm_directory>/omi/opt/HP/BSM/log/jboss/login.log
OMT logs
OPTIC Management Toolkit (OMT) uses Fluent bit to collect and gather logs for OMT system components, containers, and
Kubernetes. For more information, see the OMT documentation.
Available log types are access, audit, bvd, and error. The logs follow the naming schema <log type>-<pod name>.log
The access logs contain information about all requests processed by the server (for example HTML files or graphics).
This data can then be statistically analyzed and summarized by another program.
The audit logs contain information about successful and failed user logins.
The logs contain debug information for more detailed troubleshooting.
You can enable additional logging for the receiver, controller, and web server. To enable additional logging, complete the
following steps:
Enabling access logging for the receiver will also log the API keys. This might impact the security of the application as every
user with access to the log files will be able to see the API keys.
https://<external_access_host>:5443
<external_access_host> is the fully qualified domain name of the host which you specified as EXTERNAL_ACCESS_HOST in
the install.properties file during the OPTIC Management Toolkit (OMT) installation. Usually, this is the master node's
FQDN.
2. Click on Launch Dashboard (opens a new browser page), select <application namespace> and Deployments.
bvd-redis : In memory database for statistics and session data. Message bus for server process communication
4. In the new window, search for "debug" to find the DEBUG entry.
6. Click UPDATE.
If you install the application with the internal PostgreSQL, you can also view the Redis and PostgreSQL logs using kubectl :
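The kubectl commands were not captured in this extract. A minimal sketch, assuming the pod names used elsewhere in this guide (adjust the namespace and pod names to your deployment):
# Find the Redis and PostgreSQL pod names
kubectl get pods -n <application namespace> | grep -E 'redis|postgres'
# View their logs
kubectl logs <bvd-redis pod name> -n <application namespace>
kubectl logs <postgresql pod name> -n <application namespace>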
Note
After the installation of the application, some errors are expected; the application processes and PostgreSQL log these errors. They should stop after a few minutes, once the application is up and running correctly.
Cause
This occurs when there is a connection error between the Message Bus and Vertica. Following is the error recorded in the
scheduler pod NFS logs:
Failed creating reader to topic - requesting restart: ConnectError | Zero message count found in topic!
Solution
Perform the following steps as a workaround to resolve this error:
1. Run the following command to add the existing application suite chart values to the helmvalues_opsb.yaml file (see the example commands after these steps):
2. Update the following parameters in the helmvalues_opsb.yaml file and set the values as shown below:
pulsar: pulsar.itomqapri.saqa-aws.cloud
externalDNS:
enabled: true
4. Verify the following new entries created for the Message Bus in AWS Route53 service:
A-record
TXT-record
5. Ensure that there are no issues in the data flow and the Message Bus to Vertica connection is restored.
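The commands for steps 1 and 3 were not captured in this extract. A hedged sketch, mirroring the helm commands used in other procedures in this guide (deployment name, namespace, and chart path are placeholders):
# Step 1: export the current chart values to helmvalues_opsb.yaml
helm get values <helm deployment name> -n <application namespace> > helmvalues_opsb.yaml
# Step 3 (assumed): apply the edited values with a helm upgrade
helm upgrade <helm deployment name> <opsbridge-suite chart .tgz> -n <application namespace> -f helmvalues_opsb.yaml --reuse-values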
Error
psql: error: connection to server at "port 5432 failed: FATAL: database "dbadmin" does not exist"
Issue
Automatically create required databases functionality fails.
Solution
Create a database with the same name as the database administrator user name.
For example, if your database administrator user name is dbadmin, then run the following command to create a database
with the same name:
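The command itself was not captured in this extract; a minimal sketch using standard PostgreSQL tooling (the host and the postgres maintenance database are assumptions):
psql -h <db_host> -p 5432 -U dbadmin -d postgres -c 'CREATE DATABASE dbadmin;'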
Then use the same user dbadmin while deploying using the Automatically create required databases feature on AppHub.
Cause
When you set the deployment size to Demo and you have a UCMDB probe integrated in your environment, the UCMDB probe might restart.
Solution
Perform the following steps to remediate the issue:
helm get values <helm deployment name> -n <application namespace> > <VALUES_FILE_NAME>
Example
ucmdbprobe.size Small
ucmdbprobe.deployment.maxRemoteProcesses 2
Example
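The example content was lost in this extract. A hedged sketch of applying the parameters above with helm upgrade; whether they are passed with --set or through the exported values file is an assumption:
helm upgrade <helm deployment name> <opsbridge-suite chart .tgz> -n <application namespace> --reuse-values \
  --set ucmdbprobe.size=Small \
  --set ucmdbprobe.deployment.maxRemoteProcesses=2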
Error
The console shows error messages that the lines in taskcontroller.log are too long.
For example: taskcontroller.log requires a larger buffer size, lines are too long. Skipping file.
Cause
Insufficient buffer size for taskcontroller.log
Solution
The workaround for this issue is to update the fluent bit configmap created for OPTIC Data Lake Postload.
2. Search for " di-postload-input.conf " key in the configmap and add the following lines at the end of the INPUT section
for taskcontroller.log as shown in the examples.
Buffer_Chunk_Size 64KB
Buffer_Max_Size 128KB
Important
di-postload-input.conf: |-
[INPUT]
Name tail
Tag odlpostload.*
Path /fluentbit/deployment-log/nom/nom*postload-taskcontroller*/taskcontroller.log
Multiline On
Parser_FirstLine odl-postload-tc
Buffer_Chunk_Size 64KB
Buffer_Max_Size 128KB
di-postload-input.conf: |-
[INPUT]
Name tail
Tag odlpostload.*
Path /fluentbit/deployment-log/opsb/opsb*postload-taskcontroller*/taskcontroller.log
Multiline On
Parser_FirstLine odl-postload-tc
Buffer_Chunk_Size 64KB
Buffer_Max_Size 128KB
Solution
1. Check the guided_Install_dd-mm-yy-h:min:sec.log file. For example, guided_Install_18-04-23-08:15:03.log file.
2. If it fails stating disk is already mounted or the disk already has a partition created, add a new disk that's not mounted
or partitioned.
3. Refer to Configure separate disk devices for each LPV on worker nodes and complete the prerequisite.
4. If it fails for any other reason follow the steps mentioned in Uninstall local storage provisioner.
Solution 1
Check the firewall status. If you have disabled the firewall on the master node and enabled it on the Vertica nodes, disable the firewall on all Vertica nodes.
Solution 2
1. Check the guided_Install_dd-mm-yy-h:min:sec.log file. For example, guided_Install_18-04-23-08:15:03.log file.
2. If the error is due to Vertica failure, run the following commands on all the Vertica nodes to clean up:
3. If you have used the certificates created by the guided install script, on the Vertica server, clean up the certificates and
key in the following directories:
1. < directory where you have downloaded and unzipped the files>/resources/issuecert directory
2. /tmp on all the vertica nodes
If you have used existing certificates, on all the Vertica servers, clean up the certificates and key in the /tmp directory.
ERROR: Vertica processes still running. ERROR: You must stop them prior to uninstall.
Solution
Perform the following steps to resolve this issue:
1. Stop the Vertica processes that are running on all the nodes (see the example after these steps).
2. Run the following command on all Vertica nodes:
rpm -e vertica-<version>
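A hedged example of step 1, stopping the database with the Vertica admintools utility (the database name and the dbadmin account are assumptions):
su - dbadmin -c "/opt/vertica/bin/admintools -t stop_db -d <database_name>"
# Verify that no Vertica processes remain before running rpm -e
ps -ef | grep -i vertica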
Solution
Check the guided_Install_dd-mm-yy-h:min:sec.log file. For example, guided_Install_18-04-23-08:15:03.log file. If the cause of the
issue is an existing /home/dbadmin/.itomdipulsarudx , then follow these steps:
1. Remove the /home/dbadmin/.itomdipulsarudx file from all the Vertica nodes and follow the instructions at Install OPTIC
Data Lake Vertica Plugin.
2. Rerun the guided install script.
If there is any other issue with the plugin, see Install the Vertica OPTIC DL plugin, Create the variables file, and Configure the
Vertica database and Enable TLS to install and configure manually.
< directory where you have downloaded and unzipped the files>/resources/issuecert directory
/tmp
If you have used existing certificates, on the PostgreSQL server, clean up the certificates and key in the /tmp directory.
1. Uninstall OMT.
2. Edit your least-input.properties or custom-input.properties file depending on what you have used as shown in the example
below:
completedModule: "Master,Worker,NFS,relationalDatabase,Vertica,OMTDatabase,OMTPreRequisite"
If the script fails at the OMTDeployment stage, perform the cleanup steps, then remove both the OMTDatabase and OMTPreRequisite modules from the completedModule section and add them to the deploy section.
deploy: "OMTDatabase,OMTPreRequisite,OMTDeployment,LPVDeployment,ApplicationPrerequisite"
Error
Error: UPGRADE FAILED: create: failed to create: Secret "sh.helm.release.v1.opsb.v5" is invalid: data: Too long:
must have at most 1048576 bytes
Issue
The idl_config.sh script fails due to the Helm release secret size limit.
Cause
The opsb-background.png file under the images directory takes up too much space (almost 217 KB), which pushes the Helm release secret over the size limit.
Solution
1. Extract the application chart
tar -xzf opsbridge-suite-<version>.tgz
2. Go to /opsbridge-suite/_bosun/images/
cd /opsbridge-suite/_bosun/images/
3. Delete opsb-background.png
rm opsb-background.png
4. Continue with running the idl_config.sh script using the following command:
./idl_config.sh -cacert <absolute path of obm certificate> -chart <absolute path of opsbridge application chart tgz> -namespace <application namespace> -release <deployment name>
Example:
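The example was not captured in this extract. A hypothetical invocation with illustrative values (certificate path, chart path, namespace, and release name are assumptions):
./idl_config.sh -cacert /tmp/obm-ca.crt -chart /opt/charts/opsbridge-suite-24.4.tgz -namespace opsbridge -release opsb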
Cause
You may lose OOTB collector configurations after the 2021.11 Patch 1 installation if autoStartAgentMetricCollector is set to true.
Solution
If you have changed the OOTB configuration, make sure that AMC isn't going to autostart after the patch installation. Set autoStartAgentMetricCollector to false in the values.yaml file before the patch installation. Follow the installation instructions given in the patch documentation.
Tip
Don't customize OOTB collector configurations; instead, clone and disable the OOTB versions before the installation.
Cause
This issue occurs due to the incorrect structure of the anomaly parameters in the values.yaml file. The itom-oba-config chart (anomalyDetection capability) doesn't read the parameters from the suite-provided values.yaml file.
Solution
1. Run the following command to get the values.yaml file.
helm get values <helm deployment name> -n <suite namespace> > /tmp/values.yaml
2. Rearrange the configuration parameters so that they are located under itom-oba-config instead of anomalyDetection, as follows:
itom-oba-config:
deployment:
oba:
protocol: https
host: OBA Application Server host
configParameterServicePort: 9090
3. Run the following command to apply the changes:
helm upgrade <helm deployment name> -n <suite namespace> -f <values.yaml> <chart> --reuse-values
1.2.9. itom-monitoring-service-data-broker remains in Init mode when OBM capability is not selected
Cause
When you have deployed OPTIC Reporting or HyperScale Observability through AppHub without selecting external OBM or
OBM capability, itom-monitoring-service-data-broker remains in Init status.
Solution
In AppHub, edit the same deployment to enable OBM capability and redeploy.
If you want to use Classic OBM, do the following and redeploy:
For OPTIC Reporting > Enable Agent Metric Collector > enable Use Classic Operations Bridge Manager
(OBM)
For HyperScale Observability > enable Use External OBM.
Note
There may be a delay during Data broker pod startup as agent configuration runs every time the pod is recreated. This is the
expected behavior and requires no action.
Cause 1
The IsAgentMetricCollectorEnabled parameter wasn't enabled in the values.yaml file during installation.
Solution 1
Ensure that you set the parameter IsAgentMetricCollectorEnabled to true in the values.yaml file. For more information,
see Configure System Infrastructure Reports using Agent Metric Collector.
Cause 2
All the containers in the required pods (Content Manager service, Monitoring Admin service, Data Administration service, IDM
service, and BVD service) aren't in running state.
Solution 2
Ensure that all the pods are up and running so that you can deploy the AMC configuration file. Run the following command to
verify if all the pods are running:
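The command itself was not captured in this extract; a minimal sketch (the namespace is a placeholder):
kubectl get pods -n <application namespace>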
Cause 3
Download of the ops-monitoring-ctl and ops-content-ctl CLIs failed, or the Sysinfra content pack (OpsB_SysInfra_Content_202x.xx.xxx.zip) failed.
Solution 3
See Administer AMC with CLI to know more about how and where to download the CLIs.
kubectl logs -n $(kubectl get pod -A | awk '/autoconfigure/ {print $1, $2}') > autoconfig.log
If the credential and target configuration files aren't available on the NFS server, you can manually create the credential,
target, and collector configurations. For more information, see AMC credential configuration and AMC target configuration.
To create and deploy the configurations, see Administer AMC with CLI.
Cause 4
Certificate exchange between Operations Bridge Manager (OBM) and Data Broker Container (DBC) isn't done within two
hours of installation.
Solution 4
This suggests that the certificate exchange didn't happen within two hours.
You can use one of the following ways to grant certificates from OBM:
From OBM UI - Administration -> Setup and Maintenance -> Certificate Requests
Or
From CLI
Run the following command to get a list of available certificates. You will get the core IDs of the available certificates. For
example: "daf17e6a-e203-75d6-10d6-ff59507e88dc"
# /opt/OV/bin/ovcm -listpending
For example:
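The example was lost in this extract. A hedged sketch of granting a pending request with ovcm, using the core ID shown above:
# /opt/OV/bin/ovcm -grant daf17e6a-e203-75d6-10d6-ff59507e88dc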
kubectl get jobs -n $(kubectl get jobs -A | awk '/autoconfigure/ {print $1, $2}') -o yaml > file_itom-monitoring-collection-autoconfigure
-job.yaml
yq eval -i 'del(.spec.template.metadata.labels)' file_itom-monitoring-collection-autoconfigure-job.yaml
yq eval -i 'del(.spec.selector)' file_itom-monitoring-collection-autoconfigure-job.yaml
kubectl delete job -n $(kubectl get jobs -A | awk '/autoc/ {print $1, $2}')
kubectl apply -f file_itom-monitoring-collection-autoconfigure-job.yaml
After the AMC configuration files are deployed, you will see the following message in the auto configuration log:
=======================================SUMMARY=====================================================
obmendpoint - https://hostname.swinfra.net:443/
Certificates :
32fdaad0-76e8-75d5-1966-88cc61e5a54c
Trusted Certificates :
CA_41f46360-1006-75d5-0dc3-8fe1ffa780f9_2048
MF RE CA on Vault b355f46c
MF RID CA on Vault b355f46c
kubernetes
====================================================================================================
Cause
While performing the Prepare VPC step of the AWS infrastructure manual setup, the upload of the network-with-vpc.template file fails with this error. This is because the default Elastic IP service quota is 5, which isn't enough to successfully create the stack.
Solution
To resolve this issue, increase the quota (for example, to 10) in the AWS Management Console. After the higher quota is approved, the stack gets created. For more information, see the AWS documentation.
Solution
Perform the following steps to resolve this issue:
{
"max-concurrent-downloads": 1,
"max-concurrent-uploads": 1
}
region=<ecr_region>
ecrURL=`aws ecr get-authorization-token --region $region --query="authorizationData[0].proxyEndpoint" | grep -oE "[0-9]+[^\"]*"`
ecrUserName=AWS
ecrUserPassword=`aws --region $region ecr get-login-password`
python3 image-transfer.py -su <source-username> -sp <source-password> -so <source-orgname> -sr <source-registry> -p /tmp/cdf-image-set.json -ts 4 -ry 3 -tu $ecrUserName -tp $ecrUserPassword -to <target-orgname> -tr $ecrURL
python3 image-transfer.py -su <source-username> -sp <source-password> -so <source-orgname> -sr <source-registry> -p /tmp/_image-set.json -ts 4 -ry 3 -tu $ecrUserName -tp $ecrUserPassword -to <target-orgname> -tr $ecrURL
<ecr_region> is the region where you will put your ECR repositories.
<source-username> is the user name of the source docker registry.
<source-password> is the password of the source docker registry.
<source-registry> is the URL of the source registry. For example, registry.hub.docker.com .
<source-orgname> is the organization name in the source docker registry. It's hpeswitom if you use Docker Hub, or
contact the admin of the registry.
<target-orgname> is the organization name in the target docker registry. You can get it from the admin of the
registry.
Error: Target group name 'xxx-tg-TLS-5443' cannot be longer than '32' characters
Cause
This issue occurs because you have set the environment-prefix in the itom-cloud-toolkit-20xx.xx.xx-XX/aws/tf-itom-sa/template.tfvars file to a value with more characters than the default.
Solution
Ensure that you set the environment-prefix in the itom-cloud-toolkit-20xx.xx.xx-XX/aws/tf-itom-sa/template.tfvars file to a value with the same character length as the default value.
Cause
The issue occurs if the default objects fail to upload again after itom-monitoring-admin pod restarts.
Solution
To resolve this issue, follow these steps:
1. For the embedded PostgreSQL database, get the credentials of the PostgreSQL database, log in to the pod, and then log in to the PostgreSQL database (see the examples after these steps):
2. For the external PostgreSQL database, log in to the PostgreSQL database using the IP address:
6. Delete thresholds:
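The commands for the login steps were not captured in this extract. Hedged sketches (pod name, users, and database name are assumptions; the SQL for deleting the thresholds is not shown because the table names were not captured):
# Embedded PostgreSQL: exec into the PostgreSQL pod, then connect with psql
kubectl exec -it <postgresql pod name> -n <application namespace> -- bash
psql -U <postgres_user> -d <monitoring_admin_database>
# External PostgreSQL: connect using the database server IP address
psql -h <postgresql_server_ip> -p 5432 -U <postgres_user> -d <monitoring_admin_database>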
Cause:
This issue occurs in the host systems that don't give enough entropy for secure random numbers.
Solution:
You can resolve this issue by installing the haveged package on all worker nodes. Perform the following steps to add the Extra Packages for Enterprise Linux (EPEL) repository for Red Hat Enterprise Linux (RHEL) and Community ENTerprise Operating System (CentOS):
1. Run the following command to create a temporary directory for storing the EPEL repository rpm file:
mkdir <folder_name>
2. Navigate to the newly created directory and then download the EPEL repository rpm file for RHEL (see the example commands after these steps):
3. Run the following command to install the newly downloaded rpm package:
b. Enable haveged :
c. Start haveged :
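The download and installation commands were not captured in this extract. A hedged sketch for RHEL/CentOS (the EPEL release URL is an example for version 8; adjust it and the package manager for your OS version):
# Download and install the EPEL repository rpm
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
rpm -ivh epel-release-latest-8.noarch.rpm
# Install, enable, and start haveged
yum install -y haveged
systemctl enable haveged
systemctl start haveged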
Solution
Follow the steps to resolve this issue:
Task 1: Change the passwords. Change the Vertica passwords for the Vertica DB, read-write, and read-only users so that they don't contain the "=" character.
For external Vertica, don't change the password for dbadmin. You must change the password only for the vertica_rouser and vertica_rwuser users you created during the Vertica installation. Perform the following steps:
Tip
If you have used embedded Vertica, you must perform these steps for the dbadmin
user.
Run the following command on the control plane to verify if you can update the password for ITOMDI_DBA_PASSWORD_KEY and
ITOMDI_RO_PASSWORD_KEY parameters:
cd /opt/kubernetes/scripts/
./generate_secrets -u -n <suite namespace> -c <opsbridge-suite_chart_file> -o <secrets.yaml>
Where:
<suite namespace> is the namespace where you have installed AI Operations Management.
<opsbridge-suite_chart_file> is a file with .tgz extension. The suite zip contains the file opsbridge-suite-<version>.tgz in
the charts directory.
If you have upgraded to this version from a version where you had = in the passwords, you won't be able to edit the ITOMDI_DBA_PASSWORD_KEY and ITOMDI_RO_PASSWORD_KEY parameters and you see the following message:
You must perform the next task to patch the secret with the new password.
[FATAL] 2020-06-05 09:17:11: Cannot install secondary deployment because 'MultipleDeployment' is not enabled in
core/feature-gates is seen when you try to create suite namespace or secondary namespace.
# ./cdfctl.sh deployment create -d <suite namespace> -t Helm -u admin -p <password>
Cause
The FEATURE_GATE setting does not allow multiple deployments. When changing install.properties, FEATURE_GATE was set to "Prometheus=true", which overrides the default of "MultipleDeployment=true".
Solution
1) kubectl edit cm feature-gates -n core
2) set the MultipleDeployment feature to true, save, and exit.
Cause
Helm install verifies the values.yaml file and throws warnings if there are any syntax errors. Helm install will not proceed until all the syntax errors are corrected.
Solution
Manually verify and correct all the errors. You can also use any other available syntax validator tools.
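For example, a hedged sketch using helm lint and yq (whether these tools are available on your control plane is an assumption):
# Lint the chart together with your values file
helm lint <opsbridge-suite chart directory> -f values.yaml
# Or check that the YAML itself parses; yq reports the offending line on failure
yq eval '.' values.yaml > /dev/null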
Cause
This occurs if you accidentally cancel the installation, or the install session gets aborted for some reason before it completes, and you have to re-run the helm install command.
Solution
1. Uninstall the aborted or canceled deployment by using the following command:
helm uninstall <helm deployment name> -n <suite namespace> --no-hooks
Example:
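The example was lost in this extract; a hypothetical invocation (the release and namespace names are illustrative):
helm uninstall opsb -n opsbridge --no-hooks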
Important
There is no need to uninstall OMT.
Cause
This issue occurs when one or more Kubernetes jobs have failed and have exceeded the number of retries.
The installation process has a check for certificates and databases. During the installation, the installation process validates
the following:
Certificate validation: External database CA certificates, application client authentication certificates, OPTIC Data Lake
message bus client authentication certificates, and ingress controller certificates.
Database validation: External database connection parameters for PostgreSQL and Oracle. If you have selected a
capability that requires OPTIC Data Lake, the Vertica database connection parameters are also checked.
OPTIC Data Lake Plugin validation: Supported Vertica versions, OPTIC Data Lake Vertica Plugin versions, and the Vertica
configurations.
The certificate validation, database validation, or Plugin validation may fail due to various reasons.
The following table lists specific errors and their causes:

Error: ERROR driver: Failed to establish a connection to the primary server or any backup host. ... no such host
Cause: The Vertica servers aren't responsive, the Vertica host name isn't correct, or the network path to the Vertica hostname isn't reachable.

Error: Could not connect to Vertica: Error: [28000] Invalid username or password
Cause: The Vertica user name isn't correct, or the Vertica user password that's stored in a secret isn't correct.

Error: Error: [3D000] Database "<myDB>" does not exist
Cause: The Vertica database name is wrong or doesn't exist.

Error: Invalid UDX version detected 2.4.0-8, expected 2.5.0-19
Cause: The OPTIC Data Lake Vertica Plugin version is wrong.
Solution
Error: ERROR driver: Failed to establish a connection to the primary server or any backup host. ... connect: connection refused
Solution: Check the Vertica port configuration and the Vertica connection.

Error: Could not connect to Vertica: Error: [28000] Invalid username or password
Solution: Check and give the correct Vertica username and password.

Error: x509: certificate signed by unknown authority ...
Solution: Check and give the correct Vertica certificate name.

Error: Error: [3D000] Database "<myDB>" does not exist
Solution: Check and give the correct Vertica database name.

Error: Invalid Vertica version detected Vertica Analytic Database v9.2.1, expected v10.1.0,v10.1.1
Solution: Ensure that you have deployed or upgraded to the supported Vertica version.

Error: Failed reading UDX version
Solution: Install the OPTIC Data Lake Vertica Plugin. Ensure that the Vertica user has the required privileges.
2. Check for all the failed jobs and their logs to identify the problem and fix the issues.
Run the following command to see the logs of the failed job:
kubectl logs pods/<name of the failed job pod name> -n <application namespace>
Resolve the issues related to OPTIC Data Lake Plugin validation according to the error logs as follows:
Error: ERROR driver: Failed to establish a connection to the primary server or any backup host. ... connect: connection refused
Solution: Check the Vertica port configuration and the Vertica connection.

Error: Could not connect to Vertica: Error: [28000] Invalid username or password
Solution: Check and give the correct Vertica username and password.

Error: x509: certificate signed by unknown authority ...
Solution: Check and give the correct Vertica certificate name.

Error: Error: [3D000] Database "<myDB>" does not exist
Solution: Check and give the correct Vertica database name.

Error: Invalid Vertica version detected Vertica Analytic Database v9.2.1, expected v10.1.0,v10.1.1
Solution: Ensure that you have deployed or upgraded to the supported Vertica version.

Error: Failed reading UDX version
Solution: Install the OPTIC Data Lake Vertica Plugin. Ensure that the Vertica user has the required privileges.
Example
This is an example scenario when itom-certificate-validator-job has failed.
In this example, the itom-certificate-validator-job has failed; it shows 0/1 completions.
Common Certificate Validation Failed for CA Certificates: Certificate: SAN: sac-hvm03312.swinfra.net Issuer CN: sac-hvm03312.swinfra.
net Common Name: sac-hvm03312.swinfra.net
Parsing Certificate: /var/run/secrets/db-certificates/vertica-ca.crt
Parsed the Certificate: /var/run/secrets/db-certificates/vertica-ca.crt sucessfully
Begin the expiry check for the certificate.
Certificate expiry validation successful.
Cause
During the suite deployment or upgrade, the job itom-di-scheduler-udx-preinstall runs. This job checks for the supported Vertica version, the OPTIC DL Vertica Plugin version, and the Vertica configurations. If the versions don't match, the error appears.
Solution
Follow these steps to resolve this error:
You can resolve the issue according to the error logs as follows:
Error: Could not connect to Vertica: Error: [28000] Invalid username or password
Cause and solution: The Vertica user name isn't correct, or the Vertica user password that's stored in a secret isn't correct. You must check and give the correct Vertica username and password.

Error: x509: certificate signed by unknown authority ...
Cause and solution: This error occurs because you have configured Vertica with TLS, but the Vertica certificate for the suite isn't correct. You must check and give the correct Vertica certificate name.

Error: Error: [3D000] Database "<myDB>" does not exist
Cause and solution: This error occurs because the Vertica database isn't correct. You must check and give the correct Vertica database name.

Error: Invalid Vertica version detected Vertica Analytic Database v9.2.1, expected v10.1.0,v10.1.1
Cause and solution: This error occurs because the Vertica version isn't supported. You must ensure that you deploy or upgrade to the supported Vertica version.

Error: Failed reading UDX version
Cause and solution: You must install the OPTIC DL Vertica Plugin. Ensure that you give the required privileges to the OPTIC DL Vertica Plugin in Vertica.

Error: Invalid UDX version detected 2.4.0-8, expected 2.5.0-19
Cause and solution: This error occurs because the OPTIC DL Vertica Plugin isn't the correct version. You must install the supported OPTIC DL Vertica Plugin version.
After the installation of AI Operations Management, a 502 Bad Gateway error displays when trying to access OBM.
Cause
The 502 error displays because OBM is not yet up and running.
Solution
Depending on the host machine, it might take up to one hour for OBM to start after the initial configuration.
After the installation of the Container Deployment Foundation, a 503 Nginx error displays when trying to access the Suite
Installer.
Cause
This error may display because the time on the master and worker nodes is different.
Solution
To resolve this issue, synchronize the time on your nodes by using, for example, NTP or VMWare tools.
Solution
To solve this problem, run the following commands to recreate the suite-db pods:
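The commands were not captured in this extract. A hedged sketch; deleting the pod so that it is recreated by its controller (pod name and namespace are placeholders):
kubectl delete pod <suite-db pod name> -n <application namespace>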
Alternatively, you can restart the virtual machine where the suite-db pod is installed.
Cause
This relates to the vault-renewal container, which does not get a valid token.
Solution
You have to delete the failed pods. Once the pods are deleted, they are recreated automatically and should run without error.
You can get the status of all pods with the following command:
kubectl get pods --all-namespaces
First, delete all failed database-related pods (suite-db, idm-postgresql, postgresql-aplm). After that, delete all failed pods within the namespace opsbridge (starting with postgres, ucmdb, omi, redis, bvd). Use the following command to delete the failed pods within the namespaces specified above:
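The delete command itself was not captured in this extract. A hedged sketch (pod name and namespace are placeholders):
kubectl delete pod <failed pod name> -n <namespace>
# Alternative, not from the original text: delete every pod in the Failed phase in the opsbridge namespace
kubectl get pods -n opsbridge --field-selector=status.phase=Failed -o name | xargs -r kubectl delete -n opsbridge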
Cause
This issue occurs because the first control plane node has crashed and the virtual IP address is resolved to the second or the third control plane node.
Solution
Continue the installation from the following URL:
https://<second/third_control_plane_node>:3000
In this URL, replace <second/third_control_plane_node> with the hostname of either the second control plane node or the third control plane node.
Cause
This issue occurs because the root CA isn't imported to the certificate trust management of your browser.
Solution
1. Run the following command on your terminal to check the certificate details:
kubectl get cm -n $CDF_NAMESPACE public-ca-certificates -o yaml
2. Copy the RE_ca.crt part to a file as a certificate and save it.
3. Upload the RE_ca.crt certificate file to your browser.
4. Upload the ca.crt file from $CDF_HOME/ssl to your browser.
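As an alternative to copying the certificate by hand, a hedged one-liner that extracts it directly (the data key RE_ca.crt is taken from the configmap above; the jsonpath dot escaping is required):
kubectl get cm public-ca-certificates -n $CDF_NAMESPACE -o jsonpath='{.data.RE_ca\.crt}' > RE_ca.crt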
1. Before deleting content packs, make sure the Content Pack Development is enabled in the infrastructure settings
Note
i. Go to Administration > Event Processing > Automation > Time-Based Event Automation .
ii. Select the rightmost icon in the icon list preceding the rules list to open the Scripts Manager.
iii. In the Scripts Manager, select the CloseCauseIfSymptomsClosed script and delete it.
c. In the Scripts Manager, select the COSO Data Lake Event Forwarding Script script and delete it.
10. In the Infrastructure Settings, revert the settings for COSO endpoints
/opt/OV/bin/ovcert -list
On Windows:
"%OvInstallDir%\bin\win64\ovcert" -list
b. In the list of installed certificates, find the certificates that begin with MF CDF or MF RE or MF RIC or MF RID .....
Remove the certificates from both resource groups.
For example, on Linux:
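The example was lost in this extract. A hedged sketch of removing a certificate from both the default and the server resource groups with ovcert (the alias is a placeholder):
/opt/OV/bin/ovcert -remove "<certificate alias>"
/opt/OV/bin/ovcert -remove "<certificate alias>" -ovrg server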
For example:
cd integration-tools
./call-analytics-datasources.sh -aec-namespace <AEC_NAMESPACE> list both
Check the list for the endpoint ID of the OBM instance to remove.
Also, the CAS deployment log shows unsuccessful deployment. You can verify the logs by running the command:
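The command itself was not captured in this extract; a minimal sketch (pod name and namespace are placeholders):
kubectl logs <cas deployment job pod name> -n <application namespace>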
Cause
This issue may occur for many reasons. One such reason is that the itom-opsb-content-manager pod restarts during installation, causing the schema installation to become unresponsive.
Solution
You can resolve the issue in either of the following ways:
Solution 1
When multiple contents have issues, then you must restart the CAS job pod. To restart, run the command:
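The restart command was not captured in this extract. A hedged sketch that deletes the job pod so that it is recreated (pod name and namespace are placeholders; whether deleting the pod alone restarts the job is an assumption):
kubectl delete pod <cas job pod name> -n <application namespace>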
Solution 2
When one or two contents have issues, you can resolve the issue by force-starting the content installation. To force-start the installation, run the command:
ops-content-ctl install content -n <name of the content> -v <version for the content> -f
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 26m default-scheduler 0/1 nodes are available: 1 Too many pods.
Warning FailedScheduling 10m (x17 over 25m) default-scheduler 0/1 nodes are available: 1 Too many pods.
Cause
This issue occurs because by default Kubernetes only allows 110 pods for each node.
Solution
To work around this issue, you need to modify the maximum number of pods running on each node. Perform the following
steps:
1. Run the following command to get the maximum pods value for the node (see the examples after these steps):
2. Run the following command to update maxPods field in the kubelet-config file. The <value> must be a non-negative
integer:
Note
Ensure that the current node has enough resources to run these
pods.
yq -i e '.maxPods=<value>' $CDF_HOME/cfg/kubelet-config
For example:
yq -i e '.maxPods=110' $CDF_HOME/cfg/kubelet-config
4. Run the following command to check the maxPods value for the node (see the examples after these steps):
Repeat the steps if you want to modify the maxPods value for other nodes. You should ssh to these nodes before performing
the steps.
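The commands for steps 1 and 4 were not captured in this extract. Hedged sketches (the node name is a placeholder; yq is the same tool used in step 2):
# Step 1: current pod capacity reported by the node
kubectl get node <node name> -o jsonpath='{.status.capacity.pods}'
# Step 4: maxPods value configured for the kubelet
yq e '.maxPods' $CDF_HOME/cfg/kubelet-config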
Issue
When helm upgrade is run on an application to enable a new capability, the log forwarding for the new capability
is not configured. The log entries for the new capability won't appear in Elasticsearch.
Cause
The log forwarding configuration associated with the new capability is not loaded.
Solution
The workaround is to reload the configuration by restarting the fluent-bit pod. You must scale down the replicas of the itom-fluentbit pod to 0 and scale it back to 1. This forces fluent bit to read all the configmaps, including that of the new capability.
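A hedged sketch of the scale commands, assuming itom-fluentbit runs as a Deployment (the resource type and namespace are assumptions):
kubectl scale deployment itom-fluentbit -n <application namespace> --replicas=0
kubectl scale deployment itom-fluentbit -n <application namespace> --replicas=1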
[main] WARN com.microfocus.pulsar.config.job.OpticBacklogQuota - Attempt 1 retrying to get topic backlog quota for the persistent://public/default/valid_data_topic due to {}"
ERROR com.microfocus.pulsar.config.job.OpticBacklogQuota - Attempt 5 retrying to get topic backlog quota for the persistent://public/default/valid_data_topic due to {}"
Additionally, run the command helm list -A for the application deployment. The error similar to the following appears:
Upgrade "aws" failed: pre-upgrade hooks failed: timed out waiting for the condition
Cause
The helm upgrade fails when itomdipulsar-broker-pre-upgrade-backlog-setting takes longer than the configured helm timeout interval. This can happen when the pre-upgrade backlog job isn't complete because the topic-level policy isn't working due to a corrupted policy.
Solution
Follow these steps to delete the corrupted topic and resolve this issue:
1. Ensure the itomdipulsar-broker-pre-upgrade-backlog-setting pod isn't in Running state. Run the following command to check
the state:
If the pod isn't running, skip step 2 and continue from step 3.
2. If the pod is running, run the following commands:
Note down the number of replicas and then run the following commands:
Wait until the broker is completely up and then perform the next steps.
3. Run the following commands:
cd bin
5. Run the following command to list the topics and ensure that you see the __change_events topic in the output:
6. Start the upgrade for the application. For more information, see the Upgrade section.
Solution
To resolve this issue, follow these steps:
Solution
To resolve this issue, uninstall the AWS content and install it again. Uninstalling content leads to data loss, so make sure that
you back up the data before uninstalling the content.
Cause
The issue occurs due to the versioning scheme changes between prior releases and the current one.
Solution
You can resolve the issue by deploying the UCMDB views and then uploading and deploying the content packs manually.
Launch the RTSM UI of a target OBM server as a desktop application from the OBM UI.
Deploy the UCMDB views package to RTSM from your local directory.
1. Go to Administration > RTSM Administration and click Local Client to download the Local Client tool.
2. Launch the Local Client tool.
1. Extract the UCMDB_Local_Client.zip package to a location of your choice, for example, the desktop.
2. Double-click UCMDB Local Client.cmd (Windows) or UCMDB Local Client.sh (Mac). The UCMDB Local Client window
opens.
3. Add or edit the login configuration for the target OBM server that you want to access.
1. Click the add or edit icon. The Add/Edit Configuration dialog opens.
2. Enter the following details:
Host/IP: Specify the value provided in the values.yaml for <externalAccessHost>.
Protocol: Select HTTPS as the protocol from the drop-down list.
Port: Specify the value provided in the values.yaml for <externalAccessPort>.
Target Env: Select UD/UCMDB as the target environment from the drop-down list.
3. Click OK.
4. Launch RTSM UI from the UCMDB Local Client window.
1. In the UCMDB Local Client window, click the Label value for the OBM server that you want to access. The Log
In dialog opens.
2. In the Log In dialog, enter your login parameters.
3. Click Login. The RTSM UI opens in a new window.
where <service_name> is any Hyperscale Observability service such as AWS, Azure, GCP, Kubernetes, and VMware.
1. Download the Event Mapper content pack from the following location (see the curl example after these steps):
On Linux:
On Windows:
https://<externalAccessHost>:<externalAccessPort>/staticfiles/monitoring-service/Monitoring_Service_Event_Mapper_<version>.zip
2. On the OBM user interface, go to Administration > SETUP AND MAINTENANCE > Content Packs.
3. Click Import. The Import Content Pack window appears.
4. Browse to the location where you have saved the Event Mapper content pack and then click Import. The Event Mapper
content pack gets imported. Click Close.
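For step 1, the download could look like the following curl sketch (the use of -k to skip certificate verification and the version string are assumptions; adjust them for your environment):
curl -k -o Monitoring_Service_Event_Mapper_<version>.zip "https://<externalAccessHost>:<externalAccessPort>/staticfiles/monitoring-service/Monitoring_Service_Event_Mapper_<version>.zip"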
On Windows:
https://<externalAccessHost>:<externalAccessPort>/staticfiles/monitoring-service/Monitoring_Service_<service_name>_Content_Pack_<version>.zip
2. On OBM user interface, go to Administration > SETUP AND MAINTENANCE > Content Packs.
3. Click Import. The Import Content Pack window appears.
4. Browse to the location where you have saved the <service_name> content pack and then click Import. The
<service_name> content pack gets imported. Click Close.
where <service_name> is any Hyperscale Observability service such as AWS, Azure, GCP, Kubernetes, and VMware.
Cause
This issue may occur if the embedded PostgreSQL database is interrupted during schema creation and fails to release the
Liquibase lock.
Solution
To resolve this issue, follow these steps:
1. Applies to PostgreSQL only. Get the password of the embedded PostgreSQL pod:
2. Applies to PostgreSQL only. Log in to the embedded PostgreSQL pod and then log in to psql.
Example:
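A minimal sketch of steps 1 and 2, assuming placeholder pod and database names and that the stuck lock sits in the standard Liquibase databasechangeloglock table:
kubectl exec -it <embedded-postgres-pod> -n <namespace> -- bash
psql -U postgres -d <database>
-- inspect and, if present, clear the stuck Liquibase lock (assumption: standard Liquibase lock table)
SELECT * FROM databasechangeloglock;
UPDATE databasechangeloglock SET locked = FALSE, lockgranted = NULL, lockedby = NULL;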
Solution
To resolve this issue, you must delete the pods that aren't running as follows:
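For example, a pod that isn't running can be deleted with a command like the following (names are placeholders):
kubectl delete pod <pod-name> -n <namespace>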
Run the ops-monitoring-ctl get collector -V2 command to check the status of the collection configurations. You may get the
following error:
"statusCode": 403,
"reasonPhrase": "Forbidden"
Error: Invalid username or password, you must be logged in to the server
This error message appears when the username and password are invalid or the user is unauthorized.
You can also check the application.log file to find out the issue. The log file is available in this location: <log-vol>/cloud-monitoring/monitoring-admin/<pod_name>. See the Map to NFS page for more information.
Cause
The user isn't assigned to the monitoringServiceAdminRole role or to the Administrators group.
Solution
1. Create the role monitoringServiceAdminRole if it doesn't exist. To do that:
a. Enter the URL https://<external_hostname>:<port>/idm-admin in the browser to access IDM. Enter your credentials
and log in, and then go to Organization > Roles.
b. To check whether the role exists, click the search icon on the upper-right of the screen and enter the role name.
d. Click the add icon to add the role. Enter the related values for Name, Display name, Description, Application, and Associate
permission, and then click SAVE.
2. Assign the role to the Administrators group if the role isn't assigned to the group. To do that:
a. Go to Organization > Group.
b. To check whether the group exists, click the search icon on the upper-right of the screen and enter the group
name. The page displays the group name and related roles.
c. Click the Administrators link to see the Associated roles in the Group Settings.
d. Search for the role. If it doesn't exist in the list, add the role.
3. Rerun the autoconfiguration job:
a. kubectl get jobs -n $(kubectl get jobs -A | awk '/autoconfigure/ {print $1, $2}') -o yaml > file_itom-monitoring-collection-autoconfigure-job.yaml
b. yq eval -i 'del(.spec.template.metadata.labels)' file_itom-monitoring-collection-autoconfigure-job.yaml
c. yq eval -i 'del(.spec.selector)' file_itom-monitoring-collection-autoconfigure-job.yaml
d. kubectl delete job -n $(kubectl get jobs -A | awk '/autoc/ {print $1, $2}')
e. kubectl apply -f file_itom-monitoring-collection-autoconfigure-job.yaml
curl: (60) Peer's certificate issuer has been marked as not trusted by the user. More details here:
http://curl.haxx.se/docs/sslcerts.html. curl performs SSL certificate verification by default, using a "bundle" of
Certificate Authority (CA) public keys (CA certs). If the default bundle file isn't adequate, you can specify an
alternate file using the --cacert option. If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a problem with the certificate (it might be
expired, or the name might not match the domain name in the URL). If you'd like to turn off curl's verification
of the certificate, use the -k (or --insecure) option. 2021-06-03_00:27:20.158 [ ERR] Failed to execute curl when
posting secrets yaml -- curl error
Solution
On the master node, run ls -a under the /root directory to find the file .gs.<hostname>.curl-ca-bundle.crt and delete it:
cd /root
ls -a
rm -rf .gs.<hostname>.curl-ca-bundle.crt
Scenario 1
Error: UPGRADE FAILED: pre-upgrade hooks failed: job failed: BackoffLimitExceeded.
Solution 1
CLI
In a shared OPTIC Data Lake setup, when you upgrade the consumer before upgrading the provider using the CLI, the
upgrade fails with "Error: UPGRADE FAILED: pre-upgrade hooks failed: job failed: BackoffLimitExceeded".
Example output:
2. Get the logs for the itom-restrict-consumer-upgrade pod for the detailed cause of the failure:
kubectl logs <pod-name> -n <application namespace> -c <container name>
Example output:
"Consumer is not eligible for upgrade as Optic DL Provider is not Upgraded to 2023.05".
If the consumer upgrade fails, the "restrict-upgrade" pre-upgrade job will be in the 0/1 state.
4. If any job other than the restrict-upgrade job has failed, try re-running the upgrade. If the upgrade fails a second
time, contact Support and Services.
AppHub
In a shared OPTIC Data Lake setup, when you upgrade the consumer before upgrading the provider using AppHub,
the upgrade fails with "Error: UPGRADE FAILED: pre-upgrade hooks failed: timed out waiting for the condition".
1. In the AppHub UI, go to DEPLOYMENTS > View Health and select the (Pre-hook)itom-restrict-consumer-upgrade container,
which will be in an error state.
2. Click the Logs tab and then click "restrict-consumer-upgrade (Error)".
You will see the message Consumer is not eligible for upgrade as Optic DL Provider is not Upgraded to
2023.05.
In a non-shared OPTIC Data Lake setup, follow the steps mentioned in Solution 2.
Solution 2
1. List all the pods and see which pod is in an error state:
kubectl get pods -n <suite namespace>
2. Find the pod which is in an error state.
3. Get the logs of the pod for the detailed cause of the failure:
kubectl logs <pod-name> -n <suite namespace>
4. If there is an issue with the itom-opsb-db-connection-validator-job, there might be a misconfiguration in the external
database used. The cause of the failure can be found in the logs. Verify and fix the parameters passed, and re-run the
upgrade.
5. If any job other than itom-opsb-db-connection-validator-job has failed, try re-running the upgrade. If the upgrade fails
a second time, contact Support and Services.
Scenario 2
Error: UPGRADE FAILED: cannot patch "itomdipulsar-bookkeeper" with kind StatefulSet: StatefulSet.apps "itomdipulsar-boo
kkeeper" is invalid
Solution
1. Verify the parameters itomdipulsar.bookkeeper.volumes.ledgers.size, itomdipulsar.bookkeeper.volumes.journal.size, and
itomdipulsar.zookeeper.volumes.data.size in the current values file against the values file passed during installation.
To get the values passed during installation, use the command:
helm get values <release-name> -n <namespace>
2. Make the values of the parameters itomdipulsar.bookkeeper.volumes.ledgers.size, itomdipulsar.bookkeeper.volumes.journal.size,
and itomdipulsar.zookeeper.volumes.data.size the same as the values passed during installation, and re-run the upgrade.
Solution
Follow these steps to troubleshoot the upgrade failure:
If the backup-complete file exists, the OMT backup completed successfully. Follow the remaining steps to
troubleshoot the upgrade.
4. Run the following command to check the status of the kubelet service:
systemctl status kubelet
If the kubelet service is not active, delete the kubelet.service file in the /usr/lib/systemd/system directory.
If the kubelet service is active, run the following command to stop the kubelet service. Then, delete the
kubelet.service file in the /usr/lib/systemd/system directory.
5. Run the following command to check the docker service status: systemctl status docker
If the docker service is not active, delete the docker.service file in the /usr/lib/systemd/system directory.
If the docker service is active, run the following command to stop the docker service. Then, delete the
docker.service file in the /usr/lib/systemd/system directory.
6. Check the docker-bootstrap service status with the command: systemctl status docker-bootstrap.
If the docker-bootstrap is not active, delete the docker-bootstrap.service file in the /usr/lib/systemd/system
directory.
If the docker-bootstrap is active, run the following command, and then delete the docker-bootstrap.service file in
the /usr/lib/systemd/system directory.
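A sketch of the stop-and-remove pattern described in steps 4 through 6, using standard systemd commands (the document's exact commands are not shown here):
systemctl stop kubelet
rm -f /usr/lib/systemd/system/kubelet.service
systemctl stop docker
rm -f /usr/lib/systemd/system/docker.service
systemctl stop docker-bootstrap
rm -f /usr/lib/systemd/system/docker-bootstrap.service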
for data in $(mount | grep "${K8S_HOME}/data/" | cut -d" " -f3 | sort -r);do umount -f -l $data; done
for data in $(mount | grep "/usr/lib/kubelet" | cut -d" " -f3 | sort -r);do umount -f -l $data; done
rm -rf ${K8S_HOME}
10. Run the following command to roll back the ${K8S_HOME} directory:
mv ${K8S_HOME}/docker.service /usr/lib/systemd/system/
mv ${K8S_HOME}/docker-bootstrap.service /usr/lib/systemd/system/
13. (Optional) If the upgrade failed on the first master (control plane) node that was stopped, manually restore the data on
the NFS server.
14. Run the following command to retry the upgrade:
fatal dbinit: ROLLBACK 5365: [42704] User available location ["/home/dbadmin/.itomdipulsarudx"] does not exist on node ["v_itomdb_node0001"]
Cause
This issue occurs because the Vertica server may have been cleaned up before the upgrade, but some residual files
remained.
Solution
Perform these steps to resolve this issue:
1. Log on to the same Vertica node where you have already installed the OPTIC DL Vertica Plugin. This means the node
that has the /usr/local/itom-di-pulsarudx folder.
2. Delete the directory <path to Vertica admin home>/.itomdipulsarudx
3. Upgrade the OPTIC DL Vertica Plugin.
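For example, if the Vertica admin home is /home/dbadmin, as shown in the error above, step 2 might be:
rm -rf /home/dbadmin/.itomdipulsarudx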
Cause
Some values in nginx-default-secret are missing.
Solution
Follow these steps to add the missing values:
helm list -n <opsb namespace> -a 2>/dev/null | grep -E 'opsbridge-suite-[0-9]+\.[0-9]+\.[0-9]\+[0-9]+' | awk '{print $1}' | xargs
2. Run the following command to add annotations to the nginx-default-secret secret and put it under helm management:
kubectl patch secret nginx-default-secret -n <suite-ns> -p "{ \"metadata\": { \"annotations\": { \"meta.helm.sh/release-name\": \"<deployment-name>\", \"meta.helm.sh/release-namespace\": \"<suite-ns>\" }, \"labels\":{ \"app.kubernetes.io/managed-by\": \"Helm\" } } }"
where suite-ns is the namespace where you have installed AI Operations Management, and <deployment-name> is the Helm release name returned by the command in step 1.
Problem
After the n-2 upgrade of the application, events sent from OBM aren't stored in the opr_event Vertica table as the opr_event
target and source scheduler streams are missing.
Solution
Restart the suite to resolve this issue: run cdfctl runlevel set -l DOWN -n <suite_namespace>, and then cdfctl runlevel set -l UP -n <suite_namespace>.
Cause
This issue happens because the container software insists that the RDS certificate needs a hostname in its SAN (Subject
Alternative Name) field.
Solution
To resolve the issue:
1. Include the Subject Alternative Name (SAN) extension in the TLS certificate for PostgreSQL by following the steps
outlined in Enable TLS in PostgreSQL.
2. Run the following Helm upgrade command:
helm upgrade <helm deployment name> -n <suite namespace> -f <values.yaml> <chart> [--set-file "caCertificates.vertica-ca\.crt"=<vertica certificate file>] [--set-file "caCertificates.postgres\.crt"=<relational database certificate file> [--set-file oracleWallet=<base64 encoded wallet text file>]] [-f <deployment.yaml>] [-f <secrets.yaml>] --timeout 15m
Cause
This issue is caused by Java conflicts in the adapter resource package. You need to manually remove the unnecessary .jar
files.
The following nine .jar resources are the ones that are required and should be kept. You can remove other .jar files from the
Pulsar Push adapter package.
adapterCode/PulsarPushAdapter/dependencies/javax.activation.jar
adapterCode/PulsarPushAdapter/dependencies/javax.ws.rs-api.jar
adapterCode/PulsarPushAdapter/dependencies/jcip-annotations.jar
adapterCode/PulsarPushAdapter/dependencies/kafka-push-adapter.jar
adapterCode/PulsarPushAdapter/dependencies/pulsar-client-admin-api.jar
adapterCode/PulsarPushAdapter/dependencies/pulsar-client-api.jar
adapterCode/PulsarPushAdapter/dependencies/pulsar-client.jar
adapterCode/PulsarPushAdapter/dependencies/validation-api.jar
adapterCode/PulsarPushAdapter/PulsarPushAdapter.jar
Solution
To remove the unnecessary resources that are causing Java conflicts, follow these steps:
1. From UCMDB UI, go to Administration > Package Manager, and then select PulsarPushAdapter.
2. Click the Undeploy resources button from the toolbar.
3. Select the unnecessary jars causing conflicts for removal. That is, select the .jar resources (starting with "adapterCode
- PulsarPushAdapter") that are not in the required .jar list. Don't select the nine resources listed in the Cause section, as these are required.
Cause
This issue occurs if logs consume a lot of space in NFS storage.
Solution
Follow these steps to fix this issue:
1. Run the following command to identify the logging persistent volume name:
2. Run the following commands to identify the NFS server and NFS path that the logging persistent volume is mounted to.
Replace <logging PV name> with the persistent volume name that you identified in the previous step.
3. Log in to the NFS server that you identified in the previous step and go to /var/vols/itom/<log file directory path> . The <log
file directory path> depends on whether logging volumes are created manually or by using the storage provisioner.
4. Identify and delete the old log files (generated by OMT and AI Operations Management capabilities) that are no longer
required.
5. Configure log rotation or deletion to avoid encountering this issue again. For detailed steps, see Change the log rotation
or delete configuration.
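A sketch of the commands for steps 1 and 2, using standard kubectl (the logging PV name is a placeholder):
kubectl get pv | grep -i logging
kubectl describe pv <logging PV name> | grep -iE 'server|path'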
PostgreSQL
psql -f RemoveSQL.sql
Oracle
echo exit | sqlplus sys/mysyspassword as SYSDBA @RemoveSQL.sql
You may not be able to remove an Oracle or a PostgreSQL database and may get the following errors.
Oracle:
ERROR at line 1:
PostgreSQL:
Cause
This error occurs due to the following reasons:
There are multiple active connections or sessions that are accessing the database.
The suite is not uninstalled before deleting the relational databases.
Solution
You need to disconnect from the database or close the active sessions in order to remove the users or databases.
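For PostgreSQL, one way to close the active sessions before dropping the database is the following query (a sketch; replace the database name):
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '<database_name>' AND pid <> pg_backend_pid();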
Cause
The itom-ucmdb-probe pod gets stuck in the 1/2 state.
Solution
1. Log on to any one of the master (control plane) nodes.
2. Run the following command:
4. If data is present even after the deployment is down, truncate the table:
TRUNCATE jgroupsping;
Symptom
Vault pod isn't up after restoring the backed up data.
Solution
1. Scale down the vault deployment. Run the following command:
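A sketch of the scale-down, assuming the deployment is named itom-vault (adjust the deployment name and namespace for your environment):
kubectl scale deployment itom-vault -n <namespace> --replicas=0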
Cause
This issue occurs because you have configured the EXTERNAL_ACCESS_PORT in the base-configmap to a wrong port.
Solution
1. Log on to any one of the master (control plane) nodes.
2. Run the following command to edit the base-configmap. You need to change the value of the parameter EXTERNAL_ACCESS_PORT to "5443".
kubectl edit cm base-configmap -n core
3. Run the following command to list the running pods:
kubectl get pods -n core
Your terminal resembles the following:
4. Run the following command to delete the cdf-apiserver-xxxx pod. You need to replace the <cdf-apiserver> with the pod
name you get from the previous step.
kubectl delete pod -n core <cdf-apiserver>
5. Wait a few minutes until all cdf-apiserver pods start again. You can run the following command to check the pod status:
kubectl get pods -n core
RUM: Internal file - OpsB_RUM_Content_2020.08.zip. Creates the Real User Monitor schema and Real User Monitor Reports.
Event: Internal file - OpsB_Event_Content_2020.08.zip. Creates the Event schema.
SysInfra: Internal file - OpsB_SysInfra_Content_2020.08.zip. Creates the System Infrastructure schema.
BPM: Internal file - OpsB_BPM_Content_2020.08.zip. Creates the Synthetic Transaction schema.
CMDB: Internal file - OpsB_CMDB_Content_2020.08.zip. Creates OPTIC Data Lake tables (purely internal - not documented).
These content zips are deployed to the respective services by the Content Administration Service (CAS). Further, there is a
startup Job that helps invoke CAS right after AI Operations Management is deployed. In case you notice any issues with
CAS' operation, use this document to isolate the root cause and resolve it.
Flow
During helm deployment:
Note
metadata, enrichment, retention, roll-up, blk-upload, and entityConfig are concepts internal to OPTIC Data Lake.
If a request to OPTIC Data Lake to create metadata, enrichment and roll-up (OPTIC Data Lake tables) fails, the
request is tried again every 3 minutes, 10 times.
If the request fails even after 30 minutes, content deployment exits with the following log message:
Exiting as metadata table not created. List of tables not created are: <list of tables failed>
Once each section is completed, you will see the following message:
Logs
Accessing logs
For offline analysis:
Ask for a tar/zip with the contents of the {log-vol}/{namespace}/*content-admin*/ folder
For real-time analysis:
On any of the CDF nodes, start by setting environment variables for the namespace and various pods
export cas_ns=$(kubectl get pods --all-namespaces | awk '!/job/ && /content-manager/{printf $1}');
export cas_job=$(kubectl get pods -n $cas_ns| awk '/job/ && /content-management/{printf $1}');
export cas_pod=$(kubectl get pods --all-namespaces | awk '!/job/ && /content-manager/{printf $2}');
CAS Job
CAS
The cas-util tool helps interact with CAS. Use the following command to download it to one of the CDF nodes.
Tail the current logs from CAS and check for errors:
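A sketch of tailing the CAS logs with the variables exported above (standard kubectl; the document's exact command is not shown here):
kubectl logs -f $cas_pod -n $cas_ns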
Analyzing logs
CAS Job
On successfully requesting CAS to deploy content:
INFO: <content> Content Administration Service Job now calling curl -k -s -o /dev/null -w %{http_code} https://itom-opsb-content-administration-svc:8443/v1/content/configuration/all/category/RUM
INFO: Curl command to CAS is triggered with response code 202
INFO: <content> configuration is triggered. Please check the status
ERROR: <content> was not accepted by either OPTIC Data Lake Administration Service or BVD. Will not be able to configure <content>. Please check the status in Content Administration Service.
Steps to resolve:
CAS
On successfully deploying a BVD dashboard:
INFO: Performing configure task on 'OPTIC Data Lake'. May take some time to complete task...
…
2020-10-07 07:45:25 INFO: Job Summary:
Type||Category||Status
======================================
bvd||system_infra||Completed
blk-upload(ingestion-conf)||system_infra||Skipped [Reason: Content is not available].
blk-upload(load-conf)||system_infra||Skipped [Reason: Content is not available].
entityConfiguration||system_infra||Skipped [Reason: Content is not available].
metadata||system_infra||Completed
retention||system_infra||Completed
enrichment(hourly)||system_infra||Completed
enrichment(daily)||system_infra||Completed
enrichment(forecast)||system_infra||Completed
roll-up(perl)||system_infra||Completed
roll-up(task)||system_infra||Completed
roll-up(task-flow)||system_infra||Completed
======================================
Steps to resolve:
Resolve the root cause on OPTIC Data Lake (see OPTIC Data Lake Troubleshooting for details)
Cause
This issue occurs when you don't have the di-admin role set in your provider org for the ops-content-ctl .
Solution
To resolve the issue, you must manually add the di-admin role in your provider org for ops-content-ctl.
Issue
Agent System Metric Push doesn't have a Store and Forward capability.
Solution
1. Install Operations Agent (OA) 12.15.
2. Contact Software Support and get the hotfix (OCTCR19G1192060) for OA 12.15. This provides the Store and Forward
capability.
Solution
LC_MESSAGES is responsible for printing out messages in the required language. Therefore set LC_MESSAGES to US English in
the PostgreSQL server.
export LC_MESSAGES=en_US.UTF-8
Note
You can also configure the locale for system error messages in the postgresql.conf file.
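For example, the equivalent setting in postgresql.conf would be the standard lc_messages parameter (an illustration):
lc_messages = 'en_US.UTF-8'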
1.5.5. itom-opsbridge-cs-redis in CrashLoopBackOff state
Redis pod is in CrashLoopBackOff state.
The cs-redis pods are restarting continuously with the following errors:
dis-5f69c95669-z4h49","opsb-helm","xxxxx","cs-redis"
dis-5f69c95669-z4h49","opsb-helm","xxxxx","cs-redis"
/redis.conf","itom-opsbridge-cs-redis-5f69c95669-z4h49","opsb-helm","xxxxx","cs-redis"
dis-export
Solution
1. Edit itom-opsbridge-cs-redis deployment to increase the memory limit to 8 GB.
name: cs-redis
ports:
- containerPort: 6380
  protocol: TCP
- containerPort: 9121
  name: redis-exporter
  protocol: TCP
resources:
  limits:
    cpu: "2"
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 1Gi
name: cs-redis
ports:
- containerPort: 6380
  protocol: TCP
- containerPort: 9121
  name: redis-exporter
  protocol: TCP
resources:
  limits:
    cpu: "2"
    memory: 8Gi
  requests:
    cpu: 100m
    memory: 2Gi
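The deployment can be opened for editing with a command like the following (a sketch; adjust the namespace for your environment):
kubectl edit deployment itom-opsbridge-cs-redis -n <suite namespace>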
Note: For enhanced logging in itom-monitoring-oa-metric-collector (agent-collector-sysinfra), you can edit the collector
configuration and set the metricCollectorLogLevel parameter to DEBUG.
For example:
2. Edit the collector configuration yaml file. Set the metricCollectorLogLevel parameter to DEBUG .
Problem
You may encounter any of the following issues when you configure self-monitoring for Monitoring Service Edge to send alerts to OBM.
You need to check the log file. Log file path of Edge self monitoring: /var/opt/OV/log/edge-self-monitoring.log
Solution 1
You can try the following:
Solution 2
You can try the following:
1. Search for "CI:" in the Edge self monitoring log and check if any POD is missing in the list.
2. Enable debug logging by setting SELF_MON_LOG_LEVEL to debug in the deployment of data broker container.
3. After debug log is enabled, search for "Topology XML:" in the log and check the XML contents.
The following screenshot illustrates a part of topology xml which contains the pod CI.
Issue 3: Event not generated for collection failure or pod not in running state
Solution 3
You can try the following:
The following screenshot is a sample scenario for an event generated when the pod isn't in a running state.
Cause
You may get this error if the communication between the Agent Metric Collector and the Operations Agent fails. This happens
if you have changed the ASYMMETRIC_KEY_LENGTH from 2048 to 4096 on the OBM server and not on DBC.
Note
The Data Broker Container (DBC) is an Operations Agent node that's managed by OBM. It enables the Agent Metric Collector to
communicate with OBM and receives the certificate updates.
Solution
Change the ASYMMETRIC_KEY_LENGTH to 4096 on DBC. Follow the steps on DBC (Agent node or managed node):
ovconfchg -ns sec.cm -set ASYMMETRIC_KEY_LENGTH <RSA Encryption algorithm supported key length>
2. To remove the existing node certificate on the agent, run the following commands:
3. To request a new node certificate from the management server, run the following command:
ovcert -certreq
Cause
This issue may occur if a metric class is missing in the parameter file.
Solution
1. Go to:
On Linux: /var/opt/perf/parm
On Windows: %OvDataDir%\parm.mwc
2. Look for the following line in the parameter file: log global application process device=disk, cpu, filesystem transaction
If a metric class is missing then add it.
3. Run the command: ovc -restart oacore
javax.net.ssl.SSLProtocolException: The certificate chain length (11) exceeds the maximum allowed length (10)
Cause
This issue is because the connection to the OPTIC Data Lake Message Bus proxy fails and the topology data push doesn't
work.
Solution
Follow these steps to resolve this issue:
1. The error indicates that the certificate chain length is greater than the configured maximum. Run the following command to
update the certificate chain length:
helm upgrade <release name> -f <values YAML filename> -n <application namespace> <chart location> --set itomdipulsar.proxy.configData.PULSAR_MEM="-Xms2g -Xmx2g -XX:MaxDirectMemorySize=1g -Djdk.tls.maxCertificateChainLength=15"
2. Run the following commands to verify the certificate chain length settings:
kubectl get pods -n <application namespace>
kubectl exec -it itomdipulsar-proxy-<pod value> -n <application namespace> -c itomdipulsar-proxy /bin/bash
ps -ef |grep -i java
The Java process JVM arguments display the parameter updated in step 1.
Problem
Agent Metric Collector (AMC) isn't collecting metrics from new nodes.
Cause
The new nodes aren't available in a collector configuration or node filter file.
Solution
When your configuration includes the node filter files, those filter files should contain newly added node names. To ensure
that the node filter files contain the new node names, do the following:
Run the AMC benchmark tool on all nodes to get an updated recommendation and node filter list. For more information,
see Use the amc-benchmark-tool.
Update the node filter file with the newly added nodes.
Enable the delta detection capability to get the list of new nodes for which metrics aren't collected. For more
information, see Manage new nodes.
Make sure the resource allocation is as per Sizing Calculator. For more information on Sizing Calculator, see: Sizing the
deployment.
Solution
We recommend to increase the disk I/O to minimum 100-150 MBs.
Solution
To solve this problem, make sure that the / and /var directories have at least 5 GB free disk space.
Solution
To solve this problem, try these solutions:
Solution
Edit the kube-registry-proxy.yaml file by adding the following parameters:
name: DOCKER_FIX
value: "dockerfix"
Solution
1. Change the subnet mask to 255.255.255.0.
2. Configure the parameter FLANNEL_BACKEND_TYPE as follows:
FLANNEL_BACKEND_TYPE = vxlan
Example: The following command lists all the pods that are part of collection-services capability.
kubectl get pods --selector="itom.microfocus.com/capability"="collection-services" -n opsb-helm
The available key-value pairs are:
app:
app.kubernetes.io/managed-by
app.kubernetes.io/name
app.kubernetes.io/version
itom.microfocus.com/capability
itom.microfocus.com/description
You can use this command to view pods which are not running:
kubectl get pods --all-namespaces -o wide | awk -F " *|/" '($3!=$4 || $5!="Running") && $5!="Completed" {print $0}'
You can use this command to watch the pods' status changes:
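One way to watch pod status changes is with the kubectl watch flag (a sketch; the document's exact command is not shown here):
kubectl get pods --all-namespaces -o wide -w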
Pod Description
bvd-controller-deployment: Performs database initialization.
Event reports:
RUM_AllApps_Dashboard
RUM_PerApp_Dashboard
Pod Description
itom-analytics-datasource-registry: REST service that manages information that is required to integrate EA with OBM.
itom-analytics-auto-event-correlation-job: Indicates a CRON job. The CRON job runs every 10 minutes.
Kubernetes keeps up to three associated pods in the active pod list; their status should be either Running or
Completed.
Note
The itom-analytics-datasource-registry pod will switch to Running from Init only after the DI administration services become
available.
In Vertica, the names of the tables associated with Automatic Event Correlation are prefixed with "aiops_". Two of the tables
that are associated with a Data Set created by DI are:
aiops_correlation_event
aiops_correlation_event_rejected
The Automatic Event Correlation capability creates the schema " itom_analytics_provider_default". The following internal tables
for Automatic Event Correlation are created in that schema:
aiops_internal_aec_user_groups
aiops_internal_correlation_graph
aiops_internal_correlation_groups
aiops_internal_correlation_metadata
aiops_internal_correlation_transactions
aiops_internal_topology_metadata
aiops_internal_topological_mappings
Note
You will be able to see the aiops_internal_topological_mappings table created only after the cmdb_entity_* tables are
created by Content Administration Service job.
Cause
This issue relates to the container kubernetes-vault-renew. You see this error message when the vault token has expired.
Solution
You have to generate a new vault token. You can follow either of these steps.
Solution 1
Initialize a new token with the following commands on the master (control plane) node:
cd $CDF_HOME/bin
kube-restart.sh
Solution 2
Delete the pod manually if a ReplicationController or Deployment manages the pod. A new pod is created automatically.
If a ReplicationController or Deployment does not manage the pod, run the following command on the node that runs the pod:
Cause
This issue occurs if you have configured local storage provisioner on the control plane node. When the control plane node is
not configured to share the workload with the worker nodes, you must not add new disks or use the local directories and
configure local storage provisioner on the control plane node.
Solution
Perform the following tasks to remove local storage provisioner on the control plane node:
1. Make a note of the number of replicas for the book-keeper and zoo-keeper pods:
kubectl get statefulset -n <suite_namespace>
Example:
Here, the number n/n under Ready indicates the total number of replicas for that pod. In this example, the bookkeeper
and zookeeper pods have 3 replicas each.
2. Scale down the bookkeeper pods:
kubectl scale -n <suite_namespace> statefulset itomdipulsar-bookkeeper --replicas=0
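An assumption based on the later steps, which also delete zookeeper PVCs: the zookeeper pods would be scaled down the same way, for example:
kubectl scale -n <suite_namespace> statefulset itomdipulsar-zookeeper --replicas=0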
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGEC
LASS REASON AGE
db-single 5Gi RWX Retain Bound core/db-single-vol cdf-default 4
7d
itom-logging 5Gi RWX Retain Bound core/itom-logging-vol cdf-default
47d
itom-vol 5Gi RWX Retain Bound core/itom-vol-claim cdf-default 4
7d
local-pv-1131a5eb 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-journal-itomdipulsar-bookkeeper
-2 fast-disks 47d
local-pv-1529aa1b 154Gi RWO Delete Available fast-disks 47
d
local-pv-282f7bdf 154Gi RWO Delete Available fast-disks 47
d
local-pv-2c1edf48 154Gi RWO Delete Bound opsb/itomdipulsar-zookeeper-zookeeper-data-itomdipulsar-zook
eeper-2 fast-disks 47d
local-pv-30cf49a5 154Gi RWO Delete Available fast-disks 47
d
local-pv-3aeb6ceb 154Gi RWO Delete Available fast-disks 47
d
local-pv-3ea851fe 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-ledgers-itomdipulsar-bookkeeper
-0 fast-disks 47d
local-pv-40b7d363 154Gi RWO Delete Available fast-disks 47
d
local-pv-525ce6a3 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-ledgers-itomdipulsar-bookkeeper
-2 fast-disks 47d
local-pv-56c439d0 154Gi RWO Delete Available fast-disks 47
d
local-pv-57221822 154Gi RWO Delete Available fast-disks 47
d
local-pv-65e8b9f1 154Gi RWO Delete Available fast-disks 47
d
local-pv-704dc194 154Gi RWO Delete Available fast-disks 47
d
local-pv-730252fc 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-journal-itomdipulsar-bookkeeper-
0 fast-disks 47d
local-pv-76fbd452 154Gi RWO Delete Available fast-disks 47
d
local-pv-7aa7b6d9 154Gi RWO Delete Available fast-disks 47
d
local-pv-7b218186 154Gi RWO Delete Available fast-disks 47
d
local-pv-7c22b037 154Gi RWO Delete Available fast-disks 47
d
local-pv-842f2fa0 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-ledgers-itomdipulsar-bookkeeper-
1 fast-disks 47d
local-pv-87baf490 154Gi RWO Delete Bound opsb/itomdipulsar-zookeeper-zookeeper-data-itomdipulsar-zook
eeper-1 fast-disks 47d
local-pv-93b6c6d7 154Gi RWO Delete Available fast-disks 47
d
local-pv-99265f58 154Gi RWO Delete Bound opsb/itomdipulsar-zookeeper-zookeeper-data-itomdipulsar-zook
eeper-0 fast-disks 47d
local-pv-a0e1a70e 154Gi RWO Delete Available fast-disks 47
d
local-pv-a8bb2cfa 154Gi RWO Delete Bound opsb/itomdipulsar-bookkeeper-journal-itomdipulsar-bookkeeper-
1 fast-disks 47d
local-pv-b96141b2 154Gi RWO Delete Available fast-disks 47
d
local-pv-b9b30a27 154Gi RWO Delete Available fast-disks 47
d
local-pv-bb1fb96e 154Gi RWO Delete Available fast-disks 47
d
local-pv-bca7630d 154Gi RWO Delete Available fast-disks 47
d
local-pv-bff6f09 154Gi RWO Delete Available fast-disks 47
d
local-pv-c303c8b6 154Gi RWO Delete Available fast-disks 47
d
local-pv-cf2f5162 154Gi RWO Delete Available fast-disks 47
d
local-pv-e8aa1bc6 154Gi RWO Delete Available fast-disks 47
d
local-pv-f085e3cc 154Gi RWO Delete Available fast-disks 47
d
local-pv-f619faaf 154Gi RWO Delete Available fast-disks 47
d
local-pv-fe791519 154Gi RWO Delete Available fast-disks 47
d
local-pv-ff46c635 154Gi RWO Delete Available fast-disks 47
d
vol1 10Gi RWX Retain Bound opsb/opsb-dbvolumeclaim 4
7d
vol2 10Gi RWX Retain Bound opsb/opsb-configvolumeclaim
47d
5. Delete the PVCs of the zookeeper and bookkeeper pods that are bound to disks mounted on control plane nodes:
kubectl delete pvc <PVC_name_of_the_fastdisks> -n <suite_namespace>
Example:
6. Delete the PVs of the zookeeper and bookkeeper pods that are bound to disks mounted on control plane nodes:
kubectl delete pv <PV_name_of_the_fastdisks>
Example:
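A sketch of steps 5 and 6 using names taken from the kubectl get pv output above (illustrative values only):
kubectl delete pvc itomdipulsar-bookkeeper-journal-itomdipulsar-bookkeeper-2 -n opsb
kubectl delete pv local-pv-1131a5eb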
7. Unmount the disks that were added for the configuration of local storage provisioner on the control plane node:
umount /mnt/disks/<vol_name>
Example:
umount /mnt/disks/vol1
umount /mnt/disks/vol2
umount /mnt/disks/vol3
./find-current-pod-logs.sh
When you run the script, it prompts for the POD number. Enter the POD number from the list for which you want to find
the relevant NFS server and log folder details.
Note
To view the relevant information, you must give the POD number from the list and not the POD name.
For example:
43) itom-omi-aec-integration-watcher-27248110-9l8tn
44) itom-omi-aec-integration-watcher-27248120-bzgvc
45) itom-omi-csr-granter-gc6xi-np5dn
46) itom-omi-di-integration-w0d4g-c5kd6
47) itom-opsb-content-management-job-wihuv-x6gxj
48) itom-opsb-content-manager-7968f45f8c-sbkr2
49) itom-opsb-db-connection-validator-job-7457f
50) itom-opsb-resource-bundle-7c7d559bc5-h7tkp
51) itom-opsbridge-cs-redis-fd8848b6c-xl7nq
52) itom-opsbridge-data-enrichment-service-c545d766-8fjvh
53) itom-reloader-d78b57dc5-s42lz
54) itom-ucmdb-0
55) itom-vault-5977f4b7ff-2q2zs
56) itomdimonitoring-gen-certs-job-shzrt
57) itomdimonitoring-verticapromexporter-58bfcd4bd4-cpjk5
58) itomdipulsar-bookkeeper-0
59) itomdipulsar-bookkeeper-init-vxgzo5r-gbhcj
60) itomdipulsar-broker-68597546cb-b25nn
61) itomdipulsar-minio-connector-post-upgrade-job-ym81yor-npt9w
62) itomdipulsar-proxy-85c5d89594-kktfl
63) itomdipulsar-zookeeper-0
64) itomdipulsar-zookeeper-metadata-lwjdff3-6dffx
65) omi-0
66) omi-artemis-7bdd945f5b-2jzwd
67) opr-event-dataset-updater-lcsb3-pkbjx
68) webtopdf-deployment-978d56fb5-4d85v
Select POD: 6
bvd-redis-757f478468-89c26
Go to the NFS Server: yournfsserver.example.net
Then: cd /var/vols/itom/cdf-log/container/
Note
Instead of redirecting to the NFS server manually you can use the option to mount/unmount the NFS volume and move to the local
folder. You can use the following options with the POD logger script:
-m : (optional) will mount the NFS volume locally and move to that local folder
-u : (optional) will unmount a previously mounted volume
Solution
Make sure the user belongs to the right group, for example, group ID: 1999, user ID: 1999.
Solution
Follow these steps to resolve the issue:
1. Open a JMX Console (there is one provided in <SiteScope root directory>\java\bin\jconsole.exe), and
enter 28006 (the default port) in the Port field.
In the MBeans tab, select com.mercury.sitescope/Integration/Bac/Tools/BacIntegrationToolsJMX.
For objects with duplicate APM IDs, activate fixDuplicateBACConfiguration() .
For objects with APM ID == (-1), activate fixMinusOneBACConfiguration() .
It's also recommended to activate softSync() to send the new configuration to APM.
2. If measurements have the wrong category ID, restart SiteScope.
Cause
Metrics don't reach OPTIC Data Lake.
Solution
Ensure that the SiteScope metric streaming is enabled to OPTIC Data Lake and that SiteScope is integrated with OBM.
Validation
Run the following queries to ensure that the metric streaming is enabled and reaching OPTIC Data Lake:
Example output:
If you are using a different schema name, then run the following query:
Problem
You can't see the SiteScope providers, monitor groups, and monitors under the provider groups in Agentless Monitoring UI
after you onboard SiteScope.
Solution
Go through the following sections to check and rectify the issue.
For example:
3. Run the command to check the provider properties; check that it points to the exact target and provider group names.
Ensure that the names match exactly, as they are case-sensitive.
2. Check the URL of the IDM server in the classic SiteScope master.config file or in infrastructure settings. The URL must
end with "/".
For example: https://<FQDN_of_the_external_access_host>/idm-service/v3.0/tokens/
3. You must import the application certificate to SiteScope from SiteScope UI > Preferences > Certificate
management
2. If SiteScope uses HTTPS, you should import the SiteScope CA certificate to the suite, and vice versa. See Add SiteScope
certificates.
Solution
You must manually delete the CIs from the MCC UI.
Cause
Network or deployment issues. Check the log files in the apm-config-sync-service pod to verify the sync flow.
Solution
Restart the apm-config-sync-service pod.
Cause
The data sync begins after the APM to MCC sync interval of 15 minutes is completed.
Solution
Restart APM sync service pod to start the data sync immediately.
Solution
Restart monitoring-resources pod.
Solution
To update the env name and value of the deployment, perform the following steps:
Cause
This issue occurs when the NFS volumes hosting the omi-0 and omi-1 pods have files from a previous chart installation.
Solution
To resolve this issue, do the following:
Problem
After the successful containerized installation and deployment of OBM in high availability mode, the omi-1 pod does not have
any policies deployed to it.
Solution
Do the following to deploy the policies to omi-1 :
List the deployed policies on omi-0 : kubectl exec omi-0 -n <namespace> -c omi -- /opt/OV/bin/ovpolicy -list.
In the OBM UI, deploy the same policies to omi-1 .
Cause
Due to discovery or integration jobs, marble receives massive topology changes from the UCMDB. By design, marble restarts the dashboard service to reload the model. The outage lasts about 90 seconds after the shutdown notification is received.
Solution
We recommend that you do not run discovery or integration jobs during the hours in which you modify dashboards.
Cause
This issue occurs in classic OBM because the browser cookie for SSO of AI Operations Management overwrites the browser cookie of OBM. To solve this issue, configure AI Operations Management to use lightweight single sign-on (LW-SSO) and not hpsso.
Note
This issue occurs only for classic OBM and not for AI Operations Management.
This issue occurs only when classic OBM and AI Operations Management are in the same domain.
For example, the FQDN of classic OBM has the format [hostname1].[domain].[tld], such as mambo8.mambo.net, and the FQDN of AI Operations Management has the format [hostname2].[domain].[tld], such as omidock.mambo.net.
In this scenario, you can't access classic OBM because AI Operations Management sets an SSO cookie for *.mambo.net, which overwrites the SSO cookie set by classic OBM for mambo8.mambo.net. However, if the FQDN of classic OBM is completely different, you won't encounter the issue.
Solution
Perform the following steps:
4. From the LWSSO section, locate and double-click the Creation Domain Mode.
8. In classic OBM, from your web browser, open the developer console and expand Cookies.
For example, to open the developer console and check cookies in Google Chrome:
a. Right-click the page and click Inspect to open the developer console.
10. Reload the classic OBM page in the same browser and you can now access it.
Solution
If you want to open the CDF management portal and OBM in the same browser, use a private browsing window for one of
them.
After reloading the Event Browser (HTML), Japanese view names are displayed as question marks, and no events are listed.
Solution
This issue occurs if the Microsoft SQL Server database is not installed on a Japanese operating system. To resolve the issue,
install the MS SQL database on a Japanese operating system.
OBM dialog boxes and applets, such as the Authentication Wizard, do not load properly.
Cause
Old Java files on your client computer.
Solution
Clear the Java cache by following this procedure:
1. Open Control Panel > Java.
2. In the Temporary Internet Files section, click Settings.
3. In the Temporary Files Settings dialog box, click Delete Files.
Solution
To reduce the startup time, UCMDB packages are uploaded only once, and the ucmdb_pkg_up_ok.marker file is created after a successful upload. If you reconfigure OBM by using a new set of empty databases, the upload step is skipped because the marker file indicates that the UCMDB packages are already uploaded.
To work around this issue, before running the reconfiguration, make sure you delete the marker file in the following location:
/var/opt/OV/conf/ucmdb_pkg_up_ok.marker
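For example, remove the marker file with the following command (run it on the OBM system, or inside the omi container for a containerized deployment):
rm -f /var/opt/OV/conf/ucmdb_pkg_up_ok.marker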
Cause
This issue is caused by lightweight single sign-on (LW-SSO). You have to set the LW-SSO expiry period value properly.
Solution
Follow these steps:
This error occurs if the default templates for recipients were not loaded when OBM was installed. Do the following to fix this
issue:
3. Click Add, select the BSMAlerts.zip file and click Open. Deploy the package.
You may have to copy the BSMAlerts.zip file from the OBM system to your local system.
Recipients that were created before BSMAlerts.zip was redeployed have no valid notification template and must be deleted and added again.
Solution
Make sure that the OBM gateway server is able to access the Default Virtual Server for Application Users URL. This URL is available in Infrastructure Settings (go to Administration > Setup and Maintenance > Infrastructure Settings).
If you are using a reverse proxy or load balancer, make sure you log in through the Default Virtual Server for Application
Users URL.
Solution
Go to Administration > Setup and Maintenance > Infrastructure Settings . Check and update the limit of the Max
Waiting Queue Size parameter.
The default value is 5000. The valid range is 100 to 20000. If you are experiencing this problem, lower the incoming event
rate or increase the Max Waiting Queue size limit. If the limit is increased, you should also monitor the memory consumption
and, if necessary, increase the memory setting (parameter -Xmx) for the opr-backend process.
Cause
When you try to run a query on large CI models, the business_impact_new service is busy for a few minutes. During this
time, the RTSM gateways get locked.
Solution
1. Go to Administration > RTSM Administration > Modeling > Modeling Studio.
2. Edit user preferences in Modeling Studio:
1. Go to Tools > User Preferences > General.
2. Set Show Hidden queries in the Modelling Studio to True and click OK.
3. Log out of OBM and log in again.
4. Depending on your environment, do one of the following to edit the business_impact_new query:
Delete the SLA branch from the query (including C2 CI) if you do not have SLA CIs.
Limit the depth to 4 (or smaller if required).
Limit the number of visited objects to 400000000 or smaller by editing the tql.compound.link.max.visited.objects
file.
Cause
This issue occurs if a new database is created on the data processing server while the uimashup files are on the gateway server.
Solution
If the database was recreated during the reconfiguration, run the following commands to copy the uimashup files to the
correct location:
Windows:
Linux:
1. On the source gateway system, open the JMX console: http(s)://localhost:29000 . Log in to the JMX console using the
appropriate credentials.
2. Invoke Foundations:service=UIMDataLoader .
1. The path to the directory where OBM should save the configuration files for the exported data.
2. customerID = 1
You can also export a specific type of content, rather than all content, by using the exportEventsMetaData method
for events, the exportComponentsMetaData method for components, or the exportPagesData method for pages.
4. Go to the directory you specified in the previous step and find the following files:
EventsMetaData_<date>_<timestamp>.uim.xml
ComponentsMetaData_<date>_<timestamp>.uim.xml
PagesData_<date>_<timestamp>.uim.xml
5. Copy these files to the target system and save them under <OMi_Home>/conf/uimashup/import/toload in the corresponding
folder: Events, Components, or Pages.
6. On the target gateway system, open the JMX console: http(s)://localhost:29000 . Log in to the JMX console using the
appropriate credentials.
7. Invoke Foundations:service=UIMDataLoader .
If you only exported a specific type of content, use loadEventsMetaData , loadComponentsMetaData , or loadPagesData .
9. Log in to OBM and go to the My Workspace area. All content exported from the source system should now be available
in the target system. If everything was imported correctly, the files were moved from the toload to the loaded folder and
there is nothing in the errors folder.
Cause
Once you have defined the domain of a Probe, you can change its range, but not the domain.
Solution
Install the Probe again:
1. If you are going to use the same ranges for the Probe in the new domain, export the ranges before removing the Probe.
2. Remove the existing Probe from RTSM. For more information on removing Probe, see the Remove Domain or Probe
button in Data Flow Probe Setup Window topic.
3. Install the Probe. For more information on installing the Probe, see the section about installing the Data Flow Probe in
the UCMDB Help.
4. During installation, give the new Probe a different name or delete the reference to Probe from the original domain.
Related topics
For more information on removing Probe, see the Remove Domain or Probe button on Data Flow Probe Setup
Window page.
For more information on installing the Probe, see Install the Data Flow Probe.
Cause
We do not know the cause for this issue.
Solution
Check the following on the Probe machine:
Cause
The cause for this issue is unknown.
Solution
Ensure that none of the Probe ports are in use by another process.
Cause
The cause for this issue is unknown.
Solution
Install a license for the Probe. For more information, see "Licensing Models for Run-time Service Model".
Cause
We do not know the cause for this issue.
Solution
Add the host machine name to the Windows HOSTS file on the RTSM Data Flow Probe machine.
Cause
We do not know the cause for this issue.
Solution
To delete all files, restart the machine on which the Data Flow Probe is installed.
Cause
We do not know the cause for this issue.
Solution
The Probe's CUP version must match the RTSM Server's CUP version. If the CUP versions do not align, you must update the
Probe's CUP version. In some cases, you may have to deploy the CUP manually on a Probe.
Cause
The Data Flow Probe Setup module displays Data Flow Probes for discovery. Integration Probes (that is, Probes on Linux machines and Windows Probes configured for integration) are not displayed in the Data Flow Probe Setup module.
Solution
To verify the connection of an integration Probe, create a dummy integration point and check that the Probe is listed among the Probes that you can select for the integration point (in the Data Flow Probe field).
Solution
The table below lists the Data Flow Probe database scripts. You can modify these scripts for administration purposes, both in
Windows and Linux environments.
The Data Flow Probe machine hosts these scripts in the following location:
Windows: C:\hp\UCMDB\DataFlowProbe\tools\dbscripts
Linux: /opt/hp/UCMDB/DataFlowProbe/tools/dbscripts
Change these scripts only for specific administration purposes.
Script: importPostgresql [Export file name] [PostgreSQL root account password]
Description: Imports data from a file created by the exportPostgresql script into the DataFlowProbe schema.
Cause 1
The hosts file must not contain "localhost".
Solution 1
On the Data Flow Probe machine, open the hosts file:
Windows: %systemroot%\system32\drivers\etc\hosts
Linux: /etc/hosts
Ensure that you comment out all lines containing "localhost".
Cause 2
You installed the Microsoft Visual C++ 2010 x64 Redistributable during the installation of the Probe. If for some reason you uninstall this redistributable, PostgreSQL stops working.
Solution 2
Check whether the Microsoft Visual C++ 2010 x64 Redistributable is installed. If not, reinstall it.
Solution
After enabling the infrastructure setting Enable forwarding Downtime/Service Health Data to OPTIC DL, log in to the OBM pod and restart the oprAS process:
kubectl exec -ti -n $(kubectl get pods -A | awk '/omi-0/ {print $1,$2}') -c omi -- bash
<OBM_HOME>/opr/support/opr-support-utils.sh -restart oprAS
Problem
When you open the graph_type Excel file, a file format and extension pop-up appears.
Solution
Click Yes in the pop-up to see the data stored in the graph_type Excel file.
Problem
After you install or upgrade OBM, you see errors in the opr-configserver log when you upload a content pack, including the error message "Instrumentation was not added as binary part is empty".
Solution
You can ignore these errors if, when you launch the Content Packs UI, you find the instrumentation in the respective Content Packs.
Problem
If the installation of the AI Operations Management certificate on a Windows OBM system takes too long, the certificate
installation process isn’t terminated correctly and the OBM Configurator tool is stuck.
Solution
Perform the following steps:
2. Run the OBM Configurator tool as before, specifying all parameters plus the extra parameter --force.
The following table lists common shortcuts of HTML5 OBM UIs. Each shortcut's functionality is specific to the listed UI context.
Not all shortcuts might apply to all HTML5 UIs.
Enter (Controls): Select or activate a UI control that is focused. For example, you can open the Edit Key Performance Indicator panel if the focus is on the Edit button.
Up Arrow or Down Arrow (Radio buttons): Switch the focus from one radio button to another.
Arrow keys: Depending on the context, the arrow keys behave differently. In general, use the arrow keys to navigate. For example, when using the Top View component, use the keys to move from one CI to another.
Shift+Home or Shift+End (Text fields): Select the complete text.
Cause
This is because PD maintains a cache for each GW server. After dashboards or favorites are added or removed on one GW server, the caches on the other GW servers are not updated.
Solution
To resolve this problem, perform the following steps on the Gateway Server:
Problem
PD queries data using the time zone of the OBM system. If the agent system uses a different time zone that is behind OBM's time zone, PD does not find any entry point to forward data to BVD.
Solution
Make sure that the agent and the OBM use the same time zone.
Problem
For the Chinese locale, sometimes not all strings are localized. As a result, you will not see the footer inside the variable picker from where you can open the Policy Parameters page.
Solution
To resolve this issue, do one of the following:
Refresh the page and check if you can see the Chinese messages and the footer to open the Policy Parameters page.
Switch to another locale to manage policy parameters.
Cause
Conditional dashboard assignment fails if a static CI Property is selected while creating a conditional dashboard.
Solution
Do not select a static CI Property (for example, "Actual Deletion Period", "Deletion Candidate Period") while creating
conditional dashboards.
Symptom
When you connect OMW systems to OBM, the connection fails with the following error message:
Operation failed. HTTP Status: 500 (Internal Server Error). Internal server error. Details: Connection reset.
Solution
Perform the following steps on the OMW system:
Including another directory on top of the package directories in the ZIP file.
Solution: ZIP the package in the same directory as the package directories such as discoveryResources,
adapterCode, etc. Do not include another directory level on top of this in the ZIP file.
Solution: Do not change your naming convention in mid-stream once you begin the re-naming procedure. If you realize
that you need to change the name, start over completely rather than trying to retroactively correct the name, as there
is a high risk of error. Also, use search and replace rather than manually replacing strings to reduce the risk of errors.
Deploying adapters with the same file names as other adapters, especially in the discoveryResources and
adapterCode directories.
Solution: You may be using a UCMDB version with a known issue that prevents mapping files from having the same
name as any other adapter in the same UCMDB environment. If you attempt to deploy a package with duplicate names,
the package deployment will fail. This problem may occur even if these files are in different directories. Further, this
problem can occur regardless of whether the duplicates are within the package or with other previously deployed
packages.
1.13.28. Logs
This section provides information about OBM logs.
Log file properties are defined in files in the following directory and its subdirectories: <OBM_HOME>/conf/core/Tools/log4j .
In addition to application log files, there are web server log files. The web server log files are in the <OBM_HOME>/WebServer/lo
gs directory.
When comparing logs on client machines with those on the OBM server machines, keep in mind that the date and time
recorded in a log are recorded from the machine on which the log was produced. It follows that if there is a time difference
between the server and client machines, the same event is recorded by each machine with a different time stamp.
Typical log levels are listed below from narrowest to widest scope:
Error. The log records only events that adversely affect the immediate functioning of OBM. When a malfunction occurs,
you can check if Error messages were logged and inspect their content to trace the source of the failure.
Warning. The log's scope includes, in addition to Error-level events, problems for which OBM is currently able to
compensate and incidents that should be noted to prevent possible future malfunctions.
Info. The log records all activity. Most of the information is routine and the log file quickly fills up.
Debug. This level is used by Software Support when troubleshooting problems.
The default severity threshold level for log files differs per log but is generally set to either Warning or Error.
The names of the different log levels may vary slightly on other servers and for various procedures. For example, Info may
be referred to as Always logged or Flow.
If required, you can change the log level in the respective properties file in the log directory: <OBM_HOME>/conf/core/Tools/log4
j.
For many logs, you can configure the number of archived log files that are saved. When a file reaches its size limit, it's
renamed with the numbered extension 1 (log.1). If there is currently an archived log with the extension 1 (log.1), it is
renamed to log.2, log.2 becomes log.3, and so on, until the oldest archived log file (with the number corresponding to the
maximum number of files to be saved) is permanently deleted.
The maximum file size and the number of archived log files are defined in the log properties files located in <OBM_HOME>/con
f/core/Tools/log4j .
property.<MODULE>_fileMaxSize = 2000KB
property.<MODULE>_backupCount = 10
For example,
property.opr-backend_fileMaxSize = 2000KB
property.opr-backend_backupCount = 10
*.hprof Files
*.hprof files contain a dump heap of an OBM process's data structures. These files are generated by the Java virtual machine
(JVM) if a process fails with a Java Out Of Heap Memory condition.
You are rarely aware of a problem because the problematic process restarts automatically after a failure. The existence of many *.hprof files indicates that there may be a problem in one of the OBM components, and their contents should be analyzed to determine the problem.
If you run out of disk space, you can delete the *.hprof files.
To enable event flow logging for all events, set the infrastructure setting Event Flow Logging Mode to file.
You can enable trace logging on the OM server or agent sending the event, or you can add the trace to the event at a later time. Whenever
this custom attribute is enabled on an event, trace output for this event appears in the following flow trace logs:
Tasks
1. Stop OBM.
2. Delete all files under <OBM_HOME>/log . Don't delete the log directory.
3. Delete all .hprof files under /var/opt/OV/log/ (Linux) or %OvDataDir%\log (Windows).
4. Delete all files under <OBM_HOME>/WebServer/logs . Don't delete the logs directory.
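A minimal shell sketch of steps 2 to 4 on Linux, assuming the <OBM_HOME> placeholder and the default agent log path, to be run only after OBM is stopped (step 1):
rm -rf <OBM_HOME>/log/*
find /var/opt/OV/log -name "*.hprof" -delete
rm -rf <OBM_HOME>/WebServer/logs/*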
If the log directories fill up quickly, it's possible that you still have the loglevel set to DEBUG from troubleshooting an issue.
Change it back to its original value (usually INFO ) if you're done with troubleshooting.
If you have increased the log file size or number of files (for example, when you enabled DEBUG logging), change those back
to their original value (otherwise, more space may be consumed).
1. Open the log properties file in a text editor. Log file properties are defined in files in the following directory: <OBM_HOME
>/conf/core/Tools/log4j .
2. Locate the loglevel parameter and set it to the required level. For example:
loglevel=ERROR
loglevel=DEBUG
The following table lists common shortcuts of HTML5 OBM UIs. Each shortcut's functionality is specific to the listed UI context.
Not all shortcuts might apply to all HTML5 UIs.
Enter (Controls): Select or activate a UI control that is focused. For example, you can open the Edit Key Performance Indicator panel if the focus is on the Edit button.
Up Arrow or Down Arrow (Radio buttons): Switch the focus from one radio button to another.
Arrow keys: Depending on the context, the arrow keys behave differently. In general, use the arrow keys to navigate. For example, when using the Top View component, use the keys to move from one CI to another.
Shift+Home or Shift+End (Text fields): Select the complete text.
Problem
With the latest OBM versions, in an IdM enabled environment, creating a role in OBM results in two corresponding roles in
IdM: one for the OBM application and another for the CMDB application. In older OBM versions, the system maps only one IdM
role to each OBM role for the OBM application.
During an upgrade from version 2022.05 to a newer OBM version, such as 24.2 or 24.4, the second IdM role that should be mapped to the OBM role does not show up.
Solution
To ensure that existing roles are properly reflected in IdM after the upgrade, either re-create the role(s) or modify a role in
OBM (for example, by changing the description), which will sync it to IdM as two roles. Next, update your existing groups in
IdM to assign both the OBM and CMDB roles.
Solution
1. Get the API Key from the BVD admin UI (referred to below as <apikey> ).
2. In the OBM UI, go to Administration > Setup and Maintenance > Connected Servers, and click Business Value
Dashboard in the left pane.
3. Click New.
4. Enter the required details.
For example, in an AI Operations Management deployment with OBM capability, enter the details as follows:
Display Label: Local BVD deployment
Identifier: Local_BVD_Deployment
Endpoint URL: https://bvd-receiver:4000/bvd-receiver/api/submit/<apikey>
5. Click Create.
Related topics
For more information on creating Connected Servers, see Connected Servers.
1. Ensure that the OPR-SO-FORWARDER service is running on your Operations Bridge Manager (OBM). Run the command: <OBM_HOME>/tools/bsmstatus/bsmstatus. If you don't see the OPR-SO-FORWARDER service listed in the status, then:
1. Stop OBM. Run the command:
For Linux: sh /opt/HP/BSM/scripts/run_hpbsm stop .
For Windows: <Topaz_Home>\bin\SupervisorStop.bat (for nanny).
<Topaz_Home>\bin\SupervisorStopAll.bat (for UCMDB and nanny).
2. Re-run the config wizard. Run the command:
For Linux: sh /opt/HP/BSM/bin/config-server-wizard.sh .
For Windows: <Topaz_Home>\bin\config-server-wizard.bat .
3. While running the config-server-wizard.sh script, in the Server deployment page, select the Forwarding service to
OPTIC DL enabled checkbox.
4. Start OBM. Run the command:
For Linux: sh /opt/HP/BSM/scripts/run_hpbsm start .
For Windows: <Topaz_Home>\bin\SupervisorStart.bat .
You can view the OPR-SO-FORWARDER service running.
2. Ensure that the forwarding infrastructure setting Enable forwarding Downtime/Service Health data to OPTIC DL is enabled.
3. Ensure that the OpsB_serviceHealth content is installed in the Operations Bridge Suite setup, and verify that the opr_hi_* and opr_kpi_* tables are present in the Vertica DB.
Run the command on the master node to list the contents: ops-content-ctl list content. For more information,
see Command line interface.
Note
It is possible that no new service health records are generated in OBM, which is why you don't see any records in the DB.
To confirm, you can enable DEBUG in <OBM_HOME>/conf/core/Tools/log4j/opr-so-forwarder/opr-so-forwarder.properties. You should then see the following log lines every 5 minutes in the log file (<OBM_HOME>/log/opr-so-forwarder/opr-so-forwarder.log): Got 0 hi entries from DB / Got 0 kpi entries from DB.
Problem
The Operations Agent Health dashboard isn't displayed when selecting an Operations Agent CI in the Monitoring Health tab.
Solution
1. Go to Administration > Operations Console > Performance Dashboard Mappings .
2. Select the Operations Agent entry in the CI type tree on the left.
3. Move the Operations Agent Health dashboard to the first position in the list on the right.
Problem
PD graphing from OPTIC DL fails because content packs containing PD artifacts (MetricConfig) for graphing metrics from OPTIC DL fail to import when OBM is upgraded from version 2020.10 or older.
Solution
Run the PD_Identity_Issue_Workaround.sql file in the event database and then run the command for uploading the content
pack: <OBM_HOME>/bin/opr-content-auto-upload.<sh/bat> -a -forceReload -uploadFolder <OBM_HOME>/conf/opr/content/en_US
Problem
OMi Server self-monitoring content pack shows errors and unresolved content after upgrading OBM from versions older than
2022.05.
Solution
This issue doesn't affect functionality. To resolve this error, run the following commands on the DPS to import the content pack again.
For Windows
1. %topaz_home%/bin/opr-content-manager.bat -i %topaz_home%\conf\opr\content\en_US\OMi_Self_Monitoring.zip -user <Login name of the user required for authentication> -pw <Password for the specified user>
2. %topaz_home%/opr/bin/opr-assign.bat -user <Login name of the user required for authentication> -pw <Password for the specified user> -delete_auto_assignment -id d474d89b-26c3-46e5-b0f8-4772d63c5ea5
For Linux
1. /opt/HP/BSM/bin/opr-content-manager.sh -i /opt/HP/BSM/conf/opr/content/en_US/OMi_Self_Monitoring.zip -user <Login name of the user required for authentication> -pw <Password for the specified user>
2. /opt/HP/BSM/opr/bin/opr-assign.sh -user <Login name of the user required for authentication> -pw <Password for the specified user> -delete_auto_assignment -id d474d89b-26c3-46e5-b0f8-4772d63c5ea5
/opt/HP/BSM/opr/bin/opr-assign.sh -list_auto_assignment_by_view -view_name "OMi Deployment" -user <Login name of the user required for authentication> -pw <Password for the specified user>
This issue is known to occur on Azure with Azure PostgreSQL Flexible Server database.
Cause
This issue occurs if the CRL is very large, for example, more than 35 MB.
Solution
To fix this issue, do the following:
1. Go to the directory where you have the values.yaml file for AI Operations Management deployment.
2. In the OBM Settings section, look for the obm.deployment.database.postgresCrlCheckEnabled parameter.
3. Set the value of this parameter to false to disable CRL checks while connecting to the database. Before you disable the check, make sure you have read and agreed to the security implications.
4. Save the file.
5. Run the following command to upgrade the deployment for the changes done:
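The upgrade command itself is not shown in this excerpt; a typical invocation for a Helm-based deployment, using placeholder release and chart names, looks like this:
helm upgrade <release-name> <chart> -n <namespace> -f values.yaml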
For example:
Cause
The new versions of Operations Cloud content packs aren't properly recognized and thus not uploaded.
Solution
Perform these steps to resolve this issue:
Note
1. Run the following command to touch all content files:
kubectl -n <application namespace> exec -ti $(kubectl -n <application namespace> get pods -l=app.kubernetes.io/name=content-service|tail -1|awk '{print $1}') -c content-service -- bash -c 'touch /tmp/uif-content/*'
2. Run the following command to touch a specific configuration file by name (for example, obmContentPack.json, aecContentPack-1.4.2.json, …):
kubectl exec -it $(kubectl -n <application namespace> get pods -l=app.kubernetes.io/name=content-service|tail -1|awk '{print $1}') -c content-service -n <application namespace> -- bash -c "touch /tmp/uif-content/<Name of the Json file>"
If the issue persists even after running either of the above two commands, restart the pod. To restart the pod, do the following:
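The restart steps are not shown in this excerpt; one way to restart the pod, reusing the label selector from the commands above, is to delete it so that Kubernetes recreates it:
kubectl -n <application namespace> delete pod $(kubectl -n <application namespace> get pods -l=app.kubernetes.io/name=content-service | tail -1 | awk '{print $1}')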
Cause
The issue occurs when the randomly generated password begins with the characters '--'. The password is then treated as another argument instead of a value, causing the pod startup to fail.
Solution
1. Run the following command to log in to any container that's in a running state:
After updating the secret, at the next pod restart attempt, the redis pod that is in CrashLoopBackOff state must come into the running state.
To do this:
1. In the side navigation panel, click Administration > Setup & Configuration > Settings.
2. Under User settings, select the localized language for the dashboard.
If you don't, the PDF is generated in English instead of the desired localized language.
Cause
This issue occurs in the Firefox browser.
Solution 1
Use the Chrome or Edge browser instead.
Solution 2
1. Make sure you have two Stakeholder dashboards.
2. After logging in, navigate to a different dashboard first and then to the one you want to see.
Cause
To identify a known server, the CLIs store the server certificates in a subdirectory of the user's home directory. If the server certificate changes after renewal, the CLIs can't identify the certificate and display the "unable to get local issuer certificate" error message.
Solution
1. Go to your home directory.
2. For bvd-cli, you'll find a bvdCliCert directory.
3. For PDF print, you'll find a Web2PdfCert directory.
4. From that directory, delete the file that has the same name as the server that you're trying to use with the CLI.
Cause
The issue occurs whenever you edit the data table with a row selected and the attached predefined query has a dimension.
Solution
Change the time range in the time selector or remove the context that you applied for the dimension.
Cause
When you don't specify a file name while requesting a PDF of a UIF page, WebToPDF fails to save the file.
Solution
Specify the file name in the CLI command using the --out option.
Example:
[pdf-print|pdf-print.exe] --suite_url <suite url> --url <URL of the page> --user <webpage username> --pass <webpage password> --out <output file location>
2022-11-10T07:44:05.731Z bvd:error:db Upgrade to DB schema version 20170907110000 failed. Error: ERROR: Migration table is already
locked
2022-11-10T07:44:05.732Z bvd:error:init Error during startup of init: ERROR: Migration table is already locked
2022-11-10T07:44:05.796Z bvd:error:init Init process is aborting now
Cause
The databasecreation container, which is responsible for creating/updating the database tables, got interrupted before it
could finish. This will happen only after an installation or upgrade.
Solution
If you are sure migrations aren't running, you can release the lock manually by running the command: knex migrate:unlock
For installation:
1. Scale bvd-controller-deployment to 0.
2. Delete all tables from the BVD database.
3. Scale bvd-controller-deployment to 1.
For upgrade:
1. Scale bvd-controller-deployment to 0.
2. Restore the BVD database from the backup taken before the upgrade.
3. Scale bvd-controller-deployment to 1.
This should trigger the update of the BVD database tables to the current version.
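A minimal sketch of the scaling steps, assuming the deployment name bvd-controller-deployment and your namespace:
kubectl scale deployment bvd-controller-deployment -n <namespace> --replicas=0
# delete the BVD tables (installation) or restore the database (upgrade), then scale back up
kubectl scale deployment bvd-controller-deployment -n <namespace> --replicas=1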
Cause
The log shows that the suite admin (trustee) tries to impersonate the admin (trust), but there is no trust relationship between the groups to which these two users belong.
For example:
trust trustee
PreSales superIDMAdmins
Administrators superIDMAdmins
superIDMAdmins superIDMAdmins
Solution
Perform any of the following to resolve the issue:
Create another trust between one of the groups of admin (Administrators, SuiteAdministrators, admin) and the group of suiteadmin ("suitegroup").
bvd:error:bvd-ap-controller Error getting token from IDM ERROR: getaddrinfo ENOTFOUND <external suite hostname>
bvd:error:bvd-ap-controller Aborting steps to get the feature details
Cause
AP-Bridge accesses IdM using an external hostname instead of an internal name. The external suite hostname isn't DNS
resolvable from within the Kubernetes cluster.
Solution
Make sure that AP-Bridge uses the internal IdM hostname and that it's resolvable.
Cause
If the server receives more than one request at the same time, the following error appears:
Solution
Wait until the first CLI request is complete before sending another Web to PDF CLI request.
Cause
If you enter a wrong proxy configuration in the Web to PDF CLI, the current request and subsequent requests with the wrong proxy fail.
Solution
Make sure to specify a correct proxy configuration in the Web to PDF CLI.
{"error":true,"message":"Forbidden"}
bvd:error:idmService Failed to get IDM groups of user admin. Status Code:403. Error Detail:Access is denied
bvd:passportIdMStrategy Error verifying authentication Error: Status Code:403. Error Detail:Access is denied
Cause
The integration user credentials have been changed and the user isn't added to any group, so it has no permission in the system to validate access to the BVD console, which results in a 403 Forbidden error.
Solution
Update the integration_admin settings in the database and restart the bvd-www-deployment pod. You will then be able to log in to the BVD console successfully.
Cause
If a scheduled export fails (BVD or SMTP server not reachable, wrong SMTP configuration, and so on), the server retries the schedule five times before giving up. When the server gives up, it deletes the schedule without notice.
Solution
If you're missing the export result and the schedule is gone, ask your administrator to inspect the WebToPDF-deployment logs. If necessary, fix the causes together with your admin and create the schedule again.
Solution
Set the current child report title for the CSV zip file.
Cause
The URL contains &page=1 . Hence, the complete report isn't loaded.
Solution
Remove &page=1 from the URL and reload the page. This will enable the scroll bar of the widget group, allowing you to see
all data by scrolling through it.
Cause
The browser and the server time are not in sync. If the server time is more than 10 seconds ahead of the browser time, the popup cannot be displayed.
Solution
Synchronize the server and browser time for the notifications to pop up.
The number of bytes received from Vertica exceeded the configured maximum. Terminated connection. Received bytes: 5247312; Allowed
maximum: 5242880
Cause
The number of bytes received from Vertica exceeded the following configured limit:
Solution
To fix the issue, apply any one of the following solutions:
Solution 1: Reduce the response size by changing the query so that it returns less data. For example, remove unused data fields or limit the number of rows returned.
Solution 2: Increase the maximum response limit through the QUERY_RESPONSE_LIMIT environment variable.
- name: DEBUG
value: bvd:error*,bvd:audit*
- name: QUERY_RESPONSE_LIMIT
value: 100000
4. Replace 100000 with the required query size in bytes. Make sure to retain the indentation.
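Alternatively, instead of editing the deployment YAML, you can set the variable with kubectl set env. This sketch assumes the variable belongs on the bvd-quexserv deployment, which is not stated explicitly in this excerpt:
kubectl set env deployment/bvd-quexserv -n <namespace> QUERY_RESPONSE_LIMIT=100000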
However, entering the same password (used in the suite configuration) in the BVD Vertica connection settings slide in works
without any issues.
Cause
BVD falsely assumes that the password provided is base64 encoded if the password contains the characters a-z, A-Z, 0-9, /
and +, and the length of the password is a multiple of 4.
Solution
Follow these steps to fix this issue:
1. Change the Vertica password to include another character besides a-z, A-Z, 0-9, / and +, or use a password with a
length that isn't a multiple of 4.
2. Reconfigure the suite to use the changed password.
Issue
The two RUM dashboards, RUM_PerApp and RUM_AllApps, don't show any data.
Solution
1. Go to Rum_AllApps_Dashboard
2. Click Settings and select Data Collectors.
3. In the Data Collectors, choose and delete these three queries:
Application #1 (app01)
Application #2 (app02)
Application #3 (app03)
2. In the Query field, add the value: select distinct(application_name) as app_name from opsb_rum_page
3. Click RUN.
4. Scroll down, in both the Value column and Label column drop-down choose app_name.
5. Click and select Create Parameter Query from the menu to create the second query.
1. Add values
2. In the Query field, add the value: select distinct(application_name) as app_name from opsb_rum_page
3. Click RUN.
4. Scroll down, in both the Value column and Label column drop-down choose app_name.
6. Click and select Create Parameter Query from the menu to create the third query.
1. Add values
2. In the Query field, add the value: select distinct(application_name) as app_name from opsb_rum_page
3. Click RUN.
4. Scroll down, in both the Value column and Label column drop-down choose app_name.
After you create the three new parameter queries you will see data in the RUM BVD dashboard.
Issue 1: The format of the next data aging and channel statistics aging time read from the DB isn't valid. After a
restart, the next data aging happens in 10 days and statistics aging in 24 h regardless of what's configured in the UI.
Issue 2: All next schedules will happen in 10/1 day intervals, unless the intervals get changed in the UI again.
Issue 3: In the System settings UI, the Save button doesn't get enabled to save the changes, if only Aging data gets
modified.
Issue 1
After a restart, the next data aging happens in 10 days and statistics aging in 24 h regardless of what's configured in the
UI. After restarting the system, the BVD controller logs the following:
2021-07-21T09:12:13.595Z bvd:error:controller The format of the next data aging time read from the DB is not valid.
Time read: "2021-07-30T10:52:14.308Z"
2021-07-21T09:12:13.602Z bvd:error:controller The format of the next channel statistics aging time read from the DB is
not valid.
Cause
The format of the next data aging and channel statistics aging time read from the DB isn't valid.
Solution
Use a database trigger to strip the quotes before inserting/updating the bvdSettings table.
For Postgres:
For Oracle:
For Postgres:
For Oracle:
Issue 2
All the next schedules will happen in 10/1 day intervals, unless the intervals get changed in the UI again
Cause
BVD controller doesn't load configured aging interval on startup.
Solution
Workaround 1: After bvd-controller-deployment is started or restarted, change the aging settings in the UI and save them. This makes the controller pick up the changed settings. You must repeat this after each start or restart of bvd-controller-deployment.
Workaround 2: Configure the aging time range with environment variables to overwrite the default values. For this, edit the bvd-controller-deployment deployment and add environment variables under the container spec, as sketched below.
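The exact edit command is not shown in this excerpt; a typical way to open the deployment for editing, assuming your application namespace, is:
kubectl edit deployment bvd-controller-deployment -n <namespace>
Then add the following under the container's env section: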
containers:
- args:
- controller
env:
- name: BVD_AGING_CHANNEL_STORAGE_TIME
value: "2"
- name: BVD_AGING_PURGE_OLDER
value: "2"
Note
Keep the same indentation as the other lines after "env:". Replace the value 2 with the actual interval in days that you want to configure. In the UI, BVD_AGING_CHANNEL_STORAGE_TIME is called Data channel statistics and BVD_AGING_PURGE_OLDER is called Data records.
After saving the changes, when the BVD controller gets restarted, the changed settings are applied.
Issue 3
In the System settings dialog box in the UI, the Save button is not enabled if you modify only Aging data.
Solution
Add a space character to the custom CSS field and remove it again. This enables the Save button.
1. Common Name (CN) of Certificate Authority (CA) and server certificate are the same
2. CN of CA and server certificate differ
3. CN of CA is empty
FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory.
Cause
The issue arises if the common name (CN) of the certificate authority (CA) is the same as the CN of the server certificate. This is a third-party issue in NodeJS: https://github.com/nodejs/node/issues/37757
Solution
CN of CA and server certificate differ
Cause
The CA signs the server certificate, but the CA isn't in the trusted list of BVD.
Solution
Add the CA (and if necessary, the whole certificate chain) to BVD during the configuration of the Vertica connection. You can
add multiple certificates to the same file.
CN of CA is empty
bvd-quexserv shows the following error in the log file:
ERROR: Found an empty CN, Please use proper Issuer/Subject CN for the server(vertica.example.com) certificate for
database(vertica).
Cause
Solution
Use a non-empty CN for the CA. Give it an arbitrary value (different from the CN of the server certificate) to avoid this error.
Cause
This error occurs when Operations Cloud tries to make an HTTPS request to the Vertica server with a wrong or faulty self-
signed SSL certificate. Operations Cloud rejects such certificates by default.
Solution
Operations Cloud rejects the Certificate Authority (CA) with an empty Common Name (CN) field regardless of any other
attributes within the certificate. To resolve this issue, assign a valid domain name to the Common Name (CN) field in the SSL
certificate.
Cause
This is a suspected memory leak issue in the bvd-quexserv .
Solution
Run the following command:
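The command itself is not shown in this excerpt; if the intended remediation is to restart the affected deployment, a typical invocation, assuming the deployment name bvd-quexserv and your namespace, would be:
kubectl rollout restart deployment/bvd-quexserv -n <namespace>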
In the Quexserv log, you can find the following error message for the DB connection failure:
Error: The sending password for "vertica", encryption algorithm MD5 does not match the effective server
configured encryption algorithm SHA512
Cause
BVD doesn't support SHA512-encrypted passwords for Vertica connections; it supports only MD5 as the encryption algorithm for authentication with Vertica.
Solution
Run the following query on the Vertica database to alter the security algorithm to MD5 for the username used to connect
BVD to that Vertica:
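The query is not shown in this excerpt; a sketch of the typical Vertica statement, assuming <bvd_user> is the database user BVD uses to connect, would be:
ALTER USER <bvd_user> SECURITY_ALGORITHM 'MD5' IDENTIFIED BY '<new_password>';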
If you changed the password for that user, configure BVD with the new password in the respective environment: if the Vertica connection details were configured in the BVD UI, update the password there; if they were configured during the suite installation, update the configuration there.
Cause
The redis password doesn't match the password used by the BVD pods to access redis.
Solution
Delete the bvd-redis-deployment POD using the following command:
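For example, assuming your namespace, you can delete the pod by matching its name:
kubectl delete pod -n <namespace> $(kubectl get pods -n <namespace> | awk '/bvd-redis-deployment/ {print $1}')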
This will delete the redis POD and will trigger Kubernetes to create a new one. The new redis POD will then use the same
password as the BVD PODs.
Cause
The issue is due to Internet Explorer 11 browser limitation.
Solution
After saving the uploaded dashboard, refresh the Dashboards page, for example by pressing F5.
Cause
Visio supports styles for shapes that the SVG standard doesn't. Applying one of these styles to a shape can lead to missing shapes or empty dashboards after exporting to SVG. This happens, for example, to all shapes with shadows.
Solution
Remove the unsupported style or review the Microsoft Visio documentation for possible workarounds.
Cause
This issue can occur due to one of the following possible causes when sending the data to BVD:
Solution
Certificate Authority (CA)-issued certificates are safer because other entities can verify them.
You can exchange the self-signed certificate in Vertica with a CA certificate as follows:
1. Prerequisite. You must have a CA private key and public certificate ready.
2. Log on to Vertica's Management Console as the administrator.
3. On the home page, click Settings.
4. In the left panel, click SSL certificates.
5. Click Browse to import the new key, and click Apply to apply the change.
6. Restart the Management Console.
For more information and additional details, review the Vertica documentation.
Solution
Log in to the BVD UI and change the password of the database connection in the Predefined query UI.
Check if the uif-contentservice log file displays the parse error as follows:
Cause
This issue occurs because the baseCP.json file isn't entirely copied during the uif-contentservice-deployment pod startup.
Solution
Follow these steps to resolve this issue:
Cause
The default query execution request timeout of 4 minutes isn't applied.
Solution
Run the following commands to set the request timeout (in milliseconds) explicitly:
kubectl set env deployment/bvd-www-deployment -n <namespace> REQ_TIMEOUT=240000
kubectl set env deployment/bvd-quexserv -n <namespace> REQ_TIMEOUT=240000
Cause
BVD treats Vertica timestamp data type values as UTC. If the values aren't in UTC, different time stamps are displayed for the same value in line charts and text widgets.
Solution
Use the to_char Vertica function to convert timestamp data type to a string.
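For example, a query similar to the following (schema, table, and column names are illustrative) returns the timestamp column as a string:
SELECT to_char(collection_time, 'YYYY-MM-DD HH24:MI:SS') AS collection_time FROM <schema>.<table>;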
Cause
If you enter a wrong proxy configuration in the Web to PDF CLI, the current request and the subsequent requests sent to the server with the wrong proxy will fail.
Solution
Make sure to enter a correct proxy configuration in the Web to PDF CLI.
Cause
If the URL or the user credentials of the BVD web page entered in the Web to PDF CLI are wrong, the following error appears:
ERROR: Processing request from server: Request failed with status code 500
Solution
Make sure to enter the correct URL and user credentials of the BVD web page in the Web to PDF CLI.
Cause
If the URL entered in the CLI is wrong, the following error appears:
ERROR: Processing request from server: Request failed with status code 404
Solution
Make sure to enter the correct URL in the Web to PDF CLI.
Cause
If the server receives more than one request at the same time, the following error appears:
Solution
Wait until the first CLI request is complete before sending another Web to PDF CLI request.
(node:9837) Warning: Accessing non-existent property 'padLevels' of module exports inside circular dependency
Cause
This is a third-party issue with Node.js packaging.
Solution
You can ignore the warning.
Failed to get the idm token Error: Request failed with status code 502
Cause
If the user doesn't specify a port in the URL, the Web to PDF service doesn't use the default port 443 to generate the PDF.
Solution
You need to specify the default port number 443 in the command.
Example:
Cause
This issue is due to the slow response time of the server.
Solution
Use the --SaveOnServer option and set it to "True". This parameter saves the PDF on the server, in the default location of the suite container. You can find the output PDF file under the reports directory, for example /var/vols/itom/cdf/vol3/ips/reports.
bvd:error:schedule Failed to send the mail to the user. Error: "ERROR: Mail command failed: 501 Invalid MAIL FROM address provided"
Cause
WebToPDF uses the user name configured to connect to the SMTP server as the sender of the emails. If that user name isn't
an email address, the server throws an error depending on the SMTP server configuration.
Solution
You can try any one of the following:
Configure an email address as the user name used to log in to the SMTP server.
Configure the SMTP server to use the account's email address as "FROM" when sending emails.
The data received from the server does not contain {0} but {1} values
The chart by default displays {0} values. However, the data sent by the server only contains {1} values. Check the data sent
by the server and make sure that it contains all required data fields. Open Administration > Dashboards & Reports >
Stakeholder Dashboards & Reports > Dashboard Management to verify the configured data fields and the data sent by
the server in the data channel selector.
Widgets: Bar Chart, Status Color Group, Donut Chart, Feed, Status Images, Sparkline / Multiple Area Chart, Text Value, Status
Visibility Group
There is no data channel set for this widget. Therefore it won't receive any data to display. Open Administration >
Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management to configure a data channel for
the widget.
There are no 'path' or 'rect' elements for the dashboard item '{0}'
The Status Color Group is coloring lines and areas of the grouped shapes. If these don't contain lines and areas, coloring will
fail with this error. Open the dashboard in Visio and make sure that valid shapes are grouped with this Status Color Group.
The property '{0}' used in the coloring rule is not part of the data. Please check the syntax of the rule.
(Dashboard item '{1}')
Open Administration > Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management and
verify the rule's syntax: the rule should only use data fields that are included in the data sent by the server.
The coloring rule '{0}' is not valid. Please check the syntax of the rule. (Dashboard item '{1}')
Open Administration > Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management and
verify that the rule adheres to the format described in the Coloring Rule section.
The data received from the server is missing either the 'link' or the 'title' field (Feed '{0}')
Widgets: Feed
Make sure that the data sent by the server contains the data fields “link” and “title”.
The data received from the server does not contain a status (Status images '{0}')
The status field configured in Administration > Dashboards & Reports > Stakeholder Dashboards & Reports >
Dashboard Management isn't part of the data sent by the server. Make sure to choose the correct status field or the server
sends the correct data.
The data received from the server does not contain a property called '{0}' (Spark line '{1}')
The data field configured in Administration > Dashboards & Reports > Stakeholder Dashboards & Reports >
Dashboard Management isn't part of the data sent by the server. Make sure to choose the correct data field or the server
sends the correct data.
There is no URL set for this widget. Therefore it won't display any data. Open Administration > Dashboards & Reports >
Stakeholder Dashboards & Reports > Dashboard Management to configure a URL of the dashboard item.
The property '{0}' used in the visibility rule is not part of the data. Please check the syntax of the rule.
(Dashboard item '{1}')
Open Administration > Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management and
verify the rule's syntax: the rule should only use data fields that are included in the data sent by the server.
The visibility rule '{0}' is not valid. Please check the syntax of the rule. (Dashboard item '{1}')
Open Administration > Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management and
verify that the rule adheres to the format described in the Visibility Rule section.
Unable to calculate a color with the data received from the server (Dashboard item '{0}')
The value sent by the server, together with the given coloring rule, didn't result in a color. Open Administration >
Dashboards & Reports > Stakeholder Dashboards & Reports > Dashboard Management to verify the accuracy of your
coloring rule and give a default color as the last entry. For details on defining coloring rules, see the Coloring Rule section.
The widget type you placed in your Widget Group isn't supported. The unsupported widget types are Feed and Web Page widgets.
Edit the dashboard or template in Visio to make sure to group the supported widget type with the widget group. For more
information, see the Group Widgets topic.
Unable to create Text Value widget. There is no valid data channel set for the text value
In the report, a widget that contains hyperlinks uses the Text Value widget. To avoid this message, use the Hyperlink Group widget instead.
How to's
How to check the Vertica tables
How to check if OPTIC DL Message Bus topics and data is created
How to check the OPTIC DL Message Bus pod communication
How to recover OPTIC DL Message Bus from a worker node failure
How to check connectivity between Vertica node and OPTIC DL Message Bus Proxy services
How to verify the OPTIC DL Vertica Plugin version after reinstall
Cause
Metric data doesn't reach the Vertica database if a Pulsar topic isn't created.
Solution
Follow these steps:
1. Run the following query on the Vertica database to identify the missing topics:
select distinct(source_name) from itom_di_scheduler_provider_default.stream_microbatch_history where end_message ilike '%topicnotfound%';
2. Run the following command and note down the bastion pod name:
kubectl get pods -A | grep -i bastion
3. Run the following command to log in to the Pulsar bastion pod:
kubectl -n opsb-helm exec -it <itomdipulsar-bastion-pod-name> -c itomdipulsar-bastion -- bash
4. For each topic from step 1, run the following command to create a Pulsar topic:
bin/pulsar-admin topics create-partitioned-topic -p 3 <topic_name>
For example: bin/pulsar-admin topics create-partitioned-topic -p 3 opsb_agent_cpu
Solution
5. If the flow doesn't continue and the task goes to the “FAILED_NON_RECOVERABLE” state often, perform this step:
Go to <conf-volume> location and perform the following steps:
i. Open the task-executor-logback.xml file.
ii. Change the level from INFO to DEBUG in the following line:
<logger name="com.microfocus" level="INFO" additivity="false">
<appender-ref ref="LogFileAppender"/>
</logger>
iii. Save the file.
iv. Open the log4perl.conf file.
v. Change the level from INFO to DEBUG in the following line:
log4perl.logger.topology = INFO, topologyAppender
vi. Save the file.
vii. Run the curl command from step 3c.
viii. Collect the logs and contact Software Support to resolve the issue.
1. itom-di-postload-taskcontroller
2. itom-di-postload-taskexecutor - This pod receives and processes the jobs from the data processor master pod. It acts as the task manager.
Check if the Vertica database is reachable. To troubleshoot, see Vertica database isn't reachable. Also check the task status; see Postload task flow not running in the Related topics.
Check if the maximum number of Vertica client sessions has been exceeded. Run the query:
SELECT node_name, current_value FROM v_monitor.configuration_parameters WHERE parameter_name = 'MaxClientSessions';
You can increase the maximum sessions based on the memory available in Vertica (see the example after this list).
Check error, tasks, and state in data processor logs:
taskexecutor.log
forecast.log
For more information on log files and the locations, see Find the log files from the Related topics.
If there are no forecast logs, see the troubleshooting scenario No logs for aggregate and forecast from the Related
topics.
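For example, a statement similar to the following raises the MaxClientSessions limit; the value 100 is illustrative and must be sized according to the memory available in Vertica:
SELECT SET_CONFIG_PARAMETER('MaxClientSessions', 100);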
Related topics
To troubleshoot Vertica connection issues, see Failed to connect to host.
To check the Vertica tables, see How to check the Vertica tables.
To troubleshoot Vertica database is reachable, see Vertica database isn't reachable.
To troubleshoot Data is in OPTIC DL Message Bus topic but not present in Vertica tables, see Data is in OPTIC DL
Message Bus topic but not present in Vertica tables.
To check the task status of Vertica database, see Data Processor Postload task flow not running.
To troubleshoot, see Forecast data is not displayed in System Infrastructure Summary or System Resource Details
reports.
For information on data flows and task flows that enable the OPTIC Reporting - System infrastructure reports and event
reports, see Reporting data and task flows.
To troubleshoot Aggregate table has missing or no data, see Aggregate table has missing or no data.
To verify the creation of OPTIC DL Message Bus topics and data, see How to check if OPTIC DL Message Bus topics and
data is created.
To prepare Vertica database, see Prepare Vertica database.
To see the aggregate and availability log files and to set the log file level from INFO to DEBUG , see System
Infrastructure reports are showing no or partial data or updated data is not shown in the reports and System
Infrastructure Availability data is missing in reports.
To troubleshoot Aggregate tables are not updated data in the system infrastructure or event reports are not refreshed,
see Aggregate tables are not updated data in the system infrastructure or event reports are not refreshed.
Cause
This issue is because the OPTIC DL HTTP Receiver isn't responsive. The receiver log file shows the java.lang.OutOfMemoryError:
GC overhead limit exceeded error.
Solution
Perform these steps to resolve the issue:
Cause
This issue is because of any of the following reasons:
The task got stuck in the DISPATCHED, RUNNING, or FINISHED state and you upgraded the suite before the task recovered.
OR
The task-related message was lost in the OPTIC DL Message Bus.
Solution
Perform these steps to resolve this issue:
1. Run the following commands on the master node to scale down the itom-di-postload-taskcontroller and itom-di-postload-task
executor pods:
kubectl scale deployment itom-di-postload-taskcontroller --replicas=0 -n <suite namespace>
kubectl scale deployment itom-di-postload-taskexecutor --replicas=0 -n <suite namespace>
2. Run the following commands to log on to the pulsar bastion pod:
kubectl get pods -n <suite namespace>
Note down the bastion pod name.
kubectl -n <suite namespace> exec -ti <bastion pod-<POD value>> -c pulsar -- bash
3. Run the following commands to delete the topics:
bin/pulsar-admin topics delete-partitioned-topic -f persistent://public/itomdipostload/di_internal_postload_state
bin/pulsar-admin topics delete-partitioned-topic -f persistent://public/itomdipostload/di_postload_task_status_topic
bin/pulsar-admin topics delete-partitioned-topic -f persistent://public/itomdipostload/di_postload_task_topic
4. Run the following commands on the master node to scale up the itom-di-postload-taskcontroller and itom-di-postload-
taskexecutor pods:
kubectl scale deployment itom-di-postload-taskcontroller --replicas=<replica count> -n <suite namespace>
kubectl scale deployment itom-di-postload-taskexecutor --replicas=<replica count> -n <suite namespace>
Possible causes
Cause 1: The required pods aren't running.
Cause 2: The taskflow is in FAILED_NON_RECOVERABLE state.
Cause 3: Errors in the log file.
Cause 4: Vertica connection issue.
Solution
log4perl.logger.topology = INFO, topologyAppender
Cause
This issue may occur if there is a mismatch in the mapping of a topic name to the corresponding table name.
Solution
1. Log on to the Vertica database. Use either the admin tools or any of the DB Visualizer tool of your choice to view the
database tables.
2. Click itom_di_configuration_* > TABLE > MICROBATCH. In the Data tab, check in the CONTENT_JSON row whether the table name and topic name are the same.
For example:
"topic_name": "SCOPE_GLOBAL",
"streaming_table_schema": "<suite>_store",
"streaming_table_name": "SCOPE_GLOBAL"
If there is a mismatch, update them.
Cause
This is because the metadata information stored in OPTIC Data Lake isn't cleaned.
Solution
Perform the following steps to clean up metadata information:
1. Drop the required tables from the database by running the command:
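The original command isn't shown in this extract; a generic Vertica statement of the following form (schema and table names are placeholders) drops a table:
DROP TABLE IF EXISTS <table_schema>.<table_name> CASCADE;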
2. Go to <conf-volume>/di/administration/conf/metadata/<table_schema> directory.
3. Run the following command to delete the required files ending with metadata.json :
rm -f <table_name>_metadata.json
4. Go to <conf-volume>/di/vertica-ingestion/conf/dataset/<table_schema> directory.
5. Run the following command to delete the required files ending with dataset.json :
rm -f <table_name>_dataset.json
rm -f <table_name>_microbatch.json
Cause
Data sent to the OPTIC DL HTTP Receiver doesn't have the required headers. The receiver-itom-di-receiver-dpl-<pod value>.log file displays the message: receiver.topic.from.header value is true but no header provided with header fieldname x-monitoredcollection.
Solution
To resolve this issue, make sure that the data streamed to the OPTIC DL HTTP Receiver contains the required header fields. If
data has required headers, you can check the sent data in the log by changing the log level from INFO to DEBUG as follows:
Cause
If a Postload task flow encounters an error, the task flow execution retries the task for the configured number of times, and
then the task switches to FAILED_NON_RECOVERABLE state. Common causes for these errors are:
Solution
When a task in the data processing flow fails, it means that it has retried for the configured number of times (1440 for bulk load, approximately 1 day), and the task status is then set to FAILED. Perform the following steps:
Cause
This issue is because during the suite upgrade or reconfigure, you've changed the value for the parameter global.di.cloud.exter
nalAccessHost.pulsar in the values.yaml file. Due to this, the data flow from OPTIC DL Message Bus to Vertica database gets
interrupted.
Solution
To resolve this issue, you must restart the Vertica database.
Cause
This issue is due to wrong backlog information retrieved from the OPTIC DL Message Bus, because of which the OPTIC DL Streaming Loader doesn't run micro batches for such topics.
However, the backlog information is observed to get corrected as soon as more than two messages are streamed in for the same topics.
Solution
Perform the following steps to disable the OPTIC DL Streaming Loader backlog checks and resolve this issue:
1. Run the following helm upgrade command to disable both backlog checks:
helm upgrade <release-name> <application-chart> -n <application namespace> -f <values YAML filename> --set itom-di-udx-scheduler.scheduler.configData.scheduler.enableFrameBacklogCheck="false" --set itom-di-udx-scheduler.scheduler.configData.scheduler.enableMicrobatchBacklogCheck="false"
2. Run the following command to verify if enableFrameBacklogCheck and enableMicrobatchBacklogCheck are false:
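The verification command isn't shown in this extract; one way to check the deployed values, assuming the same release name, is:
helm get values <release-name> -n <application namespace> --all | grep -i backlogcheck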
WARNING: Certificate with alias '<alias name> on <host name>' is already installed
Cause
In the case of an OPTIC Data Lake secure connection, if you have removed the earlier certificate and installed a new certificate on the same system, the warning appears. This issue may occur if the certificate is similar to any of the deleted certificates.
Solution
You can ignore this warning.
Todays OPTIC DL Vertica Plugin is older ... not supported ... 291 1
Cause
This issue occurs because you have run the dbinit.sh command as the database administrator user instead of the root user.
Also, the OPTIC DL Vertica Plugin RPM you are installing is older compared to the existing RPM.
Solution
You must make sure to install the RPM as a root user. Perform the following steps:
1. Log on to the Vertica node where you have installed the RPM as the root user.
2. Run the following command and note down the RPM version:
rpm -qa itom-di-pulsarudx
3. If you have run the RPM as a database administrator, you will see the files in the location $HOME/.coso_rpm_installed and
/home/dbadmin/.coso_rpm_installed . Run the following command to check the files and compare the RPM versions:
cat /home/dbadmin/.coso_rpm_installed
The RPM version will appear: itom-di-pulsarudx-<RPM version>
cat $HOME/.coso_rpm_installed
The RPM version will appear: itom-di-pulsarudx-<RPM version>
4. Run the following command and manually update the file with the latest RPM version from the compared files in step 3:
cat $HOME/.coso_rpm_installed/ itom-di-pulsarudx-<RPM version>
5. Run the following command to complete the RPM install:
./dbinit.sh <option>
Options usage: dbinit.sh [-h|-?|--help] [-p|--preview] [-s|--silent|--suppress] [-w|--dbapass]
[Vertica][VJDBC](5156) ERROR: Unavailable: initiator locks for query - Locking failure: Timed out I locking Table:itom_di_scheduler_provi
der_default.stream_microbatch_history. . Your current transaction isolation level is SERIALIZABLE
Cause
This error appears when the Data Retention process runs in the background. This process deletes old microbatch data that exceeds the retention period; the error occurs if lane workers try to insert new microbatch data into the stream_microbatch_history table at the same time.
Solution
You can ignore this warning. The lane worker retries, streams the message again, and consumes it.
Symptom
After executing obm-configurator.jar script to integrate OBM with the OPTIC DL, events aren't getting forwarded to OPTIC DL
from OBM and RUN TEST command from the connected server results in a NullPointerException error.
Cause
The log file opr-event-sync-adapter.log shows that the integration Groovy script times out in the init method:
Solution
Increase the timeout for init in OBM by executing the following commands:
On Linux:
On Windows:
Cause
This issue is because the OPTIC DL Message Bus topic consumer got disconnected and reconnected. The reconnect may
happen before the consumer resets in the server. This causes the consumer to conflict with its own subscription. The retries
in the consumer resolve the issue.
Solution
You can ignore this warning message.
Cause
This issue is because the worker node goes down and the data isn't published to topics even after the node comes up.
Solution
Perform the following steps to resolve this issue:
org.springframework.orm.ObjectOptimisticLockingFailureException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1; statement executed: HikariProxyPreparedStatement@141103738 wrapping com.vertica.jdbc.VerticaJdbc4PreparedStatementImpl@448a2be4; nested exception is org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1; statement executed: HikariProxyPreparedStatement@141103738 wrapping com.vertica.jdbc.VerticaJdbc4PreparedStatementImpl@448a2be4
IV lock table - deadlock error Deadlock IV locking Table:itom_di_metadata_provider_default.FIELD_TAG. IV held by [user vertica_rwuser
Cause
This issue is because of the deadlock in the Vertica tables and the itom-di-metadata-server pod couldn't perform all the table
update operations.
Solution
To resolve this issue, restart the itom-di-metadata-server pod to rerun all the modification updates in Vertica for the changes to
appear in the tables. Perform these steps:
1. Log on to the master node and run the following command to stop the itom-di-metadata-server pod:
kubectl -n <suite namespace> scale deployment itom-di-metadata-server --replicas=0
2. Run the following command to start the itom-di-metadata-server pod:
kubectl -n <suite namespace> scale deployment itom-di-metadata-server --replicas=1
Rolling back microbatch: [Vertica][VJDBC](5948) ROLLBACK: Local temporary objects may not specify a schema name
java.sql.SQLSyntaxErrorException: [Vertica][VJDBC](5948) ROLLBACK: Local temporary objects may not specify a schema name
Cause
This issue is due to the accidental dropping of the rejected table. The data streaming stops when there are no rejected tables
present in Vertica.
Solution
Perform these steps to resolve this issue:
For example:
For example:
Error getting topic partitions metadata: ConnectError or Failed to establish connection: Connection refused
Cause
This issue is because the OPTIC DL Vertica plugin fails to establish connections due to the OPTIC DL Message Bus port not
being open or OPTIC DL SSL certificate issues. The OPTIC DL Message Bus client fails to interact with OPTIC DL Message Bus
and the message streaming gets affected. As a result, the data streaming to the database gets affected.
To further check the ERROR message and the reason from the scheduler.log , run the following command:
Solution
Follow these steps to resolve this issue:
1. Run the following command to get the configured OPTIC DL Message Bus Proxy or broker port from scheduler config
map:
kubectl get cm itom-di-udx-scheduler-scheduler -n opsb-helm -o yaml | grep pulsar.datasource.port
The output similar to the following appears:
pulsar.datasource.port: "31051"
2. Verify that the configured pulsar port is available and reachable from all the Vertica nodes.
3. Verify the OPTIC DL SSL certificates are valid.
4. Restart the Vertica database.
Cause
This is because the streaming and bulk upload channels store metrics in the OPTIC Data Lake simultaneously. Also, the Vertica database queries for these operations have high resource demands. This error message indicates that the query couldn't execute because the resources available in Vertica at that point in time didn't match the estimated resource requirement. The resource is memory in MB.
Solution
You must configure the resource pools recommended by the suite for the different operations. The parameters of the resource pools depend on the concurrent load from the operation.
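The recommended pools and parameters are defined in the suite documentation; purely as an illustration, a Vertica resource pool is created and assigned with statements similar to the following (pool name, sizes, and user are placeholders):
CREATE RESOURCE POOL <pool_name> MEMORYSIZE '2G' MAXMEMORYSIZE '4G' PLANNEDCONCURRENCY 4 MAXCONCURRENCY 8;
ALTER USER <vertica_user> RESOURCE POOL <pool_name>;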
Cause
This error message appears in the /var/log/messages before the database goes into an unexpected shut down. This is because
the deployments have high RAM provisioning and low disk I/O.
Solution
To resolve this issue, set the dirty tuning ratios for the OS of the node where the issue occurred as mentioned in the Vertica
documentation: Tuning Parameters.
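As an illustration, the dirty ratios are Linux kernel parameters that can be set with sysctl; the values below are placeholders and must be taken from the Vertica Tuning Parameters documentation:
sysctl -w vm.dirty_background_ratio=<value>
sysctl -w vm.dirty_ratio=<value>
To persist the settings across reboots, add them to /etc/sysctl.conf as well.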
Cause
This issue occurs if the Vertica database is down or if the database connection details provided during the initial configuration aren't correct.
Solution
Try the following solutions one by one:
1. Verify the configuration parameters for the database connection are correct.
2. Verify if the Vertica database node is running using the adminTools interface.
3. If there are configuration changes for a particular pod, restart the pod so that the new configuration takes effect.
Cause
A longer downtime may occur during the upgrade process because the OPTIC Data Lake is inactive for more than 16 hours; as a result, correlated groups of events don't appear in OBM.
Solution
This is a known issue in Kubernetes. Delete the cron job definitions and resubmit them (see the sketch after the rm command below).
rm community-detector-job.yaml
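The remaining commands aren't shown in this extract; a sketch of the delete-and-resubmit flow, assuming the cron job definition is in community-detector-job.yaml and the cron job name is a placeholder (the exact sequence in your environment may differ):
kubectl -n <suite namespace> delete cronjob <community-detector-cronjob-name>
kubectl -n <suite namespace> apply -f community-detector-job.yaml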
Cause
This issue may occur when there is a topology synchronization issue.
Solution
1. Check if the selected CI's "monitored_by" property contains the value "SiteScope", that is, the CI is monitored by SiteScope.
2. In the OBM UI, select Administration > Setup and Maintenance > Infrastructure Settings > Foundations. Select ITOM Intelligent Data Lake and check if the Data Receiver Endpoint URL is configured.
3. If the data receiver endpoint is configured but the data source is still not listed in the PD UI, the data access endpoint may not be available. Contact the Support team.
Run the following queries on the Vertica node for further check:
Note
For the tenant name 'provider' and deployment name 'default', the schema names are: itom_di_configuration_provider_default, itom_di_metadata_provider_default, mf_shared_provider_default.
1. In the schema itom_di_configuration_<tenant>_<deployment> , for table DATASET , make sure that the configuration isn't
present:
Select * from "itom_di_configuration_<tenant>_<deployment>"."DATASET" where name ilike '<Dataset Name>';
2. In the schema itom_di_metadata_<tenant>_<deployment> , for table FIELD_METADATA , make sure the metadata is present:
Select * from "itom_di_metadata_<tenant>_<deployment>"."FIELD_METADATA" where dataset_name ilike '<Dataset Name>';
3. In the schema mf_shared_<tenant>_<deployment> , make sure that the table is present:
Select * from "mf_shared_<tenant>_<deployment>"."<Dataset Name>";
Cause
This issue is because the DELETE call sent by the itom-di-administration pod isn't received or isn't processed by the itom-di-metadata-server pod.
Solution
Perform these steps on the Vertica node to resolve this issue:
Cause
This issue is because, during the OPTIC DL Vertica Plugin initialization or reconfiguration, the SSL directory path may get
initialized with unwanted characters. This may result in the creation of these empty folders in the Vertica catalog directory.
Solution
There is no other impact on the data ingestion and the folders can be manually cleaned up. You must check all the Vertica
nodes and perform the clean up on each of the Vertica nodes. Follow these steps:
Cause
This issue occurs during the suite installation when the itom-di-metadata-server pod is restarted while it's starting up, which causes the metadata schema to go into an inconsistent state.
1. After the suite installation run the following command to check the pod status:
kubectl get pods -n <suite namespace>
The status of all the pods appears. The following output appears for the itom-di-metadata-server and itom-di-data-access-dpl pods not running completely:
-----------------------------------------------------------------
SQL State : <SQL state>
Location : db/migration/<schema migration file name >.sql (<File Location with issue>)
Solution
Perform these steps to resolve this issue:
Note
Perform these steps only in the case of a new installation of the suite. In a managed Kubernetes deployment, a reference to the
master node in this topic implies bastion node.
1. Run the following command on the master node to scale down the itom-di-metadata-server pod:
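The command isn't shown in this extract; based on the scale-down steps shown earlier for this pod, it's likely of the following form:
kubectl -n <suite namespace> scale deployment itom-di-metadata-server --replicas=0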
The Grafana Pulsar-Bookie Metrics dashboard, ReadOnly Bookies panel displays a value indicating that the pod disk is 95% filled. This means that the disk space is low. In the Healthy pane, the Writeable Bookies and writable bookies (percentage) panels display lower values than configured.
Tip
You can check the number of configured Bookies on the Grafana Pulsar-Bookie Metrics dashboard.
On further checking, in the ITOM DI/Pulsar Overview dashboard, Pod overview pane, the Bookie and Bookies up panels display lower values than configured.
The default value of EnsembleSize is 2; for a production setup (other than the low footprint deployment) the default value is also 2.
If the number of writable Bookies is less than the configured EnsembleSize, the application stops working.
Perform the steps mentioned in any one of the following solutions so that you have a minimum of two writable Bookies.
Solution 1
This solution is applicable for the application deployed on AWS and gives you steps to increase the storage volume that helps
the system handle the additional load.
If you have set up AWS EBS dynamic volumes and deployed the application, you can edit the storage class and increase the
capacity of the PVC volume. Follow these steps to resolve this issue:
Solution 2
This solution gives you steps to add more worker nodes to handle the additional load.
To resolve this issue, you must add additional itomdipulsar-bookkeeper pods. This solution requires additional worker nodes and storage but enables the OPTIC DL Message Bus to function without data loss.
For example: If the existing itomdipulsar-bookkeeper replicas are 3, the command to scale up is:
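The command isn't shown in this extract; based on the scale commands used elsewhere in this guide, it's likely similar to the following (4 is the example replica count):
kubectl scale statefulset itomdipulsar-bookkeeper --replicas=4 -n <suite namespace>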
4. Go to the ITOM DI/Pulsar - Bookie Metrics dashboard and check if the Writable Bookies panel displays a minimum
of two BookKeeper pods as writable.
Solution 3
This solution is applicable for the application deployed on embedded Kubernetes and gives you steps to clean up the
BookKeeper pod data that aren't consumed by the application. This relieves the load on the system. However, there will be a
minor loss of data that isn't consumed by the application.
When two or more itomdipulsar-bookkeeper pods reach read-only state, the applications can't ingest data to the OPTIC DL
Message Bus. Applications can consume data from the OPTIC DL Message Bus but won't be able to acknowledge the
messages. To resolve this issue, you must clean up the itomdipulsar-bookkeeper pod data for the pods that aren't available.
With this solution, you will lose the messages that aren't yet consumed by the application.
2. On each worker node, clean up the volumes used by the OPTIC DL Message Bus itomdipulsar-bookkeeper and itomdipulsar-zookeeper pods. You must clean up only the contents of the volume folder. Run the following commands:
Note down the VOLUME names for the itomdipulsar-bookkeeper and itomdipulsar-zookeeper pods.
The default value for the Vertica Streaming Loader is: itom_di_scheduler_provider_default
For example:
You will see that the pods aren't in the Running state. The pods will run after you complete the next step.
5. Run the following command to bring the pods to the running state. For noop=true, you must use a parameter that isn't used in any of the application charts:
helm upgrade <release name> <application chart path> -n <application namespace> --set noop=true -f <values YAML filename> --no-hooks
6. Go to the ITOM DI/Pulsar - Bookie Metrics dashboard and check if the Writable Bookies panel displays a minimum
of two BookKeeper pods as writable.
7. Run the following commands to restart the itom-di-administration pod:
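The commands aren't shown in this extract; a sketch that restarts the pod by deleting it so that Kubernetes recreates it (the pod name is a placeholder):
kubectl get pods -n <suite namespace> | grep itom-di-administration
kubectl -n <suite namespace> delete pod <itom-di-administration-pod-name>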
Cause
This issue is because of the following:
you have configured the Local Volume Provisioner's PV with disk sizes less than the default size
OR
the disk size isn't the same as the size in the values YAML file.
Solution
Perform these steps to resolve this issue:
1. Run the following command and note down the PVC names for the storage class:
kubectl get pvc -n <suite namespace>
The default name for the storage class is fast-disks . If you have provided any other name for the storage class note
down the storage class name and run the following command to list the PVC:
kubectl get pvc -n <suite namespace> | grep <storage class name> | grep -v NAME | awk '{print $1}'
2. Run the following command to delete each of the PVC:
kubectl -n <suite namespace> delete pvc <PVC name>
3. Run the following commands to delete the itomdipulsar-bookkeeper and itomdipulsar-zookeeper StatefulSet pods:
kubectl get statefulset -n <suite namespace>
Note down the itomdipulsar-bookkeeper and itomdipulsar-zookeeper StatefulSet pod names and run the following
commands:
kubectl delete statefulset -n <suite namespace> <itomdipulsar-bookkeeper pod name>
kubectl delete statefulset -n <suite namespace> <itomdipulsar-zookeeper pod name>
4. Run the following commands and note down the PV names for zookeeper-data , bookkeeper-journal and bookkeeper-ledgers :
kubectl get pv | grep <storage class name> | grep -v NAME | awk '{print $1}'
5. Run the following command and note down the Capacity of each of the PV:
kubectl describe pv <PV name>
6. You must update the values YAML file with the values from the earlier steps and upgrade the helm chart.
Follow these steps to update the YAML file and upgrade the helm chart:
1. Update the values YAML file for OPTIC DL Message Bus with the Capacity values noted down from the earlier step:
bookkeeper:
  volumes:
    journal:
      name: "journal"
      size: "<DISKSIZE>"
    ledgers:
      name: "ledgers"
      size: "<DISKSIZE>"
zookeeper:
  volumes:
    data:
      name: "zookeeper-data"
      size: "<DISKSIZE>"
Cause
This issue is because the node is restarted during the data flow. The itomdipulsar-zookeeper log file reports the following
exception:
Solution
Perform these steps to resolve this issue:
1. On the control plane run the following command to scale down the itomdipulsar-zookeeper pod:
kubectl scale statefulset itomdipulsar-zookeeper --replicas=0 -n <suite namespace>
2. Run the following command to get the PV and the PVC attached to the itomdipulsar-zookeeper pod in
CrashLoopBackOff state:
kubectl get pvc -n <suite namespace>
3. Run the following command to describe the zookeeper-data PV:
kubectl describe pv <zookeeper-data PV name>
4. From the output, go to the location of the PV on the node where it's running.
5. Delete the corrupted file reported in the itomdipulsar-zookeeper log file.
6. Run the following command to scale up the itomdipulsar-zookeeper pod:
kubectl scale statefulset itomdipulsar-zookeeper --replicas=1 -n <suite namespace>
Cause
This is because of a failure in the creation of one or more of the Postload internal topics in the OPTIC Data Lake Message Bus.
Sometimes, the topic partitions aren't successfully created even though topic creation has returned success in OPTIC Data
Lake Message Bus. This is because of an issue in OPTIC Data Lake Message Bus:
https://github.com/apache/pulsar/issues/9173
For further debugging, check whether the following error messages appear in the taskcontroller.log file:
2021-06-07 08:54:29.036 INFO 106 [main] --- c.m.itomdi.utils.manager.TopicCreator : Going to create topic : di_postload_task_topic with partition count 1
2021-06-07 08:54:29.065 INFO 106 [main] --- c.m.itomdi.utils.manager.TopicCreator : Namespace itomdipostload already exists
2021-06-07 08:54:29.470 ERROR 106 [main] --- c.m.i.u.p.PostloadProcessorProducer : Exception in creating producer for the topic: persistent://public/itomdipostload/di_postload_task_topic.
Solution
Perform the following steps to resolve this issue:
Cause
This issue may be because of insufficient memory in the pod.
Solution
Perform the following steps to resolve the issue:
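The output shown below comes from describing the worker node; a command similar to the following (the node name is a placeholder) produces it:
kubectl describe node <worker node name>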
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 6515m (54%) 38150m (317%)
memory 19885Mi (98%) 49904Mi (246%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
The output displays the requested memory for the worker. If the percentage used in the Requests column is high, 98%
for example, it's likely additional pods won't schedule on this worker. The output displays a list of memory requested
from pods on the running system.
In the output, scroll to the section Non-terminated Pods, similar to the following:
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
core fluentd-w589p 105m (0%) 500m (4%) 205Mi (1%) 650Mi (3%) 3h19m
core itom-logrotate-xs6h7 100m (0%) 200m (1%) 100Mi (0%) 200Mi (0%) 3h19m
The list of pods and their requested usage appears. You may see some pods with zero memory requested. If a pod isn't
executed yet but scheduled, it won't show any memory allocated. With this output, you can find what's deployed on the node and the memory requested to schedule each pod. The stateful sets have an additional dependency in Kubernetes: if you have multiple replicas, they start in numerical order, and the next replica isn't scheduled until the previous one is up and running successfully. Therefore, if, for example, the bookkeeper or zookeeper pods in a replica set start instance 0 but fail to start the later replicas, it's possible that there isn't enough memory on the other workers to schedule them for execution. You will have to check the memory usage of the other workers.
3. Make sure to calculate the memory according to the requirement and increase the memory of the worker nodes.
1. You mustn't alter or save the changes in the OOTB ITOM DI dashboard.
2. Create a copy from the ITOM DI dashboard and then add panels.
3. You must save the dashboard with a different name. Make sure that the new name isn't the same as the existing OOTB
ITOM DI dashboards.
4. You must create and save the custom dashboard with a different name and in a different folder.
Alter a dashboard
You mustn't alter or save changes in the OOTB ITOM DI dashboards. Create a copy or duplicate of the required dashboard and then alter the copy.
Cause
This issue is because the interval in the ITOM DI/Vertica Streaming Loader dashboard isn't set to at least 1.5 times the value of scrapeIntervalSec in the values YAML file.
The monitoring.verticapromexporter:scrapeIntervalSec: 60 parameter in the values YAML file and the interval used in the ITOM DI/Vertica Streaming Loader dashboard depend on each other. The default value of the scrapeIntervalSec parameter in the values YAML file is 60 seconds.
If you have edited the scrapeIntervalSec value, make sure to change the interval in the dashboard accordingly.
Solution
In the ITOM DI/Vertica Streaming Loader dashboard, considering the scrapeIntervalSec as 60 sec , update the interval level
as 2 mins or 120 sec .
You must update the following panels with the interval level:
1. Pod Overview
Avg message ingestion rate
Avg ingestion throughput (bytes/sec)
2. Data Flow Summary
Scheduler message ingestion rate (all topics)
Scheduler ingestion throughput (bytes/sec) (all topics)
3. Per Topic
Scheduler message ingestion rate
Scheduler read rate
4. Per Partition
Scheduler message ingestion rate
Scheduler read rate
Perform the following steps in each of the panels to view the data:
2. Click Edit.
3. In the Metrics field, edit the value from sum(rate(vertica_pulsar_udx_message_count[1m])) to sum(rate(vertica_pulsar_udx_message_count[2m]))
Cause
The issue is either because the Vertica Streaming Loader isn't able to pull data from the OPTIC DL Message Bus for these topics, or because an issue in Vertica resource usage halts a few of the data streams.
Solution
Perform the following steps to resolve the issue:
1. Click the drop-down icon on the Top 10 topics backlog panel and go to the ITOM DI/ Pulsar - Topic dashboard.
2. From the Topic drop-down, select the particular topic for which the message backlog is greater than 10k.
3. Scroll down to the Local msg batch backlog panel and hover the cursor over the graph to find which subscription has the message backlog for that topic.
4. If the subscription has itom_di_scheduler in the subscription name, go to the ITOM DI/Vertica Streaming
Loader dashboard.
5. In the ITOM DI/Vertica Streaming Loader dashboard, see the Scheduler message ingestion rate panel.
Select the topic for which the message backlog is higher.
If the ingestion rate is 0 for that topic, then there are some errors in the streaming of data from that particular topic. Follow
these steps:
1. Run the following command to check the Vertica Streaming Loader logs:
kubectl -n <suite namespace> logs itom-di-udx-scheduler-<pod value> -c itom-di-udx-scheduler-scheduler
2. From this log information, identify the exact cause of the issue and the remediation steps.
If the ingestion rate is greater than 0 for that topic, the data is being streamed to Vertica but the streaming rate is slow. Follow these steps:
1. Go to the ITOM DI/Vertica dashboard.
2. In the Vertica dashboard, check the CPU usage, memory usage, and resource pool memory usage.
3. If any of these panels display resource issues, check the Vertica logs. Go to the following location on the Vertica system to check the logs:
<catalog-path>/<database-name>/<node-name>_catalog/vertica.log
4. From this log information, identify the exact cause of the issue and the remediation steps.
If the cause of the issue isn't clear from logs, collect the log details and contact Software Support.
Possible causes
Cause 1: Data set and micro batch information aren't available
Cause 2: Communication issue between the Vertica node and OPTIC DL Message Bus Proxy services
Cause 3: Data is in the Vertica database rejected tables
Cause 4: Mismatch in the mapping of a topic name to the corresponding table name or missing configuration
Solution
If the data table is present, but the micro batch (stream source) isn't present in stream_sources, follow these steps:
Tip: Run the queries mentioned in steps 2 and 3 and check if stream_sources has the micro batch information.
Note
In a managed Kubernetes deployment, a reference to the master node in this topic implies a bastion
node.
1. The disystemcheck tool is available when you install the OPTIC DL Vertica plugin, in the /usr/local/itom-di-pulsarudx/bin/ folder. If you are testing on any other node, copy this tool to that node. Go to the /usr/local/itom-di-pulsarudx/bin/ folder on the Vertica node.
2. Run the disystemcheck tool as follows:
./disystemcheck -h <master hostname>
where:
-h: the hostname of the Kubernetes master or worker node that you want to check.
Note
If the OPTIC DL service ports are different from the default ones, use the disystemcheck CLI options to specify the ports.
3. If there are issues in connectivity or port availability, the tool reports error messages that you must fix.
If the data is available in the rejected table, check the rejection reason in the same table and fix the issue.
Case 1 - The data isn't loaded to the Vertica database. The Task Controller and Task Executor pods are up and running.
However, in the Postload Detail dashboard, the Taskflow drop-down doesn't list the taskflows.
To resolve this issue, follow the steps mentioned in the troubleshooting scenario Postload Detail dashboard Taskflow drop-down doesn't list the configured taskflows.
The state and status of each taskflow and details of the configured tasks.
The Task Controller and Task Executor pod resource usage details, such as CPU usage, memory usage, and direct memory.
The following table provides the details of the Postload Detail dashboard:
Taskflow statistics - This panel provides an overview of memory utilized by the running pods, job, and task details.
Task details - The list of tasks with details for the selected taskflow. This panel displays the following information:
a. Task id
b. Task name
c. Taskflow id
d. Taskflow name
e. Task state
f. Task status
g. Start time
h. End time
i. Retry count
j. Task exec time exceeded

Pod details - This panel provides the memory and CPU usage information for the task controller and task executor pods.
1. Task Controller memory usage - The memory used over time by the itom-di-postload-taskcontroller pod.
2. Task Controller CPU usage - The CPU usage over time by the itom-di-postload-taskcontroller pod.
3. Task Executor memory usage - The memory used over time by the itom-di-postload-taskexecutor pod.
4. Task Executor CPU usage - The CPU usage over time by the itom-di-postload-taskexecutor pod.
5. Task Controller direct memory usage - The direct memory used over time by the itom-di-postload-taskcontroller pod.
6. Task Executor direct memory usage - The direct memory used over time by the itom-di-postload-taskexecutor pod.
Possible causes
Cause 1: Due to missing task flow configuration
Cause 2: The itom-di-administration pod isn't running
Solution
Case 1 - Data isn't loaded to Vertica. In the Postload Overview dashboard, the Pod overview panel shows the pods aren't in a
running state and the meter isn't at 100%.
If you observe that the pods aren't running, perform the following troubleshooting steps:
Case 2 - The data isn't loaded to the Vertica database. In the Postload Detail dashboard, the Taskflow drop-down lists
the taskflow, but the taskflow doesn't run. In the Postload Overview dashboard, the Failed non-recoverable tasks
panel displays tasks that are erroneous.
To resolve this issue, follow the steps mentioned in the troubleshooting scenario Postload task flow not running.
The tasks and taskflows that are present and their states.
The details of all the configured taskflows.
You can click the Postload detail dashboard button to go to the detailed dashboard.
The following table provides the details of the Postload Overview dashboard:

Panel name - Description

Pod overview - This panel provides an overview of Postload processing pod status and memory usage.
Taskflow overview - This panel provides an overview of tasks and task flows in Postload processing.
Failed non-recoverable tasks - The list of all tasks in the FAILED_NON_RECOVERABLE state. When a task in the data processing flow fails, it means that it has retried for the configured number of times (1440 for bulk load, approximately 1 day) and then the task status is set to FAILED_NON_RECOVERABLE.
This panel displays the following information:
1. Task id
2. Task name
3. Taskflow id
4. Taskflow name
5. Task state
6. Task status
7. Status info
8. Start time
9. End time
10. Retry count
11. Taskflow state
12. Taskflow status
Task exceeded maximum time - The list of tasks that exceed the defined maximum time. The value can be 0 or 1 . The
value 0 indicates the task hasn't exceeded the maximum time defined and 1 indicates the task has exceeded the maximum
running time.
1. A taskflow selected for execution is in SCHEDULED state and status NONE . All the tasks in this task flow are
in READY state at this time.
2. If a taskflow is in RUNNING state and status NONE , that means the taskflow is running or is about to run with the
checks passed. One of its tasks can be in DISPATCHED / RUNNING or all tasks can be in READY state at this time.
3. If a taskflow is in FINISHED state and status SUCCESS , that means all the tasks finished with status SUCCESS .
4. If a taskflow is in FINISHED state and status WARNING , that means one or more tasks finished with status SUCCESS_WI
TH_WARN or FAILED_RECOVERABLE and the other tasks finished with status SUCCESS .
5. If a taskflow is in FINISHED state and status ERROR , that means one of the tasks finished with the status FAILED_NON_
RECOVERABLE .
1. If a task is in state READY and status NONE , that means the task is in a taskflow that's either scheduled to run or is
currently running.
2. If a task is in state SCHEDULED and status NONE, that means the itom-di-postload-taskcontroller pod has identified the task that will go to the itom-di-postload-taskexecutor pod for running.
3. If a task is in state DISPATCHED and status NONE, that means the itom-di-postload-taskcontroller pod has sent the task to the itom-di-postload-taskexecutor pod for running.
4. If a task is in state RUNNING and status NONE , that means the task is running in the itom-di-postload-
taskexecutor pod.
5. If a task is in state FINISHED and status SUCCESS , that means the task ran successfully.
6. If a task is in state FINISHED and status SUCCESS_WITH_WARN , that means the task run finished with a warning.
7. If a task is in state FINISHED and status FAILED_RECOVERABLE , that means the task run failed and its retry count is
less than or equal to maximum retries.
8. If a task is in state FINISHED and status FAILED_NON_RECOVERABLE , that means the task run failed and its retry count
exceeds maximum retries.
Last 10 executed Taskflows - The list of the last 10 executed taskflows with details. This panel displays the following
information:
1. Taskflow id
2. Taskflow name
3. Taskflow state
4. Taskflow status
5. Status info
6. Start time
7. End time
Taskflow details - The list of taskflows with details based on the selected time range. This panel displays the following
information:
1. Taskflow id
2. Taskflow name
3. Taskflow state
4. Taskflow status
5. Status info
6. Start time
7. End time
Task details - The list of tasks with details based on the selected time range. This panel displays the following information:
1. Task id
2. Task name
3. Taskflow id
4. Taskflow name
5. Task state
6. Task status
7. Start time
8. End time
9. Retry count
10. Taskflow state
11. Taskflow status
Cause
The issue is because of the following:
Solution
Perform the following steps to resolve the causes:
1. In the ITOM DI/Data Flow Overview dashboard, click the drop-down icon on the Request error rate panel and go to
the Receiver dashboard.
2. In the ITOM DI/Receiver dashboard, check if the receiver pod is running from the Pod overview > Receiver and
the Data Flow Overview dashboard panels.
3. Check whether the Topic drop-down list displays the required topic.
4. If the pods aren't running or the topics aren't listed, run the following commands to check the OPTIC DL HTTP Receiver logs:
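The exact commands aren't preserved in this extract; as a hedged sketch, assuming the receiver pods follow the itom-di-receiver-dpl naming used elsewhere in this document:
# List the receiver pods, then view the logs of one of them.
kubectl get pods -n <suite namespace> | grep itom-di-receiver-dpl
kubectl -n <suite namespace> logs itom-di-receiver-dpl-<pod value>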
From the log information, identify the exact cause of the issue and the remediation steps.
If the cause of the issue isn't clear from the logs, collect the log details and contact Software Support.
Cause
The issue occurs either because the OPTIC DL HTTP Receiver isn't able to forward the messages or because the OPTIC DL Message Bus isn't accepting messages.
Solution
Perform the following steps to resolve the issue:
3. Check if the average storage write latency is between 20 and 50 ms (or according to the suite requirement).
If the average storage write latency is greater than 1 second, either the OPTIC DL Message Bus isn't able to write data to its persistent store or the write to the BookKeeper pod is slow.
4. Open the ITOM DI/Pulsar - Bookie Metrics dashboard for further debugging.
5. In ITOM DI/Pulsar - Bookie Metrics dashboard, check the Writable Bookies for the number of writable bookies and
Writable bookies (percentage).
6. If one or more BookKeeper pods have gone to a read-only state, the Writable bookies (percentage) value appears as less than 100.
8. If there are errors related to disk space utilization crossing 95%, increase the OPTIC DL Message Bus Bookkeeper
component replicas.
If the cause of the issue isn't clear from logs, collect the log details and contact Software Support.
Cause
This issue occurs because, when the time range in the dashboard increases, the query results for the panels increase and take longer to populate. In the case of panels with graphs, the data points appear slowly due to the increased query results.
Solution
To resolve this issue, don't use a relative time range to view the data. Instead, use an absolute time range (of less than 24 hours) for the period for which you want to view the data. Follow these steps:
1. Log on to the ITOM DI dashboard for which you want to view the historical data.
2. Click the time range picker.
3. In the Absolute time range section, type the From and To time.
4. Click Apply time range. You will be able to view the data.
If you prefer to save the dashboard, you must save it with a new name.
Cause
This issue is because the requests are reaching the OPTIC DL HTTP Receiver but the messages aren't getting published to the
OPTIC DL Message Bus successfully.
Solution
To resolve this issue, follow these steps:
Cause
This issue is because some or all requests coming to the OPTIC DL HTTP Receiver server are failing.
Solution
To resolve this issue, follow these steps:
Cause
This issue is because some or all the itom-di-receiver-dpl pods aren't in the running state.
Solution
To resolve this issue, check the status of the itom-di-receiver-dpl pod and apply the solution for the corresponding pod state:
ImagePullBackoff: Check if the pods are able to successfully pull the image from the repository.
Pod Initializing: Check if the itomdipulsar-bookkeeper, itomdipulsar-broker, itomdipulsar-proxy, itomdipulsar-zookeeper, and itom-idm pods are running. The itom-di-receiver-dpl pod performs a dependency check on these pods.
Pending: Run the command kubectl describe pod itom-di-receiver-dpl-<pod value> -n <suite namespace> and check if there is any memory or CPU crunch on the CDF cluster.
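A minimal sketch of these checks (pod and namespace names follow the conventions used elsewhere in this section; the exact invocation isn't prescribed by the original):
# Show the current state of the receiver pods and their dependencies.
kubectl get pods -n <suite namespace> | grep -E 'itom-di-receiver-dpl|itomdipulsar|itom-idm'
# Describe a stuck pod to see events such as failed image pulls or
# scheduling problems caused by a memory or CPU crunch.
kubectl describe pod itom-di-receiver-dpl-<pod value> -n <suite namespace>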
Cause
This issue is because the ITOM DI/Vertica dashboard needs a memory resource pool to display metrics on the dashboard.
The dashboard runs the queries to get data from the Vertica database. By default, the dashboard uses the general resource
pool. Due to increased load in the general resource pool, the queries from the dashboard get rejected and metric collection
isn't complete. The dashboard appears blank for that period. To avoid this issue you must create a dedicated resource pool
for the dashboard.
Solution
Perform the following steps to resolve this issue:
monitoring:
  verticapromexporter:
    config:
      monitoringResourcePool: <resource pool name for dashboard>
For example:
monitoring:
  verticapromexporter:
    config:
      monitoringResourcePool: itom_monitor_respool_provider_default
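The steps to create the dedicated pool aren't preserved in this extract. As a minimal sketch, assuming a standard Vertica setup (the pool name and sizing below are illustrative, not prescribed by this document), you could create the pool from vsql before referencing it in the values YAML:
# Create a dedicated resource pool for the dashboard queries (run as the Vertica DBA user).
vsql -U dbadmin -c "CREATE RESOURCE POOL itom_monitor_respool_provider_default MEMORYSIZE '1G' PLANNEDCONCURRENCY 4;"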
kubectl -n <application namespace> exec -ti <bastion pod-<POD value>> -c pulsar -- bash
The topics list appears. This confirms the creation of topics. You must note down the topic name.
For example, if the output of the command is: "persistent://public/default/di_task_status_topic-partition-0" , you must
note down the topic name: di_task_status_topic .
4. Run the following command to check the topic details:
For example:
The output appears with the ----- got message ----- separator. This confirms the creation of messages with the data.
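The exact commands for listing the topics and checking the topic details aren't preserved in this extract. As a hedged sketch, assuming the same pulsar-admin and pulsar-client tooling shown in the broker-pod procedure that follows (topic and subscription names are examples):
# Inside the bastion pod shell opened above:
./bin/pulsar-admin topics list public/default
# Check the details of a topic, then consume its messages to confirm data is present.
./bin/pulsar-admin topics stats persistent://public/default/di_task_status_topic
./bin/pulsar-client consume -s test-subscription --subscription-mode NonDurable -n 0 di_task_status_topic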
1. Run the command to get the broker pod: kubectl get pods -n <suite namespace>. Note down the broker pod name.
2. Run the command to log on to the broker container: kubectl -n <suite namespace> exec -ti <broker pod-<POD value>> -c itomdipulsar-broker bash
3. Run the command to list the topics: ./bin/pulsar-admin topics list public/default. Note down the topic name.
4. Run the command to consume the message from the topic: ./bin/pulsar-client consume -s test-subscription --subscription-mode NonDurable -n 0 <topic name>
5. Open a new session and repeat steps 1 and 2.
6. Run the command to send the message to the topic: bin/pulsar-client produce -m hi <topic name>. The message hi appears in the consumer session.
When the worker node is in an unrecoverable state, the OPTIC DL Message Bus itomdipulsar-autorecovery, itomdipulsar-bastion, itomdipulsar-bookkeeper, and itomdipulsar-zookeeper pods running on that Kubernetes worker node will be in an error state (pending/terminating). Even when one of the pods is in an error state, data ingestion and consumption to and from the OPTIC DL Message Bus continues to work. However, there may be performance degradation because the load has to be handled by a cluster that has one less node and one less OPTIC DL Message Bus pod. To restore the Kubernetes cluster to its original capacity, you must add a new worker node with a local disk provisioned manually. For more information, see the Install section.
For the application deployed on embedded Kubernetes, follow these steps to recover the OPTIC DL Message Bus pods from
the error state to the running state:
1. Run the following commands to delete the itomdipulsar-autorecovery-0 and itomdipulsar-bastion-0 pods that are in a
terminating state:
kubectl -n <application namespace> delete pod itomdipulsar-autorecovery-0 --force
kubectl -n <application namespace> delete pod itomdipulsar-bastion-0 --force
2. Run the following command to delete the itomdipulsar-zookeeper-<pod value> pods and the attached PVC that are in a
terminating state:
kubectl delete pod itomdipulsar-zookeeper-<pod value> -n <application namespace> --force ; kubectl -n <application namespace>
delete pvc itomdipulsar-zookeeper-zookeeper-data-itomdipulsar-zookeeper-<pod value> --force
3. Run the following command to delete the itomdipulsar-zookeeper-<pod value> pods that are in pending state:
kubectl delete pod itomdipulsar-zookeeper-<pod value> -n <application namespace> --force
4. Run the following command to verify if the itomdipulsar-zookeeper-<pod value> pods are running in the new node:
kubectl get pods -n <application namespace> -o wide
5. Log on to the itomdipulsar-bastion-0 pod:
kubectl exec -it itomdipulsar-bastion-0 -n <application namespace> -c pulsar bash
6. Run the following command to list the bookie IDs and get the ID of the problematic bookie:
/pulsar/bin/pulsar-admin bookies list-bookies
Note down the ID of the bookie.
7. Run the following command to decommission the problematic bookie:
./bin/bookkeeper shell decommissionbookie -bookieid <ID noted in step 6>
8. Run the following command to check if the bookie is removed:
/pulsar/bin/pulsar-admin bookies list-bookies
9. Run the following command to delete the itomdipulsar-bookkeeper-<pod value> pods and the attached PVC that are in a
terminating state:
kubectl delete pod itomdipulsar-bookkeeper-<pod value> -n <application namespace> --force ; kubectl -n <application namespace> delete pvc itomdipulsar-bookkeeper-journal-itomdipulsar-bookkeeper-<pod value> --force ; kubectl -n <application namespace> delete pvc itomdipulsar-bookkeeper-ledgers-itomdipulsar-bookkeeper-<pod value> --force
10. Run the following command to delete the itomdipulsar-bookkeeper-<pod value> pods that are in pending state:
kubectl delete pod itomdipulsar-bookkeeper-<pod value> -n <application namespace> --force
11. Run the following command to check if the itomdipulsar pods are running:
kubectl get pods -n <application namespace> -o wide| grep -i pulsar
The list of pods appears with Status as Running or Completed.
Cause
This issue occurs because the ports aren't open or because of connectivity issues between the Vertica node and the OPTIC DL Message Bus Proxy.
Solution
You can use the disystemcheck tool to check the connectivity. Perform the following steps:
Note
In a managed Kubernetes deployment, a reference to the master node in this topic implies bastion
node.
1. The disystemcheck tool is available when you install the OPTIC DL Vertica plugin, in the /usr/local/itom-di-pulsarudx/bin/ folder. If you are testing on any other node, copy this tool to that node. Go to the /usr/local/itom-di-pulsarudx/bin/ folder on the Vertica node.
2. Run the disystemcheck tool as follows:
./disystemcheck -h <master hostname>
where:
-h: the hostname of the Kubernetes master or worker node that you want to check.
Note
If the OPTIC DL service ports are different from the default ones, use the disystemcheck CLI options to specify the ports.
[verticadba@vhost1 testCerts]$ ls
disystemcheck server.crt server.key
[verticadba@vhost1 testCerts]$ /usr/local/itom-di-pulsarudx/bin/disystemcheck -h demosys-master-node.net
systemcheck version 1.0
[SUCCESS] Receiver port open (30001)
[SUCCESS] MINIO port open (30006)
[SUCCESS] Pulsar Admin service operational
[SUCCESS] OpticDL Administration port open (30004)
[SUCCESS] OpticDL Administration service operational
[SUCCESS] Data Access port open (30003)
[SUCCESS] Data Access service operational
[SUCCESS] Receiver service operational
[SUCCESS] Published Pulsar message
[SUCCESS] Received Pulsar message
3. If there are issues in connectivity or port availability, the tool gives error messages that you must fix. The following is a sample disystemcheck output with errors:
Log on to the Vertica node where you have installed the RPM. Go to the location /usr/local/itom-di-pulsarudx/bin and run the command ./disystemcheck --help to see the usage.
-a, --admin : itom-di-administration port. If it's different from the default, type the port number. Default: 30004
-d, --dataaccess : itom-di-data-access-dpl port. If it's different from the default, type the port number. Default: 30003
-m, --minioport : itom-di-minio port. If it's different from the default, type the port number. Default: 30006
-p, --pulsaradmin : OPTIC DL Pulsar Administration port. If it's different from the default, type the port number. Default: 31001
-t, --pulsarclient : itomdipulsar-proxy port. If it's different from the default, type the port number. Default: 31051
-r, --receiver : itom-di-receiver-dpl port. If it's different from the default, type the port number. Default: 30001
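For example, a hedged sketch of a run against non-default ports, using the options listed above (the port values are illustrative only):
# Check a setup where the receiver, administration, and data-access ports
# differ from the defaults.
./disystemcheck -h <master hostname> -r 31000 -a 30005 -d 30007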
Perform the following steps on the Vertica node to verify the RPM version is the same as the rpm installed to the database:
OR
When the OPTIC DL Vertica Plugin processes data, a history record gets written to the database. This record also
displays the version of the OPTIC DL Vertica Plugin. Log on to the Vertica database and run the following query:
select ENDING_MSG from itom_di_scheduler_default_default.microbatch_history limit 1;
You will see a similar output with the version as follows:
ENDING_MSG
---------------------
1. cd /usr/local/itom-di-pulsarudx/lib
2. sha1sum libitom-di-pulsarudx-9.2.1.so
An output similar to the following appears:
d5b89107716658c38b74d8bbf8e4c7fbc34a7142 libitom-di-pulsarudx-9.2.1.so
In this output, d5b89107716658c38b74d8bbf8e4c7fbc34a7142 is the library version.
The output d5b89107716658c38b74d8bbf8e4c7fbc34a7142 is the same for both, which confirms the currently installed version of the OPTIC DL Vertica Plugin.
Step 4 - Uninstall the wrong RPM version and clear the dbinit.sh history
If the output is different, you must uninstall the OPTIC DL Vertica Plugin. Perform the following steps:
You can now install the OPTIC DL Vertica Plugin with the same user names used before. For more information, see the
Prepare section.
Logging details
Category      Log File Name        Log Location
Discovery     job-fetcher.log      <log-vol>/cloud-monitoring/aws/discovery/job-fetcher/<pod_name>/
Metric        aws-collector.log    <log-vol>/cloud-monitoring/aws/metric/collector/<pod_name>/
1. List the PVCs for the suite namespace and look for opsb-logvolumeclaim:
kubectl describe pvc <PVC-NAME-for-logvolumeclaim> -n <suite-namespace> | grep Volume: | sed 's/.*Volume: *//' | xargs kubectl describe pv | grep 'Server:\|Path:'
For example:
# kubectl describe pvc opsb-logvolumeclaim -n opsb-helm | grep Volume: | sed 's/.*Volume: *//' | xargs kubectl describe pv | grep 'Server:\|Path:'
Server: mycomputer.example.net
Path: /var/vols/itom/opsbvol1
Cause
The issue occurs when the Hyperscale Observability Content Pack doesn't get imported to OBM during installation.
Solution
To resolve the issue, follow the steps to manually import the Content Pack.
Cause
This issue occurs due to a failure in redirection.
Solution
You can resolve this issue by going to the Monitoring Service Overview dashboard from the dashboard section.
Cause
The issue occurs when the Hyperscale Observability related UCMDB views aren't deployed to OBM during installation.
Solution
To resolve the issue, manually deploy the UCMDB views corresponding to the monitoring type. For example, if AWS views are
missing, you should manually deploy the UCMDB views of the AWS service package.
1. Go to Administration > RTSM Administration and click Local Client to download the Local Client tool.
2. Launch the Local Client tool.
a. Extract the UCMDB_Local_Client.zip package to a location of your choice, for example, the desktop.
b. Double-click UCMDB Local Client.cmd (Windows) or UCMDB Local Client.sh (Mac). The UCMDB Local Client window
opens.
3. Add or edit login configuration for the target OBM server that you want to access.
a. Click Add or Edit. The Add/Edit Configuration dialog opens.
b. Enter the following details:
Host/IP: Specify the value provided in the values.yaml for <externalAccessHost>.
Protocol: Select HTTPS as the protocol from the drop-down list.
Port: Specify the value provided in the values.yaml for <externalAccessPort>.
Target Env: Select CMS as the target environment from the drop-down list.
c. Click OK.
4. Launch RTSM UI from the UCMDB Local Client window.
a. In the UCMDB Local Client window, click the Label value for the OBM server that you want to access. The Log
In dialog opens.
b. In the Log In dialog, enter your login parameters.
c. Click Login. The RTSM UI opens in a new window.
Cause
This issue occurs if you have deployed multiple Hyperscale Observability content packs with the same CI types. The most recently deployed content pack overrides the previously deployed content pack. For example, if you deploy the AWS content pack and then deploy the Azure or Kubernetes content pack, the Performance Dashboard for AWS displays graphs with no data.
Currently, deploying multiple Hyperscale Observability content packs with the same CI types isn't supported.
Solution
To resolve this issue, redeploy the Hyperscale Observability content pack for the required collector as mentioned below:
Prerequisites
Make sure that you have enabled the containerized OBM capability along with Hyperscale Observability capability.
Perform the following steps to import Kubernetes content pack into OBM:
Cause
Hyperscale Observability supports one default domain with a collector. Trying to use it with multiple domains causes the discovery to fail.
Solution
You can resolve the issue by changing the value of the PROBE_DOMAIN environment variable to the DefaultDomain setting in the itom-ucmdb-probe deployment.
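As a hedged sketch of one way to make this change with kubectl (the deployment, variable, and setting names are taken from this section; the exact procedure isn't prescribed here):
# Point the probe at the default domain and let the deployment roll out the change.
kubectl -n <suite namespace> set env deployment/itom-ucmdb-probe PROBE_DOMAIN=DefaultDomain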
Cause
The cause of the issue is incorrect hostname resolution. Because of this, OBM fails to forward events from Hyperscale Observability to the OPTIC Data Lake.
Solution
To resolve this issue, configure the Data Broker to use an aliased hostname.
Additional configuration
Perform the following additional steps to add the Fully Qualified Domain Name (FQDN) to your DNS.
1. Update the Kube DNS entry on the Kubernetes deployment: Run the following command to edit the ConfigMap dns-hosts-configmap in the kube-system namespace:
kubectl edit cm -n kube-system dns-hosts-configmap
2. Update the host keys: Modify the dns-hosts-key entry in the ConfigMap to include the alias name and the IP address of
externalAccesshost . The entry should follow this format:
<IP address> <external_accesshost>_hso.<domain>
Example:
34.12.13.54 myopsb_hso.com
3. Save and apply changes: After making the changes, save the ConfigMap . Kubernetes will automatically apply the
updated DNS configuration to the cluster.
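As a hedged illustration, the edited ConfigMap might look similar to the following (only the dns-hosts-key entry and the example values from step 2 are taken from this document; the rest of the structure is assumed):
apiVersion: v1
kind: ConfigMap
metadata:
  name: dns-hosts-configmap
  namespace: kube-system
data:
  # Alias entry mapping the IP address of externalAccesshost to the FQDN alias.
  dns-hosts-key: |
    34.12.13.54 myopsb_hso.com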
You can't see the monitoring status because of the following panic error:
goroutine 1 [running]:
The following error displays in the monitoring-admin log located in the pod:
Cause
The monitoring-admin pod can't connect to the Redis pod.
Solution
The following are the solutions to fix this error:
Problem
Deleting a credential using the CLI fails sometimes.
Solution
Run the delete command again to delete a credential configuration:
Here, <filename> is the credential configuration YAML file that you want to delete.
For example:
OR
For example:
Important
Delete the credential YAML file (input.yaml in the example) after deleting the credential.
Cause
The Operations Bridge Manager (OBM) pod isn't running.
Solution
1. Verify if the OBM pod is running. The AWS collector starts sending events to OBM only if the OBM pod is running.
Make sure that the omi-0 (and omi-1 in HA) pod is running.
For example:
2. Make sure that OBM trusts the Data Broker Container. See the section Grant certificate request on OBM at Prerequisites
for using Classic OBM with Hyperscale Observability.
3. Make sure that you have deployed the MonitoringService_Threshold_Event_Mapper policy.
4. Send test data to the endpoint ( generic_event_mapper ) of the MonitoringService_Threshold_Event_Mapper policy
and verify if a corresponding event gets generated in the OBM event browser:
Run the CURL command to send test data to the policy receiver endpoint ( generic_event_mapper ):
Here:
<URL> is:
<root>
<EventDetails>
<Title>Sample message to test connectivity</Title>
<Description></Description>
<MsgKey>Config1:CI1</MsgKey>
<Category></Category>
<Subcategory></Subcategory>
<ETI></ETI>
<Node></Node>
<RelatedCI></RelatedCI>
<SubComponent></SubComponent>
<SourceCI></SourceCI>
<SourceEventId></SourceEventId>
<eventDrillDownUrl></eventDrillDownUrl>
<CloseEventPattern>^Config1:CI1$</CloseEventPattern>
<SendWithClosedStatus>false</SendWithClosedStatus>
</EventDetails>
<ViolationDetails>
<Severity>Critical</Severity>
</ViolationDetails>
</root>
For example:
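The original example isn't preserved in this extract. As a hedged sketch (the URL, port, and endpoint path are placeholders, not confirmed by this document), the payload above could be posted with curl like this:
# Save the XML payload above as payload.xml, then post it to the policy
# receiver endpoint (replace <URL> with the generic_event_mapper endpoint).
curl -k -X POST -H "Content-Type: application/xml" -d @payload.xml "<URL>"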
Verify if an event gets generated in the OBM event browser with the title "Sample message to test connectivity".
Error message
When you run the command, the following error appears:
Job Processing Failed. Err: GetDiscoveryTriggerStatusError ('Failed to get basic session credentials via sts services. error message:', 'com.amazonaws.SdkClientException: Unable to execute HTTP request: Remote host terminated the handshake
Cause
This error occurs if you haven't configured a proxy or have configured a wrong proxy.
Solution
Set the correct proxy. For details, see Configure proxy credential and target.
level=error msg="Error : [GetResourcesWithContext : RequestError: send request failed\ncaused by: Post \https://tagging.<aws_region>.amazonaws.com/\: EOF] while fetching the Resource for Service. [service_name]"
For details about the log location, see Troubleshoot Hyperscale Observability.
Cause
This issue may occur if the proxy isn't set in the AWS target configuration.
Solution
Perform the following steps:
1. Edit the target configuration created earlier and add the proxy details under spec .
Example:
spec:
  subType: aws-region
  endpoint: <aws_region>
  credential: <credential_name>
  proxy:
    url: http://<corp_proxy>:<port>/
Here,
<aws_region> is the AWS region that you want to monitor. For example, us-east-1, ap-east-1, eu-west-2 , etc.
<credential_name> is the name of the credential created earlier.
http://<corp_proxy>:<port>/ is the proxy URL that's used to connect to the internet.
2. Run the following command to update the existing target configuration:
On Linux:
On Windows:
Cause 1
When all metrics collected have null values, data isn't sent to the OPTIC Data Lake
Cause 2
You changed the metric collection frequency to 1 minute but didn't enable detailed monitoring for EC2 services on AWS.
Solution 1
This is the expected behavior. The AWS collector will send data to the OPTIC Data Lake only if the collector is able to poll at
least one metric value from AWS.
This implies that for a given frequency if data is available on the target AWS, the AWS collector will fetch the same data and
send it to the database. If the AWS collector receives null values for all the metrics, it won't send data to the database.
Solution 2
The AWS collector won't send data to the database if you have scheduled the collector to run at a frequency of less than 5 minutes and haven't enabled detailed monitoring.
For monitoring at a higher frequency of 1 minute, you must subscribe to AWS CloudWatch Detailed Monitoring Metrics (at a 1
minute frequency). For details, see Enable or turn off detailed monitoring for your instances.
Make sure you factor in the high frequency polling when you plan for detailed monitoring.
The following table gives you information about the Hyperscale Observability Metric Collection Interval and the AWS
CloudWatch Metric Interval:
Simple Storage Service (S3): Hyperscale Observability metric collection interval - 5 minutes for buckets and requests; AWS CloudWatch metric interval - 24 hours for buckets and 1 minute for requests
Cause 1
The AWS collector is unable to reach the target AWS account if you haven't configured a proxy or configured a wrong proxy.
Cause 2
Wrong access key, secret key, or AssumeRole ARNs.
Cause 3
AWS account doesn't have the ReadOnlyAccess permission to connect to the CloudWatch APIs.
Cause 4
Discovery failed because of timeout
Solution 1
1. Run the command:
If you are missing or using the wrong proxy, you will see the following error:
Job Processing Failed. Error: GetDiscoveryTriggerStatusError ('Failed to get basic session credentials via sts services. error message:',
'com.amazonaws.SdkClientException: Unable to execute HTTP request: Remote host terminated the handshake
2. Set the correct proxy. For details, see Configure proxy credential and target.
Solution 2
1. Run the command:
If you have configured a wrong access key, secret key, or AssumeRole ARNs, you will see the following error:
If discovery fails:
corresponding to Invalid Access key : "The security token included in the request is invalid. (Service: AWSSecurityTokenService;
Status Code: 403; Error Code: InvalidClientTokenId; Request ID: 1cd7d66a-34c9-4b19-ae72-3bb3a5780d7b;"
corresponding to Role Arn : "User: arn:aws:iam::<account ID>:user/CloudMon_User is not authorized to perform: sts:AssumeRole
on resource: arn:aws:iam::<account ID>:role/CloudMon_EC3 (Service: AWSSecurityTokenService; Status Code: 403; Error Code:
AccessDenied; Request ID: 61064e1d-18d1-4c58-bea8-217faf6b0cb6";
Here, <filename> is the credential configuration YAML file that you have to update. If prompted, enter the password for the user that you configured in the Set up monitoring CLI step. After the configuration gets posted, the change gets propagated to all running collectors using this credential.
For example:
Important
Delete the credential yaml file ( input.yaml file in the example) after updating the
credential.
Solution 3
1. Run the command:
If you haven't assigned a role with the ReadOnlyAccess policy to an AWS user, you may get the following errors:
[AccessDeniedException:
User: arn:aws:iam::<owner_id>:user/<User1> is not authorized to perform:
tag:GetResources\n\tstatus code: 400,
[AuthorizationError:
User: arn:aws:iam::<owner_id>:user/<User1> is not authorized to perform:
SNS:GetTopicAttributes on resource: arn:aws:sns:us-east-1:<owner_id>:SNS-TOPIC-TEST\n\tstatus
code: 403
2. Assign a role with the ReadOnlyAccess policy to an AWS user to monitor your AWS resources.
Solution 4
If discovery takes longer than the configured discovery frequency, it may time out in the first run but will succeed in later runs.
Adjust the discovery frequency (see Modify frequency of discovery and metric collection).
Add stricter tags to reduce the monitored instances per collector configuration.
Note
If the " ops-monitoring-ctl get collector-status " command returns " NA " as the status, it indicates that the collector has not
run yet.
Cause 1
Either the protocol, host, or port that you have specified in the URL is wrong.
Cause 2
IDM username or password is wrong.
Cause 3
The syntax in the YAML file is wrong.
Cause 4
You require a proxy to connect to the AI Operations Management server from the ops-monitoring-ctl CLI.
Solution 1
1. Make sure the URL is in the following format: https://<host>:<port>
2. Make sure that the values specified for protocol, hostname, and port are correct.
3. Run the command to set URL:
Solution 2
Run the command to set the correct IDM username and password:
Examples:
Solution 3
You may get the following error: "mapping values are not allowed in this context"
Use any online YAML validator tool to check and correct the syntax of the yaml file.
Solution 4
Use the ops-monitoring-ctl CLI from a server that has direct access to AI Operations Management.
Cause 1
AWS allows multiple AWS resources to have the same name. If you have defined multiple resources on AWS running with the same name, then you will see corresponding CIs with the same names.
Cause 2
When AWS resources are terminated, sometimes uCMDB doesn't delete the corresponding CI. This happens if the uCMDB probe restarts unexpectedly and the AWS resource gets terminated before the next scheduled discovery run (by default, one hour).
Solution 1
The CIs are not duplicates and represent actual running instances on AWS. This is the expected behavior.
Solution 2
The terminated CIs (that are not deleted in uCMDB) are not displayed in PD views because the views filter on the attribute MonitoredBy=MonitoringService in CIs. The AWS collector removes the MonitoredBy=MonitoringService attribute from CIs that are terminated.
After the default aging period, the terminated CIs are deleted automatically in uCMDB.
Cause
The threshold configurations, the JsonLogic expressions, or the classifications that you have used to generate events are wrong.
Solution
Before deploying a newly created threshold configuration, use the ms-helper-util tool to test the threshold
configurations and the JsonLogic expressions.
Cause
The Data Broker Container is not able to connect to the Operations Bridge Manager (OBM).
Solution
Try to post sample data to the Data Broker and check whether events are generated:
<root>
<EventDetails>
<Title>Sample message to test connectivity</Title>
<Description></Description>
<MsgKey>Config1:CI1</MsgKey>
<Category></Category>
<Subcategory></Subcategory>
<ETI></ETI>
<Node></Node>
<RelatedCI></RelatedCI>
<SubComponent></SubComponent>
<SourceCI></SourceCI>
<SourceEventId></SourceEventId>
<eventDrillDownUrl></eventDrillDownUrl>
<CloseEventPattern>^Config1:CI1$</CloseEventPattern>
<SendWithClosedStatus>false</SendWithClosedStatus>
</EventDetails>
<ViolationDetails>
<Severity>Critical</Severity>
</ViolationDetails>
</root>
Cause
Modifications to default threshold configuration files get overridden if you restart the monitoring-admin pod.
Solution
Don't modify the default threshold configuration files. You can either save the default thresholds with a different name and
then modify them, or create new thresholds. For details, see Create your own thresholds for Hyperscale Observability
collectors.
Cause
Change in the default dashboards. The default dashboards will change if you deploy a Management Pack after deploying
Hyperscale Observability content pack.
Solution
Follow the steps:
Cause
Not known.
Solution
You can resolve the issue by using the '*' wildcard, like ECS/*.
Cause
You may have configured Hyperscale Observability to collect only specific metrics.
Solution
Follow the steps:
Note
Certain metrics occur only for a specific configuration of instances in AWS. For example: If an EBS volume is created, only a few
common metrics are collected for all the volume types and few metrics are collected for specific types of volumes.
1. Run the following command to check the list of services that you are monitoring or the metrics that you are collecting:
You will see a list of services under the metricConfig section along with the metrics (if defined) which are required for
the collector. For a list of metrics that are required by Performance Dashboards, see Metrics available for
visualization on Performance Dashboard (PD) section on the AWS collector configuration page.
2. Update the collector configuration to include the metrics required for PD visualization, see AWS collector configuration.
time="2024-01-16T09:22:31Z" level=error msg="error loading graphQL mapping file for service: datafactoryservices" container=azure
-collector file="service_mapping.go:205" func="mapping.LoadGraphQLMappingDynamically()"
time="2024-01-16T09:22:31Z" level=error msg="error loading graphQL mapping file for service: servicebus" container=azure-collector
file="service_mapping.go:205" func="mapping.LoadGraphQLMappingDynamically()"
time="2024-01-16T09:22:31Z" level=error msg="error loading graphQL mapping file for service: datalakestores" container=azure-colle
ctor file="service_mapping.go:205" func="mapping.LoadGraphQLMappingDynamically()"
Cause
The discovery fails because of an error while loading the GraphQL mapping file for the discovery service.
Solution
1. Go to the config volume NFS location. For example, /var/vols/itom/<opsbvol1>/azure-collector/content.
2. Enter the following commands to remove these three topic mapping files from that folder:
rm -rf topic_mapping_servicebus.json
rm -rf topic_mapping_datafactoryservices.json
rm -rf topic_mapping_datalakestores.json
ops-monitoring-ctl get ms
...
azur1e [azure] ENABLED discovery recurring Discovery Collection Partially Completed on 01 Oct
24 11:07 IST
When you try to view the Azure service monitoring status by running the command:
....
state: |-
Resources discovered Partially
...
Cause
This issue occurs when the Discovery Collector tries to discover a deprecated Microsoft Azure service.
Solution
You can safely ignore this error.
Cause
The underlying APIs that fetch the data for PT graphs fail.
Solution
Refresh the browser.
Cause
The events aren't triggered due to a known issue with default static thresholds for Kubernetes infrastructure objects.
Solution
To resolve this issue, follow these steps:
pod_limits_cpu_cores
pod_limits_mem_b
Cause
This issue occurs if the value of pod_limits_cpu_cores or pod_limits_mem_b metric is set to zero.
Solution
To resolve this issue, you must omit ootb-kubernetes-pods-cpu-util-vs-cpu-limits and ootb-kubernetes-pods-mem-util-vs-
mem-limits thresholds from the Kubernetes collector configuration. Follow these steps:
1. Edit the Kubernetes collector configuration file to remove the ootb-kubernetes-pods-cpu-util-vs-cpu-limits and ootb-kubernetes-pods-mem-util-vs-mem-limits thresholds:
Example:
apiVersion: core/v1
type: collector
metadata:
  tenant: public
  namespace: default
  name: <unique_collector_name>
  displayLabel: <display_label>
  description: <description>
spec:
  subType: k8s
  enabled: <true or false>
  targets:
    - <k8s-target_name>
  thresholds:
    - ootb-kubernetes-daemonset-misscheduled-count
    - ootb-kubernetes-pods-status-phase-window-base
    - ootb-kubernetes-clusters-cpu-util-vs-cpu-allocatable
    - ootb-kubernetes-namespaces-cpu-util-vs-cpu-limits
    - ootb-kubernetes-nodes-mem-util-mem-allocatable
    - ootb-kubernetes-daemonset-current-vs-desired-scheduled-daemon-pod
    - ootb-kubernetes-nodes-cpu-util-vs-cpu-allocatable
    - ootb-kubernetes-nodes-memory-pressure-status
    - ootb-kubernetes-pods-status-phase
    - ootb-kubernetes-nodes-pid-pressure-status
    - ootb-kubernetes-nodes-disk-pressure-status
    - ootb-kubernetes-pvc-status-phase-window-base
    - ootb-kubernetes-pv-status-phase-window-base
    - ootb-kubernetes-nodes-net-unavailable-status
    - ootb-kubernetes-namespaces-mem-util-vs-mem-limits
    - ootb-kubernetes-clusters-mem-util-vs-mem-allocatable
    - ootb-kubernetes-pvc-status-phase
    - ootb-kubernetes-nodes-kubelet-ready-status
    - ootb-kubernetes-pv-status-phase
    - ootb-kubernetes-deployment-unavailable-replica-count
  collectionModes:
    - collectionType: pull
x509: certificate is valid for kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, openshift, openshift.default, openshift.default.svc, openshift.default.svc.cluster.local, 172.30.0.1, not example.net.net
Cause
This issue occurs if the endpoint specified in the Kubernetes target configuration doesn't match the subject name in the
certificate of the Kubernetes cluster that you want to monitor.
Solution
Follow these steps to resolve this issue:
1. Log in to the Kubernetes node and run the following command to get the certificate:
2. Run the following command to list the subject name and alternate name of the saved certificate (for one way to perform steps 1 and 2, see the openssl sketch after this procedure):
Example output:
Subject: CN = <DNS>
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Key Identifier:
76:5B:75:F0:B5:30:C6:D4:6F:A3:1D:7C:E7:59:70:DD:62:64:88:33
X509v3 Authority Key Identifier:
keyid:F5:F5:E1:33:86:C9:05:7A:A5:38:5A:0E:24:3C:78:09:3E:8F:2C:FA
3. Open the target credential file and add serverName parameter. You can specify any of the subject alternative names
listed in the previous step as the value of serverName .
Example:
Based on the example in step 2, you can specify any of the following as the serverName :
kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, openshift, openshift.default, openshift.default.svc, openshift.default.svc.cluster.local.
---
apiVersion: core/v1
type: target
metadata:
  tenant: public
  namespace: default
  name: k8s-openshift
spec:
  subType: k8s
  endpoint: <endpoint_FQDN>
  credential: openshift
  context:
    tls:
      serverName: openshift
  proxy:
    url: myproxy.net:8080
Solution
To rectify the error in the InnoDB Reads, Writes and Fsync graph, perform the following steps:
1. Go to the MySQL Instance Overview page (on the Kubernetes Application Summary page, select the MySQL service page and then drill down to any one instance), navigate to the MYSQL innodb data Read, Write, and Fsync graph, and click More Actions.
2. Click Edit, expand the Predefined query dropdown, and select Edit the Predefined Query.
3. In the SQL Query space, replace the existing query with the query below:
Note
After running the query, if you see an error that the query results are empty, go to Administration > Dashboards & Reports > Predefined queries, search for Time Period(Calendar), click Edit, change the Defaults value from 12 hours to 12 months, and click Save.
5. In the Visualization dropdown, remove the default Table option and select Time-series Line instead.
6. In the Time field add dt, and in the Metric values field add all the remaining values from the dropdown, namely 'Fsyncs', 'Pending Fsyncs', 'Pending Reads', 'Pending Writes', 'Reads', 'Writes'.
7. Click Close.
8. On the Kubernetes Application Summary page, click and select Save. The graph will now display the correct
values.
Solution
To rectify the error in the count on the Total Namespaces Count widget in the Kubernetes Cluster Instance Overview page, perform the following steps:
1. Navigate to the Total Namespaces Count widget (go to the Kubernetes Cluster Instance Overview page and click any available cluster on the summary page) and click More Actions.
2. Click Edit, expand the Predefined query dropdown, and select Edit the Predefined Query.
3. In the SQL Query space, replace the existing query with the query below:
SELECT dt, cnt FROM (
  SELECT TIME_SLICE(to_timestamp(timestamp_utc_s), 300) AS dt,
         count(distinct(resource_name)) AS cnt
  FROM kubernetes_namespaces
  WHERE timestamp_utc_s > (EXTRACT(epoch FROM ${${Calendar:start}}::TIMESTAMPTZ))
    AND timestamp_utc_s < (EXTRACT(epoch FROM ${${Calendar:end}}::TIMESTAMPTZ))
    AND ${cluster_name IN (${monSvcClusterName})}
    AND ${collection_policy_name IN (${monSvcK8SConfig})}
    AND ${labels LIKE '%' || ((${monSvcK8SClusterTags})) || '%'}
  GROUP BY dt
) AS op1 ORDER BY dt DESC LIMIT 1
5. Click More Actions and then select Refresh. The graph will now display the correct count.
Perform the same steps to rectify the wrong count on the Total Count widget in the Kubernetes Namespace List page.
Example:
/var/vols/itom/opsbvol4/ucmdb/probe/vcenter-probe/communicationLog/633c58e2-d6e9-4aa8-b492-c39b375b30e3/VMware VirtualCenter Topology with scope by VIM/ex_135b59795d3261db4a2a08b0b863f93f.record
Probe log
Path to access the probe logs:
/var/vols/itom/<log-volume>/ucmdb/probe/vcenter-probe/log
Example:
/var/vols/itom/opsbvol4/ucmdb/probe/vcenter-probe/log
probe-error.log
RemoteProcesses.log
Server log
Path to access the server logs:
/var/vols/itom/<log-volume>/ucmdb/server/itom-ucmdb-0
Example:
/var/vols/itom/opsbvol4/ucmdb/server/itom-ucmdb-0
cmdb.reconciliation.log
cmdb.reconciliation.error.log
mam.autodiscovery.log
error.log
Log volume
kubectl get pv
Job ID
time="2024-03-28T11:23:16Z" level=error msg="Failed to activate zone. Error : [Action: activate zone , Resource: , Status Code: 500,
Request Status: Server internal error, Response: {\r\n \"errorCode\" : 500,\r\n \"errorSource\" : null,\r\n \"message\" : {\r\n \"code\" :
11001014,\r\n \"parameter\" : null,\r\n \"description\" : \"Zone based discovery is not enabled.\",\r\n \"errorParametersValues\" : nu
ll,\r\n \"errorMap\" : null,\r\n \"parametrizedError\" : false\r\n },\r\n \"details\" : null,\r\n \"recommendedActions\" : null,\r\n \"nested
Errors\" : null,\r\n \"data\" : null\r\n}]" JobID=3456721c-71e1-410b-bb56-ba6b7b648b70 JobName=sg-vcenter-collector-001 JobType=di
scovery JobUnitID=84c62084-93a6-4210-9dbd-274784fae1e2 container=vcenter-discovery-collector file="zone.go:286" func="ud.Switc
hZoneState()"
Cause
This error occurs when zone-based discovery is not enabled in the UCMDB Web UI.
Solution
You can resolve the error by enabling zone-based discovery in the UCMDB Web UI, following these steps:
Note
Only administrators have permission to enable the zone-based discovery solution in the UCMDB Web UI.
Caution
Once the UCMDB Web UI zone-based discovery solution is enabled, the existing discovery running in the UCMDB Web UI will no
longer take effect and you must start using UCMDB Web UI for configuring discovery. The existing discovery configuration (apart
from the Data Flow Probe Setup) won't be migrated and needs to be re-created in the new discovery solution.
You can enable the UCMDB Web UI zone-based discovery solution from either of the following:
You may have to log in with a user name (default: sysadmin) and password.
Parameter: name
Value: appilog.collectors.enableZoneBasedDiscovery
4. Click Invoke.
Task Generator reads the flag and switches to follow the new discovery logic
Result Processing reads the flag and switches to follow the new discovery logic
Key words to check if the discovery works in the new discovery logic:
Solution
You can only create, enable/disable, or delete a configuration using the Ops-monitoring-ctl command. Updating or modifying a collector configuration using command-line parameters isn't currently supported. To update or modify the configuration, you must use a YAML input file.
Example:
Cause
The issue occurs when the AEC content pack isn't upgraded yet.
Solution
To resolve the issue, perform the following steps to upgrade the Automatic Event Correlation (AEC) content pack:
./reloadAecContent.sh
Enter the namespace in which Operations Bridge is running: opsbridge-helm
Please enter the password for user 'admin'
Enter password: ***********
..
Content pack was added successfully.
After you have upgraded the AEC content pack, you can launch the URLs again.
Cause
This issue occurs if the automatic upgrade of AEC schema fails. For more information, see Post-upgrade configurations.
Solution
Follow these steps to update the schema manually:
SELECT MAKE_AHM_NOW();
ALTER TABLE itom_analytics_provider_default.aiops_internal_root_cause_score DROP COLUMN id CASCADE;
Example output:
2. Execute the commands on a node where kubectl is configured to control the AI Operations Management deployment
and restart the itom-analytics-root-cause pod using the following sequence:
b. Restart pod:
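The individual commands aren't reproduced above. As a minimal sketch, assuming the pod is managed by a controller so Kubernetes recreates it after deletion:
kubectl -n <SUITE_NAMESPACE> get pods | grep itom-analytics-root-cause
kubectl -n <SUITE_NAMESPACE> delete pod <itom-analytics-root-cause-POD>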
1. itom-analytics-aec-pipeline-jm- : This pod is the Job Manager that handles Pulsar connectors and overall pipeline setup and
teardown.
2. itom-analytics-aec-pipeline-tm- : This pod is the Task Manager where the main pipeline logic runs.
3. itom-analytics-flink-controller- : This pod is the Flink controller that provides an improved Kubernetes integration to handle
upgrades and self-healing logic.
Note
The itom-analytics-aec-pipeline-jm- and itom-analytics-aec-pipeline-tm- pods constitute an Apache Flink cluster for
distributed data processing.
1. On a Kubernetes control plane node, execute the following command to edit the pipeline's Config Map (CM):
kubectl -n <SUITE_NAMESPACE> edit cm itom-analytics-aec-flink-cm
To undo the scale down of the AEC pipeline, edit the CM again and remove the annotation.
For example,
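A sketch of what the suspend marker could look like in the itom-analytics-aec-flink-cm config map; the key is the one referenced later in this section, and its exact placement under metadata.annotations is an assumption:
metadata:
  annotations:
    flink.aiops.microfocus.com/suspend: "true"   # assumed placement; remove this line to undo the scale down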
If an installation selects /var/vols/itom/opsbvol1 as the NFS log volume, the NFS path is:
/var/vols/itom/opsbvol1/itom-analytics/aec-pipeline/
job_manager.log: This log file will contain information about the AEC pipeline's internal checkpoints.
task_manager_0.log: This log file will contain information about the Pulsar connectors.
aec_pipeline.log: Under normal operation, this log file contains AEC pipeline start-up and shut-down information.
If the pods are failing, these log files provide additional insight into the cause.
aec:
  deployment:
    flinkPipeline:
      jobManagerResources:
        memory: "512"
      taskManagerResources:
        memory: "2048"
The values for memory must be surrounded with double quotes and must not include units. They are assumed to be
mebibytes (MiB).
You can perform the Helm upgrade with the modified YAML file. This will ensure that the values are propagated to the
necessary Kubernetes resources.
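A minimal sketch of the upgrade command, following the same pattern used elsewhere in this guide (substitute your own deployment name, namespace, chart, and modified values file):
helm upgrade <deployment name> -n <suite namespace> <suite deploy chart with version.tgz> -f <modified values.yaml>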
Advanced tuning
Since the AEC pipeline runs on top of Flink, most of its configuration properties can be adjusted, except the following properties. For more information on Flink configuration properties, see the Apache Flink Configuration documentation.
You can add the other properties, as described in the Apache Flink Configuration documentation, in the YAML file under aec.deployment.flinkPipeline.additionalConf.
For example, you can tune Akka's internal frame size by providing the following properties in the YAML file:
aec:
  deployment:
    flinkPipeline:
      additionalConf:
        akka.framesize: "20971520b"
Related topics
Configure values.yaml
Apache Flink Configuration
Cause
For a non-default UI Foundation context root, the AEC Explained content upload fails. Therefore, the aec_admin role isn't uploaded to UI Foundation and isn't created in IDM.
Solution
To create the aec_admin role in IDM, ensure that the content upload to UI Foundation is successful.
Verify that the content upload to UI Foundation was successful by checking the logs of the aec-explained-ui-uploader container in the itom-analytics-aec-explained deployment, and then do the following, as required:
a. Access the AEC Explained UI by directly accessing UI Foundation via the URL.
b. Identify a user who is part of the Administrators group with admin rights and privileges.
Note: A user without admin rights and privileges can still access the AEC Explained UI in this release but can't
see any events or AEC data.
c. If you can see the AEC Explained UI in UI Foundation, then there was a problem with UI Foundation creating
the aec_admin role in IDM.
e. If you can't find the issue in the bvd-explore logs, you can create the aec_admin role manually in IDM.
a. Verify that the bvd explore context root has a forward slash (/).
2022-01-10T19:49:38.356Z INFO Trying to delete bvd explore config files with aec- prefix
b. If the bvd explore context root has no forward slash, edit the bvd-config configmap and add a forward slash to the exploreContextRoot. Ensure that the bvd explore context root appears as follows:
bvd.exploreContextRoot: /preview
c. You can also set the contextRoot to its default value (/ui) by upgrading your deployment:
Cause
After you launch UI Foundation, the AEC Explained UI isn't available. This can happen if you're using Internet Explorer, which isn't supported. If you're using any other browser to access the UI, you can see the AEC Explained category on the left side under the Search icon.
Solution
If the AEC Explained category is missing, verify the aec-explained-ui-uploader container logs in the itom-analytics-aec-
explained pod. You can find the URL where the AEC Explained UI is uploaded and whether there are any upload errors. You
can restart uploading the configuration files by deleting the pod, itom-analytics-aec-explained .
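As a sketch, assuming kubectl access to the suite namespace, you could check the uploader logs and delete the pod like this (the pod and container names are taken from the text above):
kubectl -n <SUITE_NAMESPACE> logs deploy/itom-analytics-aec-explained -c aec-explained-ui-uploader
kubectl -n <SUITE_NAMESPACE> delete pod <itom-analytics-aec-explained-POD>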
Cause
You can't see the translation resources because UI Foundation can't query the localization files from the static-files-container in
the itom-opsb-resource-bundle pod.
Solution
Check whether the itom-opsb-resource-bundle pod is reachable, or ignore the error. If you ignore it, the UI appears in the default language, English.
Cause
The AEC Explained UI shows the AEC Explained page information but without any data. This signals that the AEC
Explained back end has problems querying your Vertica database or processing data.
Solution
You can check the logs of the aec-explained-service container in the itom-analytics-aec-explained pod for any errors. If there are
no errors in the back end, you can also trigger a redeployment of the UI by deleting the itom-analytics-aec-explained pod.
Cause
When you try to cross-launch the AEC Explained UI from the OBM Event Browser, the AEC Explained UI isn't launched. This means that the AEC Explained URL tools aren't properly configured in OBM. This can result from problems during the AEC integration or suite upgrade.
Solution
Perform the following steps in OBM:
1. From the OBM menu, navigate to Administration > Operations Console and click Tools.
2. Click ConfigurationItem and edit the Show Correlation Group Details (AEC Explained), Show Occurrence Details (AEC Explained), and Launch AEC Explained URL tools that are integrated with the AEC content pack.
Note
If the setting variable does not exist, you can edit the URL tools directly. Replace ${setting.integrations.uif.url} with the
URL of your UI Foundation deployment (including the UI Foundation context root). The result should look like
this: https://my.hostname.com:443/ui/aec-overview
Solution
You must manually clear the <OPSB_DATA_VOLUME>/itom-analytics/aec-pipeline/state directory (delete Flink's state) if
Automatic Event Correlation (AEC) fails to recover from the checkpoint and keeps restarting.
AEC's Flink periodically clears this data, but if disk space utilization keeps increasing (for example, beyond 10 GB), set up a cron job to restart the pipeline once a day.
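A minimal sketch of such a cron entry, assuming a host with kubectl access and using pod deletion as the restart mechanism; the schedule and pod names are placeholders:
# Restart the AEC pipeline pods every day at 02:00 (illustrative schedule and pod names)
0 2 * * * kubectl -n <SUITE_NAMESPACE> delete pod <itom-analytics-aec-pipeline-jm-POD> <itom-analytics-aec-pipeline-tm-POD>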
Cause
To configure Automatic Event Correlation (AEC), you set a user name while configuring the connected server (to forward events to OPTIC Data Lake) and while configuring the endpoints using the call-analytics-datasources tool. AEC will fail if the user name entered in the connected server for OPTIC Data Lake does not match the user name entered in the call-analytics-datasources tool.
Solution
Run the following command to check if there is a mismatch in the user names:
For example:
Sample output:
Related topics
To configure automatic event correlation, see the OBM Configurator Tool.
For more details about the call-analytics-datasources tool, see Configure endpoints for Automatic Event Correlation.
Cause
A timeout error occurs in environments that contain a large correlation graph with 5 million or more entries in the itom_analytics_provider_default.aiops_internal_correlation_graph table.
Solution
To solve the issue, increase the timeout of the job by adding the batch-job.timeout-minutes key to the itom-analytics-config config map with a value greater than 60 (the default).
For example:
apiVersion: v1
data:
  batch-job.timeout-minutes: "120"
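As a sketch, you could add the key by editing the config map directly, following the same kubectl edit pattern used elsewhere in this section:
kubectl -n <SUITE_NAMESPACE> edit cm itom-analytics-config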
itom-analytics-text-clustering-server
itom-analytics-root-cause
itom-analytics-aec-pipeline-tm
Solution
To resolve this issue, create two new resource pools for AEC by following these steps:
CREATE RESOURCE POOL aec_interactive_pool_provider_default MAXCONCURRENCY NONE MEMORYSIZE '<memory size>' MAXMEMORYSIZE '<max memory size>' PRIORITY 0 RUNTIMEPRIORITY HIGH;
CREATE RESOURCE POOL aec_background_pool_provider_default MAXCONCURRENCY NONE MAXMEMORYSIZE '<max memory size>' PRIORITY 0 RUNTIMEPRIORITY MEDIUM;
GRANT USAGE ON RESOURCE POOL aec_interactive_pool_provider_default TO <read-write user name> WITH GRANT OPTION;
GRANT USAGE ON RESOURCE POOL aec_background_pool_provider_default TO <read-write user name> WITH GRANT OPTION;
For example:
4. Update the valuesBackup.yaml file with the parameters to include the resource pools for AEC:
aec:
  deployment:
    vertica:
      aecBackgroundResourcepool: aec_background_pool_provider_default
      aecInteractiveResourcepool: aec_interactive_pool_provider_default
helm upgrade <deployment name> -n <suite namespace> <suite deploy chart with version.tgz> -f valuesBackup.yaml
1.18.13. itom-analytics-opsbridge-notification pod fails with OOMKilled error
After the upgrade (from 2023.05 to 23.4), the itom-analytics-opsbridge-notification pod has an OOMKilled error and the aiops_correlation_event topic has a high backlog.
Cause
During the upgrade, if the post-upgrade task isn't performed on time, the root-cause service sends too many correlation
events to the Notification Container, resulting in a high backlog that blocks the component.
Solution
Follow these steps to clear the backlog:
time="2024-03-04T19:44:59Z" level=warning msg="could not get partitions metrics" component=server error="could not get partition
s metrics from database: could not get partitions: ERROR 6999: [42V13] The library [VFunctionsLib] for the function [L ││ ISTAGG(varch
ar)] was compiled with an incompatible SDK Version [11.0.1]"
Problem
During the Vertica upgrade, a few packages aren't upgraded successfully.
Solution
Force install the packages by running the following command:
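The exact command isn't reproduced above. A minimal sketch using Vertica's admintools, run as the database administrator; the options and package selection are assumptions, so adjust them for your environment:
/opt/vertica/bin/admintools -t install_package -d <database_name> -p <database_password> --package default --force-reinstall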
itom-analytics-ea-config-
itom-analytics-datasource-registry-
When you check their logs, you may see lines similar to the following example:
Cause
This issue occurs when the entropy is low.
To confirm that entropy is low, use SSH to run the following command on each Kubernetes worker:
cat /proc/sys/kernel/random/entropy_avail
If the number returned by the command is below 1000 on any of the Kubernetes workers, entropy is too low.
Solution
The solution depends on your Kubernetes setup.
The Kubernetes cluster isn't an OpenShift cluster: Use the package manager of your operating system to install the rng-tools package on each Kubernetes worker.
The Kubernetes cluster is an OpenShift cluster using Red Hat Enterprise Linux (RHEL) for the workers: Use the yum package manager to install the rng-tools package on each Kubernetes worker.
The Kubernetes cluster is an OpenShift cluster using Red Hat CoreOS (RHCOS) for the workers: Contact Red Hat support to learn about the available options to enhance the entropy pool. You can deploy custom solutions, for example using "haveged", but you must understand the security implications in that case.
1. Depending on your Kubernetes setup, start the service that you have added to the workers. For example, if you installed rng-tools, execute the following commands:
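A minimal sketch, assuming the rng-tools package provides the rngd systemd service on your distribution:
sudo systemctl enable rngd
sudo systemctl start rngd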
2. To check that the entropy pool has increased, run the following command again:
cat /proc/sys/kernel/random/entropy_avail
Cause
Since AEC relies on the topology pushed by the Data Flow Probe to the OPTIC Data Lake, the topology partitions in AEC don't match the RTSM.
Solution
Clear the current topology and reload it from the OPTIC Data Lake:
1. Delete the forwarded topology from the OPTIC Data Lake and the AEC partitions:
2. Do a full topology sync from RTSM/UCMDB. Go to Data Flow Management > Integration Studio and click Full Synchronization for the integration point that forwards topology to the OPTIC Data Lake.
After 10–20 minutes, the AEC partitions reload and will be visible in the AEC Explained UI.
Cause
The flink-controller controls the AEC pipeline pods ( itom-analytics-aec-pipeline-jm and itom-analytics-aec-pipeline-tm ). It scales the pods down during suite upgrades and should scale them back up again. If it fails to scale them back up, the flink-controller may scale them down again when you manually scale them up.
Solution
Edit the itom-analytics-aec-flink-cm configmap to delete the flink.aiops.microfocus.com/suspend: "true" line:
Once the config map is saved, the flink-controller should scale up the pipeline pods. If not, scale them back up manually.
Note
Solution
Overall, the scaling requirement for Edge is high. To ensure optimum resource utilization, scale down the following pods that aren't required for the OBM agent proxy:
itom-monitoring-admin-xxxx
itom-opsbridge-cs-redis-xxxx
itom-monitoring-snf-xxxx
itom-monitoring-collection-manager-xxxx
itom-monitoring-job-scheduler-xxxx
credential-manager-xxxx
itom-postgresql-xxxx
itom-vault-xxxx
itom-resource-bundle-xxxx
itom-ingress-controller-xxxx
itom-reloader-xxxx
itom-idm-xxxx
1. Run the following command to copy the proxy file and port file to the data broker container:
Example:
Example:
Example:
Example:
Sample output:
NumDataSources = 2
SCOPE
CODA
Cause
Authorization or node discovery may fail due to any of the following reasons:
Solution
To identify the reason for the failure, run the following command:
kubectl exec -it -c itom-monitoring-oa-discovery-collector -n $(kubectl get pods -A | awk '/itom-monitoring-oa-discovery/ {print $1, $2}') -- bash
Sample commands:
If the authentication succeeds, the following output appears with the token:
{
"token" : "eyJ0eXAiOiJKV1QiUzI1NiJ9.eyJ1bmlxdWVfc2FsdCI6InBsImV4cCI6MTY4MjUwNzc4MiwicmVwb3NpdG9yeSI6IlVDT
UR.r25PGw-_tBw-YBzu-uB78kKZ5NKvh1vaEvpO9Uu2LU8",
"isProviderCustomer" : true
}
Note down the token from the output and run the following command:
{
"cis" : [ {
"ucmdbId" : "4b849c138fb6bb2aa18fa4576ed",
"globalId" : "4b849c138fb6bb2aa18fa4576ed",
"type" : "ip_address",
"properties" : {
"display_label" : "1XX.1X.X.1XX",
"authoritative_dns_name" : "CI Name",
"ip_address_type" : "IPv4"
},
"attributesDisplayNames" : null,
"attributesQualifiers" : null,
"displayLabel" : null,
"label" : "IpAddress"
....
}
If the authentication doesn't succeed or nodes aren't discovered, resolve the issue as follows:
Cause
The data is missing in the aggregate tables.
The data isn't copied from the raw tables to the aligned tables, or isn't aggregated.
Solution
Before you begin
Check if the data is available in the raw table for the required duration:
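The query itself isn't reproduced above. As a sketch, assuming an Operations Agent raw table in the mf_shared_provider_default schema with epoch-second timestamps (the table placeholder and interval are assumptions, adjust them to your case):
SELECT count(*) FROM mf_shared_provider_default.<opsb_agent_raw_table>
WHERE timestamp_utc_s > EXTRACT(epoch FROM now() - INTERVAL '1 day');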
If the query specified in the previous step returns records, download and run the process_historic_raw_data.sh script, see Script
to copy and reaggregate data for sysinfra and event tables.
Important
You must run the process_historic_raw_data.sh script as the same user that installed OMT. Running the script as a different
user will result in errors.
If the script displays an error message when uploading the content, manually upload the content. For more information,
see CLI to manage content.
After the script runs (immediately or at the scheduled time), wait for a few minutes and then verify if the missing data is
available in the sysinfra tables:
Check the historical task status for the metric that you specified when running the process_historic_raw_data.sh script.
Example:
In this example, the metric specified is avail .
Note
If you have specified more than one metric, you must check all the respective
taskflows.
5. Make sure all the historical tasks are in FINISHED state. For example, based on the example in previous step, check the
state of tasks like opsb_agtnodesh_id, opsb_sysavldlyh, and opsb_sysinfra_node_hist_1h.
6. If the historical tasks are in FINISHED state, run the query to check if the sysinfra table has all the records.
Example:
7. Repeat the query specified in the previous step to check the records in hourly and daily tables.
If the historical tasks are in READY , SCHEDULED , DISPATCHED , or RUNNING state for a long time, follow the steps mentioned
in Aggregate table has missing or no data and System Infrastructure Availability data is missing in reports topics.
If the data is still missing in sysinfra tables, collect debug information from the process_historic_raw_data.sh.<timestamp>.log file
and contact Support.
quexserv.error.query.nosuch.host
Cause
This issue occurs because the OPTIC Data Lake Vertica database isn't connected to Operations Cloud, so the query for the report doesn't run.
Solution
Follow these steps to update and validate the database connection:
2. Open the side navigation panel and click Administration > Dashboards & Reports > Predefined Queries.
Host name: You can connect to either a single Vertica host or a Vertica cluster. If you want to connect to a Vertica
cluster, enter the host names as a comma separated list. This ensures that the user interface uses the live node from
the cluster. If you are using an embedded Vertica, use itom-di-vertica-svc .
Port: You set the Vertica port in the vertica.port parameter in the values.yaml file during the application installation. The
default port is 5433.
Security: The Enable TLS for secure communication check box gets cleared if you set the vertica.tlsEnabled
parameter to false in the values.yaml file during the application installation. For more information, see the section
Vertica in the Configure values.yaml page.
Database name: You set the Vertica database name in the vertica.db parameter in the values.yaml file during the
application installation. The default is itomdb .
Login: You may set the Vertica read only user login name in the vertica.rouser parameter in the values.yaml file during
the application installation. The default is vertica_rouser .
Password: Set to the password of the Vertica read only user.
Confirm password: Type the Vertica read only user password again to confirm.
5. Click TEST CONNECTION to test the connection. TEST CONNECTION must be successful.
6. Click SAVE SETTINGS.
7. Refresh the report to see the data.
Possible causes
Cause 1: The queries for corresponding reports aren't returning any data.
Cause 2: The tables required for corresponding report queries don't have data.
Cause 3: Data is present in raw tables; the issue is with OPTIC DL pods.
Cause 4: Data isn't available in OPTIC DL Message Bus topics.
Solution
Enter the following URL on a browser and log in using the credentials: https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
1. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
2. Click and then click DB CONNECTION. The SET UP DB CONNECTION pane appears.
3. Give the Vertica database details:
Hostname: If Vertica is a cluster, enter the host names of the cluster nodes separated by commas
Port: The default port is 5433
TLS: The Use TLS check box is cleared if you set the vertica.tlsEnabled parameter to false in the values.yaml file
during the suite installation. The default is that TLS is enabled. See the section 'Vertica' in the Configure
Values.yaml page.
DB name: The Vertica database name is configured in the vertica.db parameter in the values.yaml file during the
suite installation. The default is itomdb .
Login: The Vertica read-only user login name is configured in the vertica.rouser parameter in the values.yaml file
during the suite installation. The default is vertica_rouser .
Password: Set to the password of the Vertica read-only user.
4. Click TEST CONNECTION to test the connection. A confirmation message appears as shown:
5. If the connection isn't successful, provide the correct details, and then test the connection. If the connection is
successful, click SAVE SETTINGS.
Check the health of bvd pods and resolve the issues according to the log message:
For all of the following pods, check their status with: kubectl get pods --all-namespaces -o wide | grep "bvd"

Pod: bvd-www-deployment
Containers: bvd-www, kubernetes-vault-renew
Description: Provides web UI and real-time push to browser for BVD dashboards
Describe pods: kubectl describe pod <bvd-www-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-www-deployment-POD>_<opsbridge-namespace>_bvd-www-*.log, <bvd-www-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-redis
Containers: bvd-redis, bvd-redis-stunnel, kubernetes-vault-renew
Description: In-memory database for statistics and session data, message bus for server process communication
Describe pods: kubectl describe pod <bvd-redis-POD> -n <opsbridge-namespace>
Log files: <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-*.log, <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-stunnel-*.log, <bvd-redis-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-quexserv
Containers: bvd-quexserv, kubernetes-vault-renew
Description: Query execution service for executing Vertica on-demand queries
Describe pods: kubectl describe pod <bvd-quexserv-POD> -n <opsbridge-namespace>
Log files: <bvd-quexserv-POD>_<opsbridge-namespace>_bvd-quexserv-*.log, <bvd-quexserv-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-receiver-deployment
Containers: bvd-receiver, kubernetes-vault-renew
Description: Receives incoming messages (data items)
Describe pods: kubectl describe pod <bvd-receiver-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-receiver-deployment-POD>_<opsbridge-namespace>_bvd-receiver-*.log, <bvd-receiver-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-ap-bridge
Containers: bvd-ap-bridge, kubernetes-vault-renew
Description: Talks to the Autopass server and calculates the number of allowed dashboards
Describe pods: kubectl describe pod <bvd-ap-bridge-POD> -n <opsbridge-namespace>
Log files: <bvd-ap-bridge-POD>_<opsbridge-namespace>_bvd-ap-bridge-*.log, <bvd-ap-bridge-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-controller-deployment
Containers: bvd-controller, kubernetes-vault-renew
Description: Does aging of old data items and bootstrap of the database
Describe pods: kubectl describe pod <bvd-controller-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-controller-deployment-POD>_<opsbridge-namespace>_bvd-controller-*.log, <bvd-controller-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-explore-deployment
Containers: bvd-explore, kubernetes-vault-renew
Description: Provides web UI and back-end services for BVD explore
Describe pods: kubectl describe pod <bvd-explore-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-explore-deployment-POD>_<opsbridge-namespace>_bvd-explore-*.log, <bvd-explore-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Enter the following URL on a browser and log in using the credentials: https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Verify if the widget is mapped to the correct query. If not, correct the query and click RUN QUERY.
The query result appears with data.
5. Click SAVE.
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_rum_action
opsb_rum_crash
opsb_rum_event
opsb_rum_page
opsb_rum_request
opsb_rum_session
opsb_rum_tcp
opsb_rum_trans
opsb_rum_page_1h
opsb_rum_page_1d
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_rum_action
opsb_rum_crash
opsb_rum_event
opsb_rum_page
opsb_rum_request
opsb_rum_session
opsb_rum_tcp
opsb_rum_trans
Pod: itom-di-receiver-dpl
Containers: itom-di-receiver-cnt, kubernetes-vault-renew
Description: Receives the data from sources in JSON format over HTTP and sends the data to the relevant OPTIC DL Message Bus topics. Dependency: pulsar-itomdipulsar-proxy
Describe pods: kubectl describe pod <itom-di-receiver-dpl-POD> -n <opsbridge-namespace>

Pod: itom-di-administration
Containers: itom-di-administration, kubernetes-vault-renew
Description: Helps to configure the data process, data ingestion to Vertica, and receiver tasks.
Describe pods: kubectl describe pod <itom-di-administration-POD> -n <opsbridge-namespace>

Pod: itomdipulsar-broker
Containers: certificate-renew, itomdipulsar-broker
Description: Helps to produce and consume messages. Dependency: itomdipulsar-zookeeper, itomdipulsar-bookkeeper
Describe pods: kubectl describe pod <itomdipulsar-broker-POD> -n <opsbridge-namespace>

Pod: itom-di-scheduler-udx
Containers: certificate-renew, itom-di-udx-scheduler-scheduler
Description: Schedules the processed data and loads it to Vertica.
Describe pods: kubectl describe pod <itom-di-scheduler-udx-POD> -n <opsbridge-namespace>

Pod: itom-di-metadata-server
Containers: itom-di-metadata-server, kubernetes-vault-renew
Description: Receives the data from the OPTIC DL Message Bus and sends it to Vertica. Manages table creation and streaming configuration in Vertica.
Describe pods: kubectl describe pod <itom-di-metadata-server-POD> -n <opsbridge-namespace>

Pod: itomdipulsar-zookeeper
Containers: certificate-renew, itomdipulsar-zookeeper
Description: Stores the metadata of the itomdipulsar-bookkeeper and OPTIC DL Message Bus pods.
Describe pods: kubectl describe pod <itomdipulsar-zookeeper-POD> -n <opsbridge-namespace>
2. Execute the following scripts to list the OPTIC DL Message Bus topics:
pulsar@itomdipulsar-bastion-0:/pulsar/bin> ./pulsar-admin topics list-partitioned-topics public/default |grep rum
Perform the next solution steps if the data is present in the topics and the issue persists.
Possible causes
Cause 1: The queries for corresponding reports aren't returning any data.
Cause 2: The raw tables required for corresponding report queries don't have data.
Cause 3: Data isn't available in OPTIC DL Message Bus topics.
Solution
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
1. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
2. Click and then click DB CONNECTION. The SET UP DB CONNECTION pane appears.
3. Give the Vertica database details:
Hostname: If Vertica is a cluster, enter the host names of the cluster nodes separated by commas.
Port: The default port is 5433.
TLS: The Use TLS check box isn't selected if you set the vertica.tlsEnabled parameter to false in the values.yaml file
during the application installation. The default is that TLS is enabled. See the section 'Vertica' in the Configure
Values.yaml page.
DB name: Specify the name of the database to which you want to connect the UI.
Login: Specify the user name of the read-only user. For information about creating a read-only user, see the
Related topics.
5. If the connection isn't successful, give the correct details and then test the connection. If the connection is successful,
click SAVE SETTINGS.
Check the health of bvd pods and resolve the issues according to the log message:
For all of the following pods, check their status with: kubectl get pods --all-namespaces -o wide | grep "bvd"

Pod: bvd-www-deployment
Containers: bvd-www, kubernetes-vault-renew
Description: Provides web UI and real-time push to browser for BVD dashboards
Describe pods: kubectl describe pod <bvd-www-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-www-deployment-POD>_<opsbridge-namespace>_bvd-www-*.log, <bvd-www-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-redis
Containers: bvd-redis, bvd-redis-stunnel, kubernetes-vault-renew
Description: In-memory database for statistics and session data, message bus for server process communication
Describe pods: kubectl describe pod <bvd-redis-POD> -n <opsbridge-namespace>
Log files: <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-*.log, <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-stunnel-*.log, <bvd-redis-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-quexserv
Containers: bvd-quexserv, kubernetes-vault-renew
Description: Query execution service for executing Vertica on-demand queries
Describe pods: kubectl describe pod <bvd-quexserv-POD> -n <opsbridge-namespace>
Log files: <bvd-quexserv-POD>_<opsbridge-namespace>_bvd-quexserv-*.log, <bvd-quexserv-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-receiver-deployment
Containers: bvd-receiver, kubernetes-vault-renew
Description: Receives incoming messages (data items)
Describe pods: kubectl describe pod <bvd-receiver-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-receiver-deployment-POD>_<opsbridge-namespace>_bvd-receiver-*.log, <bvd-receiver-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-ap-bridge
Containers: bvd-ap-bridge, kubernetes-vault-renew
Description: Talks to the Autopass server and calculates the number of allowed dashboards
Describe pods: kubectl describe pod <bvd-ap-bridge-POD> -n <opsbridge-namespace>
Log files: <bvd-ap-bridge-POD>_<opsbridge-namespace>_bvd-ap-bridge-*.log, <bvd-ap-bridge-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-controller-deployment
Containers: bvd-controller, kubernetes-vault-renew
Description: Does aging of old data items and bootstrap of the database
Describe pods: kubectl describe pod <bvd-controller-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-controller-deployment-POD>_<opsbridge-namespace>_bvd-controller-*.log, <bvd-controller-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log

Pod: bvd-explore-deployment
Containers: bvd-explore, kubernetes-vault-renew
Description: Provides web UI and back-end services for BVD explore
Describe pods: kubectl describe pod <bvd-explore-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-explore-deployment-POD>_<opsbridge-namespace>_bvd-explore-*.log, <bvd-explore-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Verify if the widget is mapped to the correct query. If not, correct the query and click RUN QUERY.
The query result appears with data.
5. Click SAVE.
Verify the latest data is present in the following tables in the opsbridge_store schema:
opsb_synthetic_trans
The raw tables required for corresponding report queries don't have
data
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_synthetic_trans
opsb_synthetic_trans_trc_hop
opsb_synthetic_trans_trc_path
opsb_synthetic_trans_errors
opsb_synthetic_trans_components
Pod: itom-di-receiver-dpl
Containers: itom-di-receiver-cnt, kubernetes-vault-renew
Description: Receives the data from the source and passes it to the OPTIC DL Message Bus.
Describe pods: kubectl describe pod <itom-di-receiver-dpl-POD> -n <opsbridge-namespace>

Pod: itom-di-administration
Containers: itom-di-administration, kubernetes-vault-renew
Describe pods: kubectl describe pod <itom-di-administration-POD> -n <opsbridge-namespace>

Pod: itomdipulsar-broker
Containers: certificate-renew, itomdipulsar-broker
Describe pods: kubectl describe pod <itomdipulsar-broker-POD> -n <opsbridge-namespace>

Pod: itom-di-scheduler-udx
Containers: certificate-renew, itom-di-udx-scheduler-scheduler
Description: Responsible for getting data from the OPTIC DL Message Bus and streaming it to Vertica.
Describe pods: kubectl describe pod <itom-di-scheduler-udx-POD> -n <opsbridge-namespace>

Pod: itom-di-metadata-server
Containers: itom-di-metadata-server, kubernetes-vault-renew
Description: Responsible for metadata configuration (table creation)
Describe pods: kubectl describe pod <itom-di-metadata-server-POD> -n <opsbridge-namespace>

Pod: itomdipulsar-zookeeper
Containers: certificate-renew, itomdipulsar-zookeeper
Describe pods: kubectl describe pod <itomdipulsar-zookeeper-POD> -n <opsbridge-namespace>
3. Open one more session and execute the following consumer command, where you should see the same message 'hi'
typed in the producer console:
2. Execute the following consumer command to check the data in the OPTIC DL Message Bus topic:
Execute the following scripts to list the OPTIC DL Message Bus topics:
pulsar@itomdipulsar-bastion-0:/pulsar/bin> ./pulsar-admin topics list-partitioned-topics public/default |grep synthetic_trans
"persistent://public/default/opsb_synthetic_trans"
"persistent://public/default/opsb_synthetic_trans_trc_hop"
"persistent://public/default/opsb_synthetic_trans_trc_path"
"persistent://public/default/opsb_synthetic_trans_errors"
"persistent://public/default/opsb_synthetic_trans_components"
Cause
This issue occurs when there is an error in the collection or the configuration.
Solution
To resolve this issue, follow these solutions in the same order and check if the data appears on the report.
Perform the following steps to check if you have installed Operations Agent:
Option 1
Run the following commands if you want to install and integrate Operations Agent (on a BPM server) with OBM:
On Linux: ./oainstall.sh -i -a -s <OBM load balancer or gateway server> -cert_srv <OBM load balancer or gateway server>
If the master node (OBM node) is in HA, run the following command: ./oainstall.sh -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <CDF master node FQDN>
On Windows: cscript oainstall.vbs -i -a -s <OBM load balancer or gateway server> -cert_srv <OBM load balancer or gateway server>
If the master node (OBM node) is in HA, run the following command: cscript oainstall.vbs -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <CDF master node FQDN>
Option 2
If Operations Agent is already installed, follow the steps to integrate Operations Agent with OBM:
Administration
Edit the following file and change the level "INFO" to "DEBUG":
Receiver
Edit the following file and change the level "INFO" to "DEBUG":
File Name:
<NFS-conf-volume>/di/receiver/conf/logback.xml
<root level="INFO">
Data-processor
Edit the following file and change the level "INFO" to "DEBUG":
<root level="INFO">
Vertica-ingestion
Edit the following file and change the level "INFO" to "DEBUG":
<root level="INFO">
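As a sketch of the change itself, in each logback.xml listed above the root logger element is switched from INFO to DEBUG:
<!-- Before -->
<root level="INFO">
<!-- After: enables debug-level logging for the component -->
<root level="DEBUG">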
Cause
Predefined queries for the corresponding reports (opsb_sysinfra_* and opsb_sysinfra_SysExecSumm*) aren't returning any data.
Tables required for the corresponding report queries (opsb_agent*, opsb_agentless*, or opsb_sysinfra*) aren't created.
The tables required for the corresponding report queries don't have data.
Task flows (opsb_sys*) required for populating the corresponding tables used in reports aren't running.
There are errors during the execution of task flows.
Solution
If the copy scripts were run, and there are no errors in the aggregate log files, check if the widget that's not showing data is
connected to the correct queries. Follow these steps:
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
If the query doesn't return any data then follow the steps:
1. Check if data from the monitored nodes are sent to the raw tables ( opsb_agent_* and opsb_agentless_* tables) from
the integrated sources. If data is sent, continue with the following steps, else see if you have configured the data
sources correctly. For more information to configure data sources, see Configure reporting.
2. The flow or sequence to check for the data is given below:
Operations Agent data flow across tables:
opsb_agent_* > opsb_sysinfra_*
SiteScope or Agentless data flow across tables:
opsb_agentless_* > opsb_sysinfra_*
3. If there is no data in the opsb_sysinfra_* tables, check the copy scripts.
Note
Copy scripts copy data from the raw tables to the respective aligned sysinfra tables. Copy scripts are executed once every 5 minutes.
tablename: Displays the source table from which the data is copied
processed_time: Displays the time up to which data was considered for aggregation
execution_time: Displays the last time the copy script was executed
last_processed_epoch: Displays the Vertica epoch value of the last processed row from the raw tables.
If the copy scripts were run, check the copy script log files:
Note
Make sure you change the logging level back to ERROR after completing the analysis. Leaving it in INFO leads to log file locking issues and may further result in the task script run hanging.
In the subsequent run of the task script, you will find detailed logs in the log file of the respective scripts with more
details like the query runs and the number of records updated.
If the copy scripts were run, and there are no errors in the log files, check for the data flow across the aggregate tables:
3. For example, in case of Vertica database connectivity issues, the following error statements can be observed:
2020-08-26 19:12:39.995 [TFID:{opsb_sysdisk_taskflow} TID:{opsb_agentless_disk_1d_id}] RID:{d0993d10-92f9-4be4-b326-29
a5d4582e2b}-output] []- ERROR DbCommons::ERROR /taskexecutor/conf-local/bin/enrichment/DbCommons.pm (34) 34 Unable t
o open config.properties file to read vertica db credentials
2020-08-26 19:12:40.518 [TFID:{opsb_sysdisk_taskflow} TID:{opsb_agent_disk_1d_id}] RID:{1583ee75-67ec-4b56-b3bb-9dd13
b3be1e5}-output] []- Invalid connection string attribute: SSLMode (SQL-01S00)1ERROR ReadHistory::try {...} /taskexecutor/conf
-local/bin/enrichment/ReadHistory.pm (36) 36 opsb_agent_disk_1d : Can't connect to database: FATAL 3781: Invalid username o
r password
Solution: Check and fix the Vertica database connectivity.
Check if there is data in the raw tables. If there is no data in the agent raw tables, see System Infrastructure report widget displays partial data for metrics collected by Operations Agent; if there is no data in the agentless raw tables, see System Infrastructure report widget displays partial data for metrics collected by SiteScope.
If copy scripts weren't run as scheduled (for more than 30 minutes), see Aggregate tables aren't updated, data in the system infrastructure, or event reports aren't refreshed.
Related topics
See How to recover itomdipulsar-bookkeeper pods from read-only mode if the itomdipulsar-bookkeeper pods are in read-
only mode.
Data Processor Postload task flow not running
Aggregate functionality isn't working as expected
Reporting data and task flows
Vertica database isn't reachable
Failed to connect to host
Trace issue in Aggregate and Forecast
No logs for aggregate and forecast
Aggregate not happening after upgrade
Possible causes
Cause 1: The queries for corresponding reports aren't returning any data.
Cause 2: The daily, hourly, or aligned tables required for corresponding report queries don't have data.
Cause 3: Data is present in raw tables, but the copy script hasn't run.
Cause 4: Data isn't present in raw tables; the issue is with the OPTIC DL pods.
Cause 5: Data isn't available in OPTIC DL Message Bus Topics.
Solution
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
1. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
2. Click and then click DB CONNECTION. The SET UP DB CONNECTION pane appears.
3. Give the Vertica database details:
Hostname: If Vertica is a cluster, enter the host names of the cluster nodes separated by commas
Port: The default port is 5433
TLS: The Use TLS check box isn't selected if you had set the vertica.tlsEnabled parameter to false in
the values.yaml file during the suite installation. The default is that TLS is enabled. See the section 'Vertica' in
the Configure Values.yaml page.
DB name: The Vertica database name is configured in the vertica.db parameter in the values.yaml file during the
suite installation. The default is itomdb .
Login: The Vertica read-only user login name is configured in the vertica.rouser parameter in the values.yaml file
during the suite installation. The default is vertica_rouser .
Password: Set to the password of the Vertica read-only user.
4. Click TEST CONNECTION to test the connection. A confirmation message appears.
5. If the connection isn't successful, provide the correct details, and then test the connection. If the connection is successful, click SAVE SETTINGS. You can also verify the same connection details from the command line; see the sketch after these steps.
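A minimal sketch of an equivalent check using the Vertica vsql client is shown below. It assumes vsql is available (for example, on a Vertica node) and uses the default port, database, and read-only user described above; replace the host and password with your own values:
/opt/vertica/bin/vsql -h <vertica_host> -p 5433 -U vertica_rouser -d itomdb -w <password> -c "SELECT 1;"
If the connection details are correct, the query returns a single row; otherwise vsql reports the same kind of authentication or connection error that TEST CONNECTION reports.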
Check the health of bvd pods and resolve the issues according to the log message:
To check the status of all BVD pods, run: kubectl get pods --all-namespaces -o wide | grep "bvd"
Pod: bvd-www-deployment
Containers: bvd-www, kubernetes-vault-renew
Description: Provides web UI and real-time push to browser for BVD dashboards.
Command to describe pods: kubectl describe pod <bvd-www-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-www-deployment-POD>_<opsbridge-namespace>_bvd-www-*.log and <bvd-www-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-redis
Containers: bvd-redis, stunnel, kubernetes-vault-renew
Description: In-memory database for statistics and session data, OPTIC DL Message Bus for server process communication.
Command to describe pods: kubectl describe pod <bvd-redis-POD> -n <opsbridge-namespace>
Log files: <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-*.log, <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-stunnel-*.log, and <bvd-redis-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-quexserv
Containers: bvd-quexserv, kubernetes-vault-renew
Description: Query execution service for executing Vertica on-demand queries.
Command to describe pods: kubectl describe pod <bvd-quexserv-POD> -n <opsbridge-namespace>
Log files: <bvd-quexserv-POD>_<opsbridge-namespace>_bvd-quexserv-*.log and <bvd-quexserv-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-receiver-deployment
Containers: bvd-receiver, kubernetes-vault-renew
Description: Receives incoming messages (data items).
Command to describe pods: kubectl describe pod <bvd-receiver-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-receiver-deployment-POD>_<opsbridge-namespace>_bvd-receiver-*.log and <bvd-receiver-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-ap-bridge
Containers: bvd-ap-bridge, kubernetes-vault-renew
Description: Talks to the Autopass server and calculates the number of allowed dashboards.
Command to describe pods: kubectl describe pod <bvd-ap-bridge-POD> -n <opsbridge-namespace>
Log files: <bvd-ap-bridge-POD>_<opsbridge-namespace>_bvd-ap-bridge-*.log and <bvd-ap-bridge-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-controller-deployment
Containers: bvd-controller, kubernetes-vault-renew
Description: Does aging of old data items and bootstrap of the database.
Command to describe pods: kubectl describe pod <bvd-controller-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-controller-deployment-POD>_<opsbridge-namespace>_bvd-controller-*.log and <bvd-controller-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-explore-deployment
Containers: bvd-explore, kubernetes-vault-renew
Description: Provides web UI and back-end services for BVD explore.
Command to describe pods: kubectl describe pod <bvd-explore-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-explore-deployment-POD>_<opsbridge-namespace>_bvd-explore-*.log and <bvd-explore-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
3. Click the dashboard from which you didn't see data and click .
4. Click the widget for which there is no data and then copy the Data channel name.
5. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
6. Search for the Data Channel name that you copied, select the Data channel name, and click .
7. The query is displayed in the right pane. Verify if the widget is mapped to the correct query. If not, correct the query and
click RUN. If the query is executed successfully, the query result is displayed:
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_sysinfra_netif_1d
opsb_sysinfra_avail_1d
opsb_sysinfra_cpu_1d
opsb_sysinfra_disk_1d
opsb_sysinfra_filesys_1d
opsb_sysinfra_node_1d
Verify if the latest data is present in the following tables in the mf_shared_provider_default schema (a sample check is shown after this list):
opsb_sysinfra_netif_1h
opsb_sysinfra_avail_1h
opsb_sysinfra_cpu_1h
opsb_sysinfra_disk_1h
opsb_sysinfra_filesys_1h
opsb_sysinfra_node_1h
Perform the next solution to check if the data isn't available in these hourly tables.
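As an illustration, a minimal check you can run from a SQL client against one of the hourly tables listed above (this assumes read access to the mf_shared_provider_default schema; timestamp column names vary by table, so only a row count is shown):
SELECT COUNT(*) FROM mf_shared_provider_default.opsb_sysinfra_cpu_1h;
A count of zero, or a count that doesn't grow between runs, indicates that the hourly tables aren't being populated.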
If the data isn't present in tables, run the SQL queries from a database tool like DB Visualizer:
Verify if the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_sysinfra_netif
opsb_sysinfra_avail
opsb_sysinfra_cpu
opsb_sysinfra_disk
opsb_sysinfra_filesys
opsb_sysinfra_node
Perform the next solution to check if the data isn't available in these aligned tables.
1. In OPTIC Data Lake, go to the itom_di_postload_provider_default schema and look for the itom_di_postload_provider_default.ROLLUP_CONTROL table.
2. In the ROLLUP_Name column, look for the aggregation name (for example: opsb_sysinfra_node_1h ), then look at the LAST_EXE_TIME (to know the last run time) and the MAX_EPOCH_TIME (to know the time till when the records were aggregated). An example query is shown after these steps.
Daily aggregations are expected to run every 12 minutes. If it's beyond this time check for errors in the aggregation log
and report the error.
3. Go to the following location to view the log files:
/<log nfs path>/<opsb namespace>/<opsb namespace>__itom-di-postload-taskexecutor-<pod name>__postload-taskexecutor__<worker node where task executor is running>
For example: /var/vols/nfs/vol5/opsb/opsb__itom-di-postload-taskexecutor-7d9b99d899-fwv7p__postload-taskexecutor__host.mycomputer.net
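For example, the ROLLUP_CONTROL check in step 2 can be run from a SQL client as follows (a sketch; the table and column names are assumed to match those described above):
SELECT ROLLUP_NAME, LAST_EXE_TIME, MAX_EPOCH_TIME
FROM itom_di_postload_provider_default.ROLLUP_CONTROL
WHERE ROLLUP_NAME = 'opsb_sysinfra_node_1h';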
If the data isn't present in tables, run the SQL queries from a database tool like DB Visualizer:
Verify if the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_agent_netif
opsb_agent_cpu
opsb_agent_disk
opsb_agent_filesys
opsb_agent_node
Perform the next solution if the data is present in the raw tables and the issue persists.
Data is present in raw tables, but the copy script hasn't run
The copy scripts copy data from the raw tables to the respective aligned sysinfra tables. You can check the opsb_internal_reports_schedule_config_1h table in the mf_shared_provider_default schema in the Vertica database (see the example query below) and look for these columns:
tablename - displays the source table from which the data is copied
processed_time - displays the time up to which data was considered for aggregation
execution_time - displays the last time the copy script was executed
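For example, a minimal query against this table (assuming the column names listed above):
SELECT tablename, processed_time, execution_time
FROM mf_shared_provider_default.opsb_internal_reports_schedule_config_1h;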
In the subsequent run of the task script, you will find detailed logs in the log file of the respective scripts with more
details like the query runs and the number of records updated.
If the copy scripts were run, and there are no errors in the log files, check the section "Agent Metric Collector Errors".
Data isn't present in raw tables, the issue is with OPTIC DL pods
Pod: itom-di-receiver-dpl
Containers: itom-di-receiver-cnt, kubernetes-vault-renew
Description: Receives the data from sources in JSON format over HTTP and sends the data to the relevant OPTIC DL Message Bus topics. Dependency - pulsar-itomdipulsar-proxy.
Command to describe the pod: kubectl describe pod <itom-di-receiver-dpl-POD> -n <opsbridge-namespace>
Pod: itom-di-administration
Containers: itom-di-administration, kubernetes-vault-renew
Description: Helps to configure the data process, data ingestion to Vertica, and receiver tasks.
Command to describe the pod: kubectl describe pod <itom-di-administration-POD> -n <opsbridge-namespace>
Pod: itomdipulsar-broker
Containers: certificate-renew, itomdipulsar-broker
Description: Helps to produce and consume messages. Dependency - itomdipulsar-zookeeper, itomdipulsar-bookkeeper.
Command to describe the pod: kubectl describe pod <itomdipulsar-broker-POD> -n <opsbridge-namespace>
Pod: itom-di-scheduler-udx
Containers: certificate-renew, itom-di-udx-scheduler-scheduler
Description: Schedules the processed data and loads it to Vertica.
Command to describe the pod: kubectl describe pod <itom-di-scheduler-udx-POD> -n <opsbridge-namespace>
Pod: itom-di-metadata-server
Containers: itom-di-metadata-server, kubernetes-vault-renew
Description: Receives the data from the OPTIC DL Message Bus and sends it to Vertica. Manages table creation and streaming configuration in Vertica.
Command to describe the pod: kubectl describe pod <itom-di-metadata-server-POD> -n <opsbridge-namespace>
Pod: itomdipulsar-zookeeper
Containers: certificate-renew, itomdipulsar-zookeeper
Description: Stores the metadata of itomdipulsar-bookkeeper and OPTIC DL Message Bus pods.
Command to describe the pod: kubectl describe pod <itomdipulsar-zookeeper-POD> -n <opsbridge-namespace>
You can also execute the following command to get the log:
Retrying.
"persistent://public/default/opsb_agent_filesys"
"persistent://public/default/opsb_agent_node"
"persistent://public/default/opsb_agent_disk"
"persistent://public/default/opsb_agent_cpu"
"persistent://public/default/opsb_agent_netif"
Perform the next solution steps if the data is present in the topics and the issue persists.
Possible causes
Cause 1: Agent nodes aren't discovered.
Cause 2: The nodes may have failed to connect to the Agent Metric Collector.
Cause 3: Agent connected to the metric collector, but no metrics are pulled.
Solution
To analyze and resolve this issue, follow the applicable solutions:
If the collection configuration isn't present, deploy the configuration. For steps, see Manage Agent Metric Collection.
If you have disabled the collection configuration, enable the configuration. Run the command: ops-monitoring-cli enable col
lector -n <collector-name>
If you have enabled the collection configuration, check the self health of the collection. Run the command: ops-monitorin
g-cli get ms -o yaml -r
Perform the following steps according to the error messages that appear for the collection health check:
1. Run the following command to get the discovery collector pod name:
For example:
2. Run the following command to get inside the discovery collector pod:
kubectl exec -it <pod name> -c <container name> -n <application namespace> <command>
For example:
ls
For example:
agent-collector-sysinfraNodes.log
Only the node names present in this file are considered for metric collection. You can also check the following two log
files:
<config_name>MissingNodes.log : This file has the node names present in the node allow filter list but weren't present in
the TQL response.
<config_name>FaultyNodeFromTqlNodes.log : The file has the faulty nodes from TQL response. This means that TQL
response for the node didn't have the FQDN or the short name.
For further verification, run the following command to enter the discovery pod to access oa-discovery-collector.log log:
kubectl exec -it <pod name> -c <container name> -n <application namespace> <command>
For example:
Open the oa-discovery-collector.log file and check the log file messages. If you observe errors, contact Software Support.
The nodes may have failed to connect to the Agent Metric Collector
1. /opt/OV/bin/ovcert -list . The command should list the node certificate and the trust certificates.
2. /opt/OV/bin/ovcoreid . The command output should match the alias name in the Certificates section of the ovcert -list
command.
3. Run the following command to check the connection:
/opt/OV/bin/ovconfchg -ns bbc.http -set PROXY_CFG_FILE proxy file;/opt/OV/bin/ovconfchg -ns bbc.cb.ports -set CB_PORTS_CFG_FI
LE port file;/opt/OV/bin/bbcutil -ping <agent node name or IP>
You should use the host IP if you have used the host IP mapping. The command output should contain status=eServiceOK . If it shows status=eSSLError , then there is an issue with trust between the metric collector container and the agent nodes.
The errors below indicate that the metric collector is unable to fetch configuration files such as the port, proxy, and hosts files.
Run the following curl command in the metric collector container to check if the file retrieval is working. It should display the
file contents.
"Failed to fetch etc hosts file. Collection from some agent nodes may fail"
"Failed to fetch connection configuration file port or proxy file name. Collection from some agent nodes may fail"
1. If you have used host IP mapping, then IP for the node name isn't given. Run ops-monitoring-ctl.exe get file -n hosts file nam
e and check if it lists the file. The file name appears if it's configured.
2. For proxy and port files configuration, run the following command to check the connection to the agent node. It will
display the target node IP address and OV Communication Broker port if configured correctly.
/opt/OV/bin/ovconfchg -ns bbc.http -set PROXY_CFG_FILE proxy file;/opt/OV/bin/ovconfchg -ns bbc.cb.ports -set CB_PORTS_CFG_FI
LE port file;/opt/OV/bin/bbcutil -gettarget <agent hostname>
3. During the execution of a job schedule, run ovconfget bbc.http and ovconfget bbc.cb.ports to check if PROXY_CFG_FILE and
CB_PORTS_CFG_FILE are set.
Note
The configurations are emptied at the end of every scheduled run of the job.
1. This means that the agent node communication broker is running on a non-default port (other than 383). Run ops-monitoring-ctl.exe get file -n port file name .
2. The port file should be present in /var/opt/OV/conf/bbc in the container. During the execution of a job schedule, /opt/OV/bin/ovconfget bbc.cb.ports will display CB_PORTS_CFG_FILE=port file name .
Note
The configurations are emptied at the end of every scheduled run of the job.
3. Run the following commands to check the connection to the agent node:
/opt/OV/bin/ovconfchg -ns bbc.cb.ports -set CB_PORTS_CFG_FILE port file;/opt/OV/bin/bbcutil -getcbport <agent hostname>
/opt/OV/bin/ovconfchg -ns bbc.cb.ports -set CB_PORTS_CFG_FILE port file;/opt/OV/bin/bbcutil -ping <host or IP of the agent machin
e>
4. Verify metric collection status. Run the command: ops-monitoring-ctl.exe get collector-status . It should display " Metric collect
ion succeeded on timestamp" .
Location of binary
If the oacore process is in a stopped or aborted state, run the following command to start it: ovc -start oacore
If metrics aren't collected, check the parm file for the following line:
If a metric class is missing, add it and restart oacore.
On UNIX: /var/opt/perf/parm
On Windows: %OvDataDir%\parm.mwc
The Watchdog mechanism monitors the hpsensor process. Watchdog runs once every hour and logs the status of the hpsensor process in the hpcswatch.log file located at:
On Windows:
%OvDataDir%\hpcs\hpcswatch.log
On Linux:
/var/opt/OV/hpcs/hpcswatch.log
While installing Operations Agent, the Watchdog mechanism adds a crontab entry on UNIX and Linux systems and a scheduled task entry on Windows systems.
Note
The hpcswatch.log file rolls over when it reaches a maximum size of 1 MB. During the roll over a
new hpcswatch.log file replaces the existing file.
Note
For enhanced logging in itom-monitoring-oa-metric-collector ( agent-collector-sysinfra ), you can edit the collector configuration and set the metricCollectorLogLevel parameter to DEBUG.
For example:
2. Edit the collector configuration yaml file. Set the metricCollectorLogLevel parameter to DEBUG .
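A minimal sketch of the relevant part of the collector configuration after the edit (the surrounding YAML structure depends on your collector definition and isn't shown here):
metricCollectorLogLevel: DEBUG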
For more troubleshooting on the data flow and for the self health monitoring, see the OA - Monitoring service overview
dashboard.
Cause
There may be an error in the Metric Streaming Policy collection or configuration.
Solution
To resolve this issue, follow these solutions in the same order and check if the data appears on the report.
CA_<OVRG_CORE_ID_XYZ.FQDN>_<ASYMMETRIC_KEY_LENGTH>
MF CDF RE CA on XYZ.FQDN
MF CDF RIC CA on XYZ.FQDN
MF CDF RID CA on XYZ.FQDN
MF CDF di-integration CA on XYZ.FQDN
OvCoreId set : OK
Private key installed : OK
Certificate installed : OK
Certificate valid : OK
Trusted certificates installed : OK
Trusted certificates valid : OK
The status should be OK. Make sure to grant the certificate request on the OBM. If OMT certificates are missing, execute
the command to update the trusted certificates on the Operations Agent node: ovcert -updatetrusted
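The status lines above are typically produced by the certificate status check on the Operations Agent node, for example:
/opt/OV/bin/ovcert -check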
If the Operations Agent node isn't listed then follow the steps:
Note
If the master node (OBM node) is in HA, run the following command: ./oainstall.sh -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <CDF master node FQDN>
On Windows: cscript oainstall.vbs -i -a -s <CDF master node FQDN> -cert_srv <CDF master node FQDN>
Note
If the master node (OBM node) is in HA, run the following command: cscript oainstall.vbs -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <CDF master node FQDN>
On Linux:
/opt/OV/bin/OpC/install/opcactivate -srv <FQDN of OBM>
On Windows:
<C:>\HP\HP BTO Software\bin\win64\OpC\install>cscript opcactivate.vbs -srv <FQDN of OBM>
This step sends the certificate request from the SiteScope server (Operations Agent node) to OBM.
2. If you didn't configure a secure connection between OBM and OPTIC Data Lake, run the following command on all
Operations Agent nodes to update the trusted certificates:
ovcert -updatetrusted
On Windows:
Location of binary
If the oacore process is in a stopped or aborted state, run the following command to start it: ovc -start oacore
If metrics aren't collected, check the parm file for the following line:
If a metric class is missing, add it and restart oacore.
On UNIX: /var/opt/perf/parm
On Windows: %OvDataDir%\parm.mwc
The Watchdog mechanism monitors the hpsensor process. Watchdog runs once every hour and logs the status of the hpsensor process in the hpcswatch.log file located at:
On Windows:
%OvDataDir%\hpcs\hpcswatch.log
On Linux:
/var/opt/OV/hpcs/hpcswatch.log
While installing Operations Agent, the Watchdog mechanism adds a crontab entry on UNIX and Linux systems and a scheduled task entry on Windows systems.
Note
The hpcswatch.log file is rolled over when it reaches a maximum size of 1 MB. During roll over the existing file is replaced
with a new hpcswatch.log file.
hpsensor
Change the value of the hpcs.trace parameter, in the hpcs.conf file, from " INFO " to " DEBUG ":
On Windows: %OvDataDir%\hpcs\
On Linux: /var/opt/OV/hpcs/
Administration
In OPTIC Data Lake, edit the following file and change the level from " INFO " to " DEBUG ":
Receiver
In OPTIC Data Lake, edit the following file and change the level from " INFO " to " DEBUG ":
<root level="INFO">
Data-processor
In OPTIC Data Lake, edit the following file and change the level from " INFO " to " DEBUG ":
<root level="INFO">
Vertica-ingestion
In OPTIC Data Lake, edit the following file and change the level from "INFO" to "DEBUG":
<root level="INFO">
Cause
Predefined queries for the forecast in the System Executive Summary ( opsb_sysinfra_SysExecSumm_forecast* ) and System Resource Detail ( opsb_sysinfra_SysResourceDetail_forecast* ) aren't returning any data.
Forecast tables for node and filesystem aren't created.
There is no data in the forecast tables.
Task flow ( opsb_sysavl/opsb_sysfs ) required for the forecast isn't running.
There are errors during the execution of the forecast task flows.
Solution
If the Forecast data isn't shown in the System Executive Summary or the System Usage Details reports, check for the data on
the reports as follows:
Check if the widget that isn't showing data is connected to the correct queries. Enter the following URL on a browser:
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
Check for the data in forecast tables in the database in the schema mf_shared_provider_default . The flow or sequence to check
for the data is as follows:
If copy scripts weren't run as scheduled (for more than 30 minutes) then see Troubleshoot Forecast/Aggregate data flow.
Related topics
Data Processor Postload task flow not running
Reporting data and task flows
Vertica database isn't reachable
Failed to connect to host
Trace issue in Aggregate and Forecast
No logs for aggregate and forecast
Aggregate not happening after upgrade
Cause
Predefined queries for availability ( opsb_sysinfra_Avail* and opsb_sysinfra_SysExecSumm_avail* ) aren't returning any data.
Tables required for availability reports ( opsb_agent_node or opsb_agentless_node or opsb_sysinfra_node or opsb_sysinfra_avai
l* ) aren't created.
The tables required for availability reports don't have data.
The task flow ( opsb_sysavl ) required for availability isn't running.
There are errors during the execution of the availability task flow.
Late arrival data isn't getting captured in the Availability tables.
Solution
Check if the widget that isn't showing data is connected to the correct queries. Enter the following URL on a browser:
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
Note
Make sure you change the logging level back to ERROR after completing the analysis. Leaving it in INFO leads to log file locking issues and may further result in the task script run hanging.
1. On the NFS server, the script logs are present at: /<mount path for log directory>/reports/system_infra/opsb_sys*.log (* can
be uptm or avlhly or avldly , or avlla ). Check the following logs:
Note
For the late arrival data which aren't captured in the availability table, check for errors in the opsb_sysavlla.log
file.
In the subsequent run of the task script, you will find detailed logs in the log file of the respective scripts with more
details like the query runs and the number of records updated.
3. If there are no errors found in the logs and data isn't present in the opsb_sysavl* tables, verify if data is present in the op
sb_sysinfra_* tables.
4. If there is no data in the opsb_sysinfra_* tables, check the copy scripts.
Note
Copy scripts copy data from the raw tables to the respective aligned sysinfra tables. Copy scripts are executed once every 5 minutes.
tablename - Displays the source table from which the data is copied
processed_time - Displays the time up to which data was considered for aggregation
execution_time - Displays the last time the copy script was executed
last_processed_epoch - Displays the Vertica epoch value of the last processed row from the opsb_sysinfra_node table.
If the copy scripts were run, check the copy script log files:
Note
Make sure you change the logging level back to ERROR after completing the analysis. Leaving it in INFO leads to log file locking issues and may further result in the task script run hanging.
To change the log level, go to /<conf vol path on NFS>/di/postload/conf/reports/agent_infra/ or /<conf vol path on NFS>/di/postload/conf/reports/agentless_infra/ , open the <scriptname>log4perl.conf file, and change the log4perl.logger.<scriptname> entry to INFO (see the sketch below).
In the subsequent run of the task script, you will find detailed logs in the log file of the respective scripts with more
details like the query runs and the number of records updated.
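For illustration, a minimal sketch of such an edit, using the agent netif script as an example (the logger key follows the log4perl.logger.<scriptname> pattern described above; any appender settings on the line are left as configured):
# In agtnetiflog4perl.conf, change the logger level from ERROR to INFO:
log4perl.logger.opsb_agtnetif = INFO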
If the copy scripts were run, and there are no errors in the log files, check if there is data in the raw tables. If there is no data in the agent raw tables, see System Infrastructure report widget displays partial data for metrics collected by Operations Agent; if there is no data in the agentless raw tables, see System Infrastructure report widget displays partial data for metrics collected by SiteScope.
If copy scripts weren't run as scheduled (for more than 30 minutes) then see Aggregate tables aren't updated, data in the
system infrastructure, or event reports aren't refreshed.
Related topics
Data Processor Postload task flow not running
Aggregate functionality isn't working as expected
Reporting data and task flows
Aggregate not happening after upgrade
Cause
Predefined queries or data channels ( opsb_event_* ) for the widget aren't returning any data.
One or more of the metrics mapped in the corresponding widgets are null.
Tables required for the corresponding ( opr_event, opsb_event_*_summary_1h or opsb_event_*_summary_1d ) report queries
aren't created.
Data isn't present in one or more event tables - raw tables ( opr_event ), hourly aggregation tables, or daily aggregation
tables.
Task flow ( opsb_evt* ) required for corresponding reports isn't running.
There are errors during the execution of the task flow.
Late arrival data isn't getting captured in the Availability tables
Solution
Check if the widget that isn't showing data is connected to the correct queries. Enter the following URL on a browser:
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
If the query doesn't return any data then follow the steps:
1. Check if the events from the integrated OBM sources are forwarded to the opr_event table. For details about the event tables, see Event schema. If data is forwarded, continue with the next steps; otherwise, check if event forwarding is configured correctly. If you are using classic OBM, see Configure Classic OBM. If you are using containerized OBM, see Configure a secure connection between containerized OBM and OPTIC Data Lake.
2. The events forwarded from OBM are saved in the raw table ( opr_event ). Task scripts aggregate the data from the opr_ev
ent table and insert it into hourly aggregate tables ( opsb_event_*_summary_1h ). Further, daily aggregations are
computed from the respective hourly tables and inserted into daily aggregate tables ( opsb_event_*_summary_1d )
Task flow: opsb_event > opsb_event_*_summary_1h > opsb_event_*_summary_1d
3. If there is no data present or no recent data updates in the hourly aggregation table, check the task script logs present
in the NFS server. On the NFS server, the task script logs are present at: /<mount path for log directory>/reports/event and
check opsb_evt*.log (* can be cihly or etihly or cithly or polhly or usrhly or grphly or hly ).
Note
For the late arrival data that aren't captured in the availability table, check for errors in the opsb_evt*la.log (* can be cihl
y or etihly or cithly or polhly or usrhly or grphly or hly ) file.
4. If there are no errors in the logs, then change the log level of the task script from ERROR to INFO and check for the
detailed log in the consequent run.
To change the logging level, go to <postload mount path on NFS>/conf/reports/event , edit opsb_evt*log4perl.conf (* can be cihly or etihly or cithly or polhly or usrhly or grphly or hly ), and change the logging level from ERROR to INFO.
In the subsequent run of the task script, you will find detailed logs in the log file of the respective scripts with more
details like the query runs and the number of records updated.
Note
Make sure you change the logging level back to ERROR after completing the analysis. Leaving it in INFO leads to log file locking issues and may further cause task flows to be in a non-responsive state.
If there is no data present or data shown isn't recent in the daily aggregated tables, then verify the respective hourly
aggregate table. If data is present in the hourly aggregate table and not updated in the daily table, then check the aggregate.l
og file for any errors.
3. For example, in case of Vertica database connectivity issues the following error statements appear:
2020-08-26 19:12:40.518 [TFID:{opsb_evtcihly_taskflow} TID:{opsb_event_ci_summary_1d_id}] RID:{43d9115d-4413-43ff-9012-eaad
996b06db}-output] []- Invalid connection string attribute: SSLMode (SQL-01S00)1ERROR ReadHistory::try {...} /taskexecutor/conf-loc
al/bin/enrichment/ReadHistory.pm (36) 36 opsb_event_ci_summary_1d : Can't connect to database: FATAL 3781: Invalid username or
password
Solution: Check and fix the Vertica database connectivity.
Note
For “Events by CI” report, it's a prerequisite that the topology data forwarding is enabled. For information about topology data
forwarding, see Forward topology from classic OBM to OPTIC Data Lake or Forward topology from containerized OBM to OPTIC Data
Lake.
Related topics
Data Processor Postload task flow not running
Aggregate functionality isn't working as expected
Reporting data and task flows
Vertica database isn't reachable
Failed to connect to host
Trace issue in Aggregate and Forecast
No logs for aggregate and forecast
Aggregate not happening after upgrade
Possible causes
Cause 1: The queries for corresponding reports aren't returning any data.
Cause 2: The daily, hourly, or aligned tables required for corresponding report queries don't have data.
Cause 3: Data is present in raw tables, but the copy script hasn't run.
Cause 4: Data isn't present in raw tables; the issue is with the OPTIC DL pods.
Cause 5: Data isn't available in OPTIC DL Message Bus Topics.
Solution
Enter the following URL in a browser:
https://<external_access_host>:<external_access_port>/<ui>
1. From the side navigation panel, click Administration > Dashboards & Reports > Stakeholder Dashboards &
Reports > Dashboard Management.
2. Click the dashboard from which you didn't see data and click .
3. Click on the widget for which there is no data and then copy the Data channel name.
4. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
5. Search for the Data Channel name that you copied.
1. From the side navigation panel, type the report name in search or navigate to the report.
2. Click on the icon on the widget from which you didn't see data and then click .
3. Expand the PREDEFINED QUERY section and then click below the query name.
4. The query appears. Scroll down to the end of the query and click RUN QUERY. The query result appears with data.
1. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
2. Click and then click DB CONNECTION. The SET UP DB CONNECTION pane appears.
3. Give the Vertica database details:
Hostname: If Vertica is a cluster, enter the host names of the cluster nodes separated by commas
Port: The default port is 5433
TLS: The Use TLS check box isn't selected if you had set the vertica.tlsEnabled parameter to false in
the values.yaml file during the suite installation. The default is that TLS is enabled. See the section 'Vertica' in
the Configure Values.yaml page.
DB name: The Vertica database name is configured in the vertica.db parameter in the values.yaml file during the
suite installation. The default is itomdb .
Login: The Vertica read-only user login name is configured in the vertica.rouser parameter in the values.yaml file
during the suite installation. The default is vertica_rouser .
Password: Set to the password of the Vertica read-only user.
4. Click TEST CONNECTION to test the connection. A confirmation message appears as shown:
5. If the connection isn't successful, provide the correct details, and then test the connection. If the connection is
successful, click SAVE SETTINGS.
Check the health of bvd pods and resolve the issues according to the log message:
To check the status of all BVD pods, run: kubectl get pods --all-namespaces -o wide | grep "bvd"
Pod: bvd-www-deployment
Containers: bvd-www, kubernetes-vault-renew
Description: Provides web UI and real-time push to browser for BVD dashboards.
Command to describe pods: kubectl describe pod <bvd-www-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-www-deployment-POD>_<opsbridge-namespace>_bvd-www-*.log and <bvd-www-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-redis
Containers: bvd-redis, stunnel, kubernetes-vault-renew
Description: In-memory database for statistics and session data, OPTIC DL Message Bus for server process communication.
Command to describe pods: kubectl describe pod <bvd-redis-POD> -n <opsbridge-namespace>
Log files: <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-*.log, <bvd-redis-POD>_<opsbridge-namespace>_bvd-redis-stunnel-*.log, and <bvd-redis-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-quexserv
Containers: bvd-quexserv, kubernetes-vault-renew
Description: Query execution service for executing Vertica on-demand queries.
Command to describe pods: kubectl describe pod <bvd-quexserv-POD> -n <opsbridge-namespace>
Log files: <bvd-quexserv-POD>_<opsbridge-namespace>_bvd-quexserv-*.log and <bvd-quexserv-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-receiver-deployment
Containers: bvd-receiver, kubernetes-vault-renew
Description: Receives incoming messages (data items).
Command to describe pods: kubectl describe pod <bvd-receiver-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-receiver-deployment-POD>_<opsbridge-namespace>_bvd-receiver-*.log and <bvd-receiver-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-ap-bridge
Containers: bvd-ap-bridge, kubernetes-vault-renew
Description: Talks to the Autopass server and calculates the number of allowed dashboards.
Command to describe pods: kubectl describe pod <bvd-ap-bridge-POD> -n <opsbridge-namespace>
Log files: <bvd-ap-bridge-POD>_<opsbridge-namespace>_bvd-ap-bridge-*.log and <bvd-ap-bridge-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-controller-deployment
Containers: bvd-controller, kubernetes-vault-renew
Description: Does aging of old data items and bootstrap of the database.
Command to describe pods: kubectl describe pod <bvd-controller-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-controller-deployment-POD>_<opsbridge-namespace>_bvd-controller-*.log and <bvd-controller-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
Pod: bvd-explore-deployment
Containers: bvd-explore, kubernetes-vault-renew
Description: Provides web UI and back-end services for BVD explore.
Command to describe pods: kubectl describe pod <bvd-explore-deployment-POD> -n <opsbridge-namespace>
Log files: <bvd-explore-deployment-POD>_<opsbridge-namespace>_bvd-explore-*.log and <bvd-explore-deployment-POD>_<opsbridge-namespace>_kubernetes-vault-renew-*.log
4. Click the dashboard from which you didn't see data and click .
5. Click the widget for which there is no data and then copy the Data channel name.
6. From the side navigation panel, click Administration > Dashboards & Reports > Predefined Queries.
7. Search for the Data Channel name that you copied, select the Data channel name, and click .
8. The query is displayed in the right pane. Verify if the widget is mapped to the correct query. If not, correct the query and
click RUN. If the query is executed successfully, the query result is displayed:
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_sysinfra_netif_1d
opsb_sysinfra_node_1d
opsb_sysinfra_cpu_1d
opsb_sysinfra_disk_1d
opsb_sysinfra_filesys_1d
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_sysinfra_netif_1h
opsb_sysinfra_node_1h
opsb_sysinfra_cpu_1h
opsb_sysinfra_disk_1h
opsb_sysinfra_filesys_1h
1. In OPTIC Data Lake, go to the itom_di_postload_provider_default and look for the itom_di_postload_provider_default.ROLLUP_CO
NTROL table.
2. In the ROLLUP_Name column, look for the aggregation name (for example: system_infra_metric_node_1d ), then look at the LAST_EXE_TIME (to know the last run time) and the MAX_EPOCH_TIME (to know the time until when the records were aggregated).
Daily aggregations are expected to run every 4:45 minutes. If it's beyond this time, check for errors in the aggregation log and report the error.
3. Run the following command to view the log files:
/var/vols/itom/log-volume/<opsbridge-namespace>/<opsbridge-namespace>__<itom-di-dp-worker-dpl-podname>__dp-worker__<kub
ernetes worker node>
For example:
/var/vols/itom/log-volume/opsbridge-jugcl/opsbridge-jugcl__itom-di-dp-worker-dpl-9bd6b6964-2rfd2__dp-worker__btpvm0785.hpeswla
b.net
Verify the latest data is present in the following tables in the mf_shared_provider_default schema:
opsb_sysinfra_netif
opsb_sysinfra_node
opsb_sysinfra_cpu
opsb_sysinfra_disk
opsb_sysinfra_filesys
1. In OPTIC Data Lake, go to the itom_di_postload_provider_default and look for the itom_di_postload_provider_default.ROLLUP_CO
NTROL table.
2. In the ROLLUP_Name column, look for the aggregation name (For example: system_infra_metric_node_1h ) and then look
for the LAST_EXE_TIME (to know the last run time) and then look for the MAX_EPOCH_TIME (to know the time till when the
records were aggregated).
Daily aggregations are expected to run every 12 minutes. If it's beyond this time check for errors in the aggregation log
and report the error.
3. Run the following command to view the log files:
Go to the respective pod directory in the NFS volumes:
cd <log-volume>/<suite namespace>
ls <suite namespace>__itom-di-postload-taskexecutor-<pod value>__postload-taskexecutor__<node name>
For example:
/var/vols/itom/opsbvol2/opsb-helm/opsb-helm__itom-di-postload-taskexecutor-85d9b6fbcc-r8mjc__postload-taskexecutor__btp-hvm00
779.swinfra.net
Verify the latest data is present in the following tables in the mf_shared_provider_default schema :
opsb_agentless_netif
opsb_agentless_cpu
opsb_agentless_disk
opsb_agentless_filesys
opsb_agentless_node
Data is present in raw tables, but the copy script hasn't run
Copy scripts copy data from the raw tables to the respective aligned sysinfra tables. In the Vertica database, go to the opsbridge_store schema. In the opsb_internal_reports_schedule_config_1h table, look for:
In the subsequent run of the custom task script, you will find detailed logs in the log file of the respective scripts
with more details like the query runs and the number of records updated.
If the copy scripts were run, and there are no errors in the log files, check for Collection or configuration errors.
Data isn't present in raw tables, the issue is with OPTIC DL pods
Pod: itom-di-postload-taskcontroller
Containers: itom-di-postload-taskcontroller-cnt, kubernetes-vault-renew
Description: The task controller takes as input a list of task flows from the administration pod and schedules the eligible tasks for execution. It then sends the task and its associated payload as messages onto the configured task topic in the OPTIC DL Message Bus.
Command to describe the pod: kubectl describe pod <itom-di-postload-taskcontroller-POD> -n <opsbridge-namespace>
Pod: itom-di-postload-taskexecutor
Containers: itom-di-postload-taskexecutor-cnt, kubernetes-vault-renew
Description: The OPTIC DL Message Bus consumers running in the task executors read these messages from the task topic and trigger the tasks for execution. The task executor then sends the task execution status back to the task controller onto the status topic.
Command to describe the pod: kubectl describe pod <itom-di-postload-taskexecutor-POD> -n <opsbridge-namespace>
Pod: itom-di-receiver-dpl
Containers: itom-di-receiver-cnt, kubernetes-vault-renew
Description: Receives the data from the source and passes it to the OPTIC DL Message Bus.
Command to check health: kubectl describe pod <itom-di-receiver-dpl-POD> -n <opsbridge-namespace>
Pod: itom-di-administration
Containers: itom-di-administration, kubernetes-vault-renew
Command to check health: kubectl describe pod <itom-di-administration-POD> -n <opsbridge-namespace>
Pod: itomdipulsar-broker
Containers: certificate-renew, itomdipulsar-broker
Command to check health: kubectl describe pod <itomdipulsar-broker-POD> -n <opsbridge-namespace>
Pod: itom-di-scheduler-udx
Containers: certificate-renew, itom-di-udx-scheduler-scheduler
Description: Responsible for getting data from the OPTIC DL Message Bus and streaming it to Vertica.
Command to check health: kubectl describe pod <itom-di-scheduler-udx-POD> -n <opsbridge-namespace>
Pod: itom-di-metadata-server
Containers: itom-di-metadata-server, kubernetes-vault-renew
Description: Responsible for metadata configuration (table creation).
Command to check health: kubectl describe pod <itom-di-metadata-server-POD> -n <opsbridge-namespace>
Pod: itomdipulsar-zookeeper
Containers: certificate-renew, itomdipulsar-zookeeper
Command to check health: kubectl describe pod <itomdipulsar-zookeeper-POD> -n <opsbridge-namespace>
2. Execute the following scripts to list the OPTIC DL Message Bus topics:
pulsar@itomdipulsar-bastion-0:/pulsar/bin> ./pulsar-admin topics list-partitioned-topics public/default |grep agentless
"persistent://public/default/opsb_agentless_filesys"
"persistent://public/default/opsb_agentless_node"
"persistent://public/default/opsb_agentless_disk"
"persistent://public/default/opsb_agentless_cpu"
"persistent://public/default/opsb_agentless_generic"
"persistent://public/default/opsb_agentless_netif"
2. Execute the following consumer command to check the data in the OPTIC DL Message Bus topic:
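The exact consumer command isn't reproduced here; as an illustration, a check against one of the agentless topics with the standard Pulsar client could look like this (run from the same bastion pod; the subscription name is arbitrary):
pulsar@itomdipulsar-bastion-0:/pulsar/bin> ./pulsar-client consume -s "troubleshoot-check" -n 1 persistent://public/default/opsb_agentless_cpu
If data is flowing into the topic, the command prints a received message and then exits.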
Cause
There must be some error in the SiteScope collection or configuration.
Solution
To resolve this issue, follow these solutions in the same order and check if the data appears on the report.
Check if the certificate exchange between OPTIC Data Lake and OBM is successful
1. Log on to OBM
2. Run the following command: /opt/OV/bin/ovcert -list
Perform the following steps to check if you have installed Operations Agent:
Option 1
Run the following commands if you want to install and integrate Operations Agent (on a SiteScope server) with OBM:
cd /opt/OV/bin
./ovcert -certreq
If the master node (OBM node) is in HA, run the following command: ./oainstall.sh -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <OMT master node FQDN>
If the master node (OBM node) is in HA, run the following command: cscript oainstall.vbs -i -a -s <HA_VIRTUAL_IP's FQDN> -cert_srv <OMT master node FQDN>
Option 2
If Operations Agent is already installed, follow the steps to integrate Operations Agent with OBM:
Check if the monitors are enabled and if you have added the OPTIC
Data Lake tag
See task 5: Enable Monitors
2. Go to /var/opt/OV/log . Check the system.txt file for errors. Fix the errors.
OPTIC DL Postload Processor: taskcontroller-logback-cm, taskexecutor-logback-cm
Cause
This issue may occur due to the following reasons:
Solution
1. If the task flows are configured but not running, follow the steps described in Data Processor Postload task flow not
running.
2. If the task flows aren't available, follow the steps described below to resolve the issue on a fresh installation. If you see
this issue in a running setup, or if the issue persists after following the steps below, contact Software Support.
Important
The steps described below may lead to data loss as database tables will get formatted.
Example:
d. Run the following command to uninstall the previous version of the content :
Example:
e. Run the following command to install the uploaded content with the incremented version:
Example:
f. Verify if the task flows are available after installing the content.
Cause
This issue may also occur if the log level was changed from the default ERROR to INFO and not reverted to ERROR after analyzing the log. The task flows then lock the log files, and the locked files in the log location prevent the new process from adding entries to the log file. As a result, no data gets processed even though the status of the task flow is RUNNING.
Solution
Perform the following:
1. On the OPTIC Data Lake Health Insights dashboard: Check the status of the task flows and note down the taskflowId and taskId of tasks that are running for a long time. If you don't see the task flows listed on the dashboard, refer to the steps described in Task flows are not listed on the OPTIC DL Health Insights dashboard.
2. On the master (control plane) node: Check if the same tasks are running for a long time:
Set the logging level to ERROR: If any processes (within the long-running tasks) are running, and if the task
flows aren't completed even after an hour, check the logging level and set it to ERROR.
3. Check if the aggregate data flow is not working. See Troubleshoot Forecast/Aggregate data flow.
Note
If there are many taskexecutor pods, repeat this for each instance of the taskexecutor that's running on the
node.
3. Run the command to get the list of all the processes that are running in the taskexecutor:
ps -ef
4. For all the taskId/s that you noted down from the OPTIC Data Lake Health Insights dashboard, check if the corresponding
tasks (processes within the tasks) are running on any of the taskexecutor pods.
5. If the process for the above task is present in the list, then perform the following (a combined sketch of steps 3 to 5 appears after this procedure):
For example:
If you want to check the logging level for an agent netif task flow, go to /mnt/itom/postload/conf/reports/agent_infra. You will find the following files:
Open the agtnetiflog4perl.conf file and verify that the value is set to ERROR:
4. Terminate the process for which you removed the locked files:
kill -9 <PID of the task>
The process restarts automatically.
For example:
To terminate the opsb_agtnetif.pl process:
1. Run the ps -ef command and get the process id (PID) of the opsb_agtnetif.pl process.
2. Run the command: kill -9 20787
3. The process disappears from the list of running processes in taskexecutor pod and starts automatically in
the next schedule.
5. Verify that the new process for the task triggers per the schedule:
To verify that the process has started in the next schedule, check that the killed process (for example: opsb_agtnetif.pl) is shown in the output of the ps -ef command.
Check the database and make sure that the data aggregations are running and data is updated. For example, you can check if the netif data aggregations are running and data is updated.
If the copy scripts were not run as per the schedule (for more than 30 minutes), then see Troubleshoot Forecast/Aggregate data flow.
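As a sketch of steps 3 to 5 above, the pod name, namespace, and PID below are placeholders; the log4perl file path and the opsb_agtnetif.pl process come from the example in this topic:
# List the taskexecutor pods and open a shell in one of them
kubectl get pods -n <suite namespace> | grep taskexecutor
kubectl exec -it <taskexecutor pod name> -n <suite namespace> -- /bin/bash
# Inside the pod: check whether the task process noted on the Health Insights dashboard is still running
ps -ef | grep opsb_agtnetif.pl
# Check the logging level configured for the agent netif task flow
grep -i ERROR /mnt/itom/postload/conf/reports/agent_infra/agtnetiflog4perl.conf
# After removing the locked log files, terminate the hung process; it restarts on the next schedule
kill -9 <PID of the task>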
Related topics
To troubleshoot Aggregate table has missing or no data, see Aggregate table has missing or no data.
To troubleshoot Aggregate not happening after upgrade, see Aggregate not happening after upgrade.
ERROR 3587: Insufficient resources to execute plan on pool itom_di_postload_respool_provider_default [Timedout waiting for resource request: Request exceeds limits: Memory(KB) Exceeded: Requested = 13895, Free = 0 (Limit = 6918675, Used = 7264791) (queueing threshold)]
Cause
This issue occurs because the memory allocated to the postload resource pool (default value: 25%) isn't enough for the ongoing streaming.
Solution
Perform the following steps to increase resource pool memory:
Get the Limit and Used memory details from the error message. The additional memory required = Used - Limit.
For example: 7264791 - 6918675 = 346,116 KB (about 338 MB required)
If the Vertica memory is 32 GB RAM and the required additional memory is 338 MB, you must increase the resource pool memory by 1%.
If the required increase is more than 5% of the Vertica memory, check the sizing calculator and contact Support to validate the size.
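As a sketch, assuming you can connect to the Vertica database with vsql (the pool name comes from the error message above; validate the target percentage with the sizing calculator or Support before changing it, because your deployment may manage this pool through its own configuration):
# Check the current memory settings of the postload resource pool
vsql -U <dbadmin user> -w <password> -c "SELECT name, memorysize, maxmemorysize FROM resource_pools WHERE name = 'itom_di_postload_respool_provider_default';"
# Illustration only: raise the pool memory by the calculated percentage (for example, from 25% to 26%)
vsql -U <dbadmin user> -w <password> -c "ALTER RESOURCE POOL itom_di_postload_respool_provider_default MEMORYSIZE '26%';"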
You should use the COSO_tenant.properties file to configure the tenant ID when SiteScope is the data source.
The COSO_tenant.properties file is protected with file permissions; only users with specific permissions can access it.
On Windows:
If the SiteScope service is running with a local system account, any user in the Administrators group will be able to
modify this file.
If SiteScope is running with a user account, only that specific user will be able to modify this file.
On Linux:
The root user and the user who is running the SiteScope service can modify this file.
Cause 1
COSO_tenant.properties file isn't accessible
Solution 1
Make sure you have the required permissions.
Cause 2
The warning message tenant_id is not configured is logged in the error.log file if a tenant_id isn't configured or if the tenant_id has more than 80 characters.
Solution 2
Follow the steps:
1. Configure the tenant_id in the COSO_tenant.properties file, as shown in the sample after these steps, and make sure that you do not exceed 80 characters.
On Windows: <SITESCOPE_HOME>\templates.applications\COSO_tenant.properties
On Linux: /opt/HP/SiteScope/templates.applications/COSO_tenant.properties
2. Restart SiteScope.
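A minimal sample entry, assuming the property key is tenant_id as referenced by the warning message; the value is a placeholder and must not exceed 80 characters:
# COSO_tenant.properties
tenant_id=<your tenant ID>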
Error: 400 BAD_REQUEST "additionalProperties - $.metadata.lastUpdatedBy: is not defined in the schema and the schema does not allow additional properties"
Cause
The lastUpdatedBy or lastUpdatedDate field from the metadata section isn't removed.
Solution
Remove the lastUpdatedBy or lastUpdatedDate field from the metadata section, as in the example below.
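For illustration only, assuming a request body whose metadata section still carries the read-only fields; the name and version fields below are placeholders, not part of the original example:
Rejected payload:
"metadata": { "name": "<content name>", "version": "<content version>", "lastUpdatedBy": "admin", "lastUpdatedDate": "2024-01-01T00:00:00Z" }
Accepted payload:
"metadata": { "name": "<content name>", "version": "<content version>" }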
Name: agent-collector-sysinfra-2760
metric:
  timestamp: 1641959364
  durationInMilliSec: 236774
discovery:
  timestamp: 1641958216
  durationInMilliSec: 1171
Cause
The suite is running slowly or not responding.
Solution
Cause
You may get the following error: An SSL connection IO error has occurred. This may be due to a network problem or an SSL handshake error. Possible causes for SSL handshake errors are that no certificate is installed, an invalid certificate is installed, or the peer doesn't trust the initiator's certificate.
This error occurs if you change the ASYMMETRIC_KEY_LENGTH of the Operations Agent on the OBM server from 2048 to 4096 and do not change the ASYMMETRIC_KEY_LENGTH on the DBC.
Solution
Update the ASYMMETRIC_KEY_LENGTH on DBC.
Note
The Data Broker Container (DBC) is an Operations Agent node that's managed by OBM. It enables the Agent Metric Collector to
communicate with OBM and receives certificate updates.
Perform the following steps to apply the configuration changes on the DBC (Operations Agent node):
1. Run the following command to set the new key length:
ovconfchg -ns sec.cm -set ASYMMETRIC_KEY_LENGTH <RSA Encryption algorithm supported key length>
2. Remove the existing node certificate on the agent (see the sketch after these steps for one way to do this).
3. To request a new node certificate from the management server, run the following command:
ovcert -certreq
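The certificate removal commands are environment specific. As a sketch, one common way on the DBC node is to list the installed certificates and remove the node certificate by its alias before requesting a new one; the alias shown by ovcert -list is typically the agent core ID:
# List the installed certificates and note the alias of the node certificate
ovcert -list
# Remove the existing node certificate by its alias
ovcert -remove <certificate alias>
# Then request a new certificate from the management server (step 3)
ovcert -certreq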
Symptom
The content upload failed with the status Completed with errors, or the tables in the mf_shared_provider_default schema aren't populated completely.
Solution
Perform the following to reinstall the content. To configure the CLI, see Administer AMC with CLI.
1. Run the following command in the folder containing the ops-content-ctl tool to list the Name, Version, and Status of all
the content files.
On Linux:
On Windows:
2. Run the following command in the folder containing the ops-content-ctl tool to download the failed content.
On Linux:
On Windows:
Note
3. Before uploading the content zip file, increase the version number of the downloaded content. For example:
On Linux:
mv OpsB_SysInfra_Content_2021.05.003.zip OpsB_SysInfra_Content_2021.05.003.001.zip
On Windows:
4. Run the following command in the folder containing the ops-content-ctl tool to upload the renamed content.
On Linux:
On Windows:
5. Run the following command in the folder containing the ops-content-ctl tool to reinstall the uploaded content.
On Linux:
On Windows:
Note
For version 2021.08 and later versions, you can run the following command in the folder containing the ops-content-ctl tool to force install the content.
For example,
Cause
The issue occurs because the server takes longer than 60 seconds to respond to POST/PUT API calls for CMDB Content.
Solution
To resolve the issue, run the following helm upgrade command with an additional setting to increase the client timeout:
helm upgrade <helm deployment name> <chart> -n <application namespace> --reuse-values --set opsbcontentmanager.contentmanager.config.defaultHTTPTimeout=180
Problem
The targets are not displayed for the filesystem and node types in the SysInfra content because, in the field group, forecast is selected by default.
Solution
1. Launch URL: https://<external_access_host>:<external_access_port>/ui/
2. Log in with your IDM username and password.
3. Select Operations > Performance & Analysis > Troubleshoot Performance from the left side navigation.
1.20.27. ProducerBlockedQuotaExceededException error in DI receiver logs
Problem
Data ingestion POST requests sent to the receiver fail with the HTTP status code 429 and the following error appears in the receiver log:
ProducerBlockedQuotaExceededException
Cause
The receiver returns the HTTP status code 429 when the number of requests sent to it exceeds the configured throttle limit, or when one of the components in the data flow pipeline triggers backpressure.
Solution
Perform these steps to resolve this issue:
1. Configure backlog quota for each topic in the Message Bus. To get the backlog quota value of topic configured on your
setup, run the following command:
helm get values <pulsar-release-name> -a -n <namespace> | grep backlogQuotaTopicDefaultLimitSize
2. If the backlog quota configured is 200 MB and the topic has 3 partitions, the backlog quota set for each partition is
200/3 = 66 MB. You may check the backlog quota value of a topic partition, using the following command:
kubectl exec -it itomdipulsar-bastion-0 -n <namespace> -c pulsar -- /bin/bash -c "bin/pulsar-admin namespaces get-backlog-quotas public/default"
Note: Replace <pulsar-release-name> and <namespace> with values corresponding to your deployment.
Messages sent to the receiver will be rejected if the backlog quota has been exceeded on the topic set in the header of the request. The following message appears in the receiver logs:
The backlog quota of the topic persistent://public/default/<partition-name> that the producer <producer-name> produces to is exceeded
The backlog quota indicates the maximum backlog a subscription can have on a topic partition. The earlier message indicates that one or more consumers on the partition aren't consuming messages at the same rate at which they're ingested. Do the following steps to identify the subscription that's causing the backlog:
Go to the ITOM DI / Pulsar - Topic dashboard, choose the affected partition from the Topic filter, and pick the time range of the observation.
Check the list of subscriptions for affected partitions in the Local msg batch backlog panel.
Note: Scheduler creates subscriptions that have the following pattern:
<partition-id>_<scheduler-schema-name>_<topic-name>
For example:
0_itom_di_scheduler_provider_default_demotopic
If the subscription that caused the backlog wasn't created by the scheduler but was created for testing, troubleshooting, or any other purpose, follow up with the respective owner to check whether these are valid subscriptions and identify their purpose. If these are test subscriptions, delete them (see the sketch after this list for one way to list and delete subscriptions).
If the subscription belongs to the scheduler, check if the itom-di-scheduler-udx pod and Vertica are up and running. Also, see the following OPTIC DL Message Bus troubleshooting scenarios to resolve the symptoms that caused the backlog issue:
The itomdipulsar-bookkeeper pods are not accessible
Vertica Streaming Loader dashboard panels have no data loaded
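As a sketch of listing and removing a non-scheduler subscription from the bastion pod, the topic and subscription names are placeholders; delete only subscriptions you have confirmed are test subscriptions:
# List the subscriptions on the affected partition
kubectl exec -it itomdipulsar-bastion-0 -n <namespace> -c pulsar -- bin/pulsar-admin topics subscriptions "persistent://public/default/<partition-name>"
# Delete a confirmed test subscription
kubectl exec -it itomdipulsar-bastion-0 -n <namespace> -c pulsar -- bin/pulsar-admin topics unsubscribe "persistent://public/default/<partition-name>" -s <subscription-name>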
Solution
If the scripts were run and there are no errors in the aggregate log files:
1. If the query does not return any data, then follow these steps:
1. Check if data from the monitored nodes is sent to the raw tables (opr_hi_status and opr_kpi_status tables) from the integrated sources (OBM). If data is sent, continue with the following steps; otherwise, check whether you have configured the data sources correctly. See Configure reporting.
2. The flow or sequence to check for the data is:
HI Duration data flow across tables:
opr_hi_status > opr_hi_duration_1h > opr_hi_duration_1d
HI Severity data flow across tables:
opr_hi_status > opr_hi_severity_1h > opr_hi_severity_1d
KPI Status data flow across tables:
opr_kpi_status > opr_kpi_status_1h > opr_kpi_status_1d
2. If there is no data in the opr_hi_duration_*, opr_hi_severity_*, and opr_kpi_status_* tables, then check the aggregation scripts.
3. In the database, go to the mf_shared_provider_default schema. In the opsb_internal_service_health_schedule_config_1d and opsb_internal_service_health_schedule_config_1h tables, look for the following columns (a sample query is shown after this list):
Note
Aggregation scripts copy data from the raw tables to the respective *_1h aggregate tables. Daily aggregation copies data from the hourly tables to the respective daily tables. Aggregation scripts run once every 60 minutes.
tablename: displays the source table for which the data is getting copied.
processed_time: displays till when data was considered for aggregation.
execution_time: displays the last time the copy script was executed.
last_processed_epoch: displays the Vertica epoch value of the last processed row from the raw tables.
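For example, a quick way to review these columns is the query below; this is a sketch assuming vsql access to the Vertica database, with the schema, table, and column names taken from this topic:
# Review when each source table was last picked up by the hourly aggregation
vsql -U <dbadmin user> -w <password> -c "SELECT tablename, processed_time, execution_time, last_processed_epoch FROM mf_shared_provider_default.opsb_internal_service_health_schedule_config_1h ORDER BY tablename;"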
Note
After completing the analysis, change the log level back to ERROR. Retaining it as INFO may lead to log file locking issues and cause the task script to hang.
While the task script runs, you can view more details in the log file, including the queries run and the number of records updated.
6. Check if there is data in the raw tables. If there is no data in the service health raw tables (opr_hi* or opr_kpi*), then see Data and Task flow - Service health aggregation.
7. If the raw tables don't have data then see Data flow from OBM to raw tables.
Related topics
Data Processor Post-load task flow not running
Aggregate functionality is not working as expected
Reporting data and task flows
Aggregate not happening after upgrade
System Infrastructure Availability data is missing in reports
Cause
The DES service is unable to update the downtime and CI enrichment fields.
Solution
To resolve the issue, follow the steps below:
Ensure the rules and topology for the content are available in DES
Log into the DES pod and execute the following command:
curl --key /var/run/secrets/boostport.com/server.key --cert /var/run/secrets/boostport.com/server.crt --cacert /var/run/secrets/boostport.com/issue_ca.crt -H "Content-Type: application/json" -X GET https://<externalAccessHost>:30010/v1/itomdes/reconciliation | jq . | tee
Check that the topic name is available under the rules.
Check that the view name and attributes available under views in the previous result match the view name and CI attributes in RTSM.
Verify Redis
Connect to Redis CLI and verify the keys are present by running the below:
kubectl exec <cs-redis pod name> -ti -c <cs-redis container name> -n <namespace> -- bash
Example: kubectl exec itom-opsbridge-cs-redis-dc8948476-mmjsl -ti -c cs-redis -n opsbridge-suite -- bash
Get the Redis password using the get_secret command:
>get_secret $REDIS_PWD_KEY
<password of redis instance>
Connect to Redis CLI using the command:
>redis-cli --cert /var/run/secrets/boostport.com/cs-redis.crt --key /var/run/secrets/boostport.com/cs-redis.key --cacert /var/run/secrets/boostport.com/ca.crt -h cs-redis -p 6380 --tls -a <password of redis instance>
Check if the CI ID is present in Redis using the keys ciid* command.
To fetch the properties cached for the CI, execute the command:
HGETALL ciid<Required ID from Keys Result>
Verify RTSM
If the keys command doesn't list the CI ID, verify whether the RTSM view lists the CI.
If the Redis key is available but the search query record count is 0, verify whether the CI property values from the HGETALL result of the CI match the RTSM properties of the CI.
Cause
Incorrect infrastructure setting in OBM.
Solution
Do the following to resolve the issue:
Make sure the DES endpoint is https://<externalAccessHost>:30010/v1/itomdes, if you are using Classic OBM.
Make sure the DES endpoint is https://itom-opsbridge-des-svc:40009/v1/itomdes, if you are using Containerized OBM.
Solution 1
If there is no data in the tables, check the data flow in the cmdb tables. If the data is up to date, then check the logs in the custom script log file.
Solution 2
If there are no errors in the logs, change the log level to INFO and verify the log messages in the next run.
In the next task script run, the respective script log files will contain detailed logs, including the queries run and the number of records updated.
Note
Make sure you change the log level back to ERROR after completing the analysis. Leaving it at INFO leads to log file locking issues and may cause the task script run to hang.
Cause
In the case of external OBM, the Data Enrichment Service collects CIs from RTSM at fixed intervals provided with the ciCollectionIntervalMin suite configuration. By default, the interval is 60 minutes. So any CIs discovered in OBM take a maximum of 60 minutes to become available for enrichment in DES. The same applies to CIs that are redeployed and resolved to a new CI ID.
Solution
If you need to enrich the metrics with cmdb_id immediately, set ciCollectionIntervalMin to a smaller interval. The configuration can be updated in the AI Operations Management values.yaml. Edit the ciCollectionIntervalMin value under itomopsbridgedes > des > cicache in values.yaml and update the chart (see the sketch after the snippet below). For more information, see Configure values.yaml.
Note
Lowering the interval will add an additional load on the RTSM scans from DES.
itomopsbridgedes:
  des:
    cicache:
      # When external OBM is configured, Data enrichment Service will collect CIs from RTSM at fixed interval provided with ciCollectionIntervalMin. Value should be provided in minutes.
      ciCollectionIntervalMin: "60"
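One way to apply the change is sketched below, following the helm upgrade convention used elsewhere in this guide; the release, chart, and namespace values are placeholders:
# Apply the edited values.yaml to the existing deployment
helm upgrade <helm deployment name> <chart> -n <application namespace> --reuse-values -f values.yaml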
Cause
This is because the user role doesn't have the right privileges for the service used. While you deploy the application, make sure that the users are mapped to the correct roles during user creation.
Solution
Perform these steps to resolve this issue:
Make sure the user that you use for token authorization has the appropriate role:
For the OPTIC Data Lake Administration API calls, the user must have di_admin role.
For the OPTIC Data Lake HTTP Receiver API ingestion calls, the user must have di_ingestion role.
For OPTIC Data Lake Data Access API calls, the user must have di_data_access role.
For steps to create these users and assign roles, see OMT documentation. Set the correct role and perform the API call.
{ "statusCode": 401, "reasonPhrase": null, "errorCode": "3001", "errorSource": "Data Access", "message": "Authorization Error"
,
"details": "Failed to authorize request", "recommendedActions": "Please check if the valid token/cert is provided", "data": null, "
nestedErrors": null
}
Cause
This issue occurs because of an expired token. The token has a limited lifetime; the default is 30 minutes.
Solution
To resolve this issue, you must refresh the IdM token. Send the following content to https://<HOST>:<PORT>/idm-service/v3.0/tokens to refresh the token ID (see the example request after the payload):
{
  "refresh_credentials": { "refresh_token": "<REFRESH_TOKEN>" }
}
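For example, a sketch of the refresh request with curl; the host, port, and refresh token are placeholders, and -k skips certificate verification for illustration only:
curl -k -X POST "https://<HOST>:<PORT>/idm-service/v3.0/tokens" -H "Content-Type: application/json" -d '{"refresh_credentials":{"refresh_token":"<REFRESH_TOKEN>"}}'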
Logging details
JAR Name Log File Name Log Location
By default, the log level is set to DEBUG, but you can change it to ERROR, WARN, or INFO in the log4j2.properties file present in the <jar directory>.
Cause
This error occurs when the <Datasource.xlsx> or Configuration.xlsx files are open while running the CMI tools.
Solution
Close the <Datasource.xlsx> and Configuration.xlsx files before running the tools.
Cause
This error may occur when you run the exceltojson_schema tool if there is a typo in the table names in <Datasource.xlsx> or Configuration.xlsx.
For example: [ERROR]: Configuration not found for 'opsb_weblogic_cluster_status' in the configuration sheet 'Schema-Raw'
Solution
Correct the name in <Datasource.xlsx> and Configuration.xlsx.
Cause
This error may occur when you run the exceltojson_schema tool if there is no input or invalid input for a field in
the Datasource.xlsx or Configuration.xlsx.
For example: [ERROR]: 'type' has invalid input at row 3 in the sheet 'opsb_weblogic_cluster_status'
Solution
Correct the values and run the tool again.
For example: [ERROR]: 'groupByFields' has invalid input at row 3 in the configuration sheet 'Aggregation', 'instance_namee_id' metric id doesn't exist in sheet 'opsb_ad_ws_servcoll'
Solution
Make sure that the id column value in the table (for example, the opsb_ad_ws_servcoll table) in <Datasource.xlsx> matches the value given in the corresponding column (for example, the groupByFields column) of the sheet (for example, the Aggregation sheet) in Configuration.xlsx.
Cause
The data type entered for the respective field in .xlsx isn't correct.
Solution
Correct the values and run the tool again.
Solution
Dump the Coda data object to a text file and give the text file as a source to the CodaToExcel binary using the -s parameter.
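As an illustration only, assuming the dump is produced with the Operations Agent ovcodautil utility; the data source name, file path, and binary location are placeholders, and the exact dump command for your case may differ:
# Dump the Coda data source to a text file
/opt/OV/bin/ovcodautil -dumpds <datasource name> > /tmp/coda_dump.txt
# Pass the text file to CodaToExcel as the source
./CodaToExcel -s /tmp/coda_dump.txt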
1. Log in to the OBM RTSM using the Local Client. Ensure that the Local Client is installed on the OBM server. For more
information, see Use Local Client.
2. Go to Users and Groups.
3. Select the user and click Reset Password. Enter and confirm the new password.
4. In the Roles tab, check that both of the following are shown:
Inherited Role contains "SiteScope Integration Roles"
Parent Groups contains "OBM integration admins"
Cause
This issue is seen in classic OBM integrations with AI Operations Management installed on GCP. The data forwarding fails due
to incorrect OPTIC Data Lake hostnames populated by obm-configurator.jar file.
Solution
1. In the AI Operations Management installation, collect the helm values for service names from the values.yaml file.
2. In the classic OBM, go to Administration > Setup and Maintenance > Infrastructure Settings and select OPTIC DL.
3. Under OPTIC DL - Settings, update the following services with the data collected from the values.yaml file:
4. Go to Administration > Setup and Maintenance > Connected Server > COSO Data Lake and update the Fully
qualified domain name with the Data Enrichment Service Receiver Endpoint for connecting to OPTIC DL.
3. Select the duplicate CIs, right-click, and select Merge CIs from the context menu. As the merge target, select the CI
that has the FQDN as a primary domain name.
Before you contact support, run the support toolset (for OPTIC Management Toolkit issues) or the diagnoseIdM command (for
IdM issues) to collect diagnostic information that will help support.
Support toolset
OMT provides a support toolset that helps to collect the following information about Containerd, Kubernetes, suites,
commands, directories, and files:
You can view the summary information on a console, and view detailed output information in an encrypted .tar file.
cd $CDF_HOME/tools/support-tool
Note
Example usage
Run the following command to create a dump file with a default file name in a default directory:
# ./support-dump
Run the following command to create a dump file with a specified file name in a specified directory (for example, create
a dump.aes file in /var/test):
./support-dump -c /var/test/dump.aes
Run the following command to create a dump file with a specified user name and password. For example, create a dump file with a default file name in a default directory with the password abcdef, and connect to the suite-installer with admin as the user and 123456 as the password.
Caution
By selecting the disable output file encryption option, you are disabling or bypassing security features, thereby exposing the
system to increased security risks. By using this option, you understand and agree to assume all associated risks and hold
OpenText harmless for the same.
The support toolset provides a configuration file (conf/supportdump.config) that includes some predefined [commands], [files],
and [dirs] to specify information collection details. You can define your own [commands], [files], and [dirs] in this configuration
file. Additionally, you can create other configuration files in the same directory. When using the configuration files, pay
attention to the following:
The output of the same command will be saved into one file. For example, all the output of the cat command will be
saved to the cat.out file.
All directories, files, and output of commands are stored in the <local_ip>-<NodeType>/os directory.
Wildcard characters can be used in a file name and directory name. For example, /etc/sysconfig/network-scripts/ifcfg-*
Single environment variable is supported. For example, ${CDF_HOME}/log .
A file or files (separated by spaces) following a directory will be excluded from the support toolset collection.
Note
Example usage
The support toolset collects all files and directories in ${CDF_HOME}/cfg except the *_User.json file:
${CDF_HOME}/cfg *_User.json
Dump file
The default support dump file is: dmp/support_data_YYYYMMDD-hhmmss.aes. The dump file contains the support_data_YYYYMMDD-hhmmss.log file of the running support toolset and the ITOM_Core_Platform directory for the dump files. The following table describes the dump files in the ITOM_Core_Platform directory.
The directory of container information and user defined information on the current node.
workload:
deployment: suite related data, for example, suite feature, embedded suite database data
##############################################
OMT - Support Data Export
----------------------------------------------
Containers in k8s.io namespace
Export: containers.out
Comments: on Master node myhost.mycomany.com
----------------------------------------------
CONTAINER IMAGE RUNTIME
0318347028899bdd7fb30af24ca5fda6a1433a4532519adc39e1a63cc2191a02 itom-image.registry.example.com:port/hpeswitomsandbox/kubernetes-vault-init:0.15.0-0019 io.containerd.runc.v2
04880158976e091d2ff6efa8da436f90104779f289a7a587a982e92df8e85145 itom-image.registry.example.com:port/hpeswitomsandbox/opensuse-base:15.3-002 io.containerd.runc.v2
......
----------------------------------------------
Nodes
Export: kube_summary.out
----------------------------------------------
NAME STATUS ROLES AGE VERSION
myhost.mycomany.com Ready control-plane,master,worker 8h v1.21.4
----------------------------------------------
Pods
Export: kube_summary.out
----------------------------------------------
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
core apphub-apiserver-5b555c6896-jftc8 2/2 Running 0 8h 172.16.0.23 myhost.mycomany.com <none> <none>
core apphub-ui-54959bb964-m6cx6 2/2 Running 0 8h 172.16.0.22 myhost.mycomany.com <none> <none>
......
----------------------------------------------
POD Containers
Export: containers_by_pod.out
----------------------------------------------
NAMESPACE POD NODE IMAGE CONTAINER CONTAINER_ID
core apphub-apiserver-5b555c6896-jftc8 myhost.mycomany.com itom-image.registry.example.com:port/hpeswitomsandbox/apphub-apiserver:1.1.0-49 apphub-apiserver 701a4434b5b6
core apphub-apiserver-5b555c6896-jftc8 myhost.mycomany.com itom-image.registry.example.com:port/hpeswitomsandbox/kubernetes-vault-renew:0.15.0-0019 kubernetes-vault-renew fe298a393acb
......
----------------------------------------------
Suite Deployment
Export: suite_features.out
----------------------------------------------
SUITE VERSION NAMESPACE DEPLOYMENT_STATUS INSTALL_DATE NFS_SERVER NFS_OUTPUT_PATH
demo 2021.11.001 demo-bheco INSTALL_FINISHED null null
----------------------------------------------
Suite Features
Export: suite_features.out
----------------------------------------------
SUITE EDITION SELECTED FEATURE_SET FEATURE
demo <<EDITION_EXPRESS>> true <<FS1_NAME>> <<FS1_DESC>>
<<FS2_NAME>> <<FS2_DESC>>
<<FS3_NAME>> <<FS3_DESC>>
<<EDITION_PREMIUM>> false
<<EDITION_ULTIMATE>> false
diagnoseIdM command
The diagnoseIdM command is a subcommand of the idm.sh script. It collects diagnostic information about IdM, including:
Diagnostic information is saved to a file in the /idmtools/idm-installer-tools/ directory. To run the command, follow these steps:
kubectl exec -it $(kubectl get pod -n <namespace> -ocustom-columns=NAME:.metadata.name |grep idm|head -1) -n <namespace> -c idm sh
For example:
kubectl exec -it $(kubectl get pod -n core -ocustom-columns=NAME:.metadata.name |grep idm|head -1) -n core -c idm sh
sh /idmtools/idm-installer-tools/idm.sh diagnoseIdM
To collect diagnostic information about the SAML configuration (SAML metadata, and the status of the signing and encryption certificates), run the following command:
For example: