NTFC NM HealthCheck Report
Check Report
Basic Information
Check Item Type: Reliability, NCE Service, Published Precaution Notices, OS, DB, Network, Hardware
Note: The Start Time and End Time use the server time.
Environment Information
Hostname: NMS-Server
OS: 4.18.0-147.5.2.19.h1152.eulerosv2r10.x86_64
Number of CPUs: 48
Number of deployed microservices: 382
Number of running microservice processes: 97
Maximum number of manageable NEs: 15K
Number of managed NEs: 797.0000
Management capacity consumption: 5.31%
file:///C:/Users/ALAGUR~1.MUT/AppData/Local/Temp/Rar$EXa6792.2022/task_1685262689960_NMS_Server_10.0.0.182/report.html 1/30
6/5/23, 2:21 PM Check Report
NCE Protection Solution: DR System
Historical Patch Records:
Source Version: V100R020C10SPC300 Target Version: V100R020C10CP3607 Change Time: 2022-02-22 16:07:52
Source Version: V100R020C10CP3607 Target Version: V100R021C10SPC202 Change Time: 2023-05-20 09:27:22
Total 83 check items No go 0 Risky 2 Timed out 0 Failed 0 Manually confirmed 2 Normal 79
1124 Risky Check whether the NCE deployed NetStar O&M components
7253 Risky Check the size of the unused rtsp directory
1147 Manually confirmed Checking for NemgrV8transService Duplicate MO ID
7303 Manually confirmed Checking the Loading Status of AOC Driver Packages
1061 Normal Check for duplicate data of transport NEs
1078 Normal Checking for Duplicate MO ID
1100 Normal Checking AsonSdh for Duplicate MO ID
1122 Normal Check whether the SSL certificate is missing
1135 Normal Check whether any field is missing in the ClassToTableMap table.
1140 Normal Checking AsonOtn for Duplicate MO ID
1155 Normal Check whether the number of NE script directories is too large.
1156 Normal Checking for NemgrV8transService nesvc_v8transDB_1.tTENE Duplicate cNEID
1165 Normal Check whether the electrical cross-connection data of the VRP NE is normal.
6030 Normal Check whether the XMLAgent certificate type matches the certificate file.
7004 Normal Check the host name configuration file
7005 Normal Check whether the other user has the read permission on files in the /etc directory
7022 Normal Check whether multiple default routes are configured for the OS
7023 Normal Check whether temporary routes are configured for the OS
7028 Normal Check whether the directories of the ossadm, ossuser, and dbuser users exist in the /home directory.
7033 Normal Check whether the swap partition usage of the OS is normal
7045 Normal Check whether the OS user has non-root privileged accounts and whether users with duplicate UIDs exist
7147 Normal Check whether the Zenith database user password has expired
7195 Normal Check the owner and group of key file and directory
7267 Normal Check the owner group of files in the Zenith database directory
7269 Normal Check NCE environment variables
7281 Normal Check the SSH trust relationship of the ossadm user between node 0 and all nodes
7294 Normal Check the Key Configuration Files of the Operation System
7296 Normal Check whether there are running tasks in the background
7297 Normal Check whether the user UID and user group GID are occupied.
2198001 Normal Check whether the IP address obtained by the southbound network card in the deployment parameters is empty and is consistent with the uniIpOrNic value stored in the database
2198002 Normal Check whether lvs-float-ip in the deployment parameter is empty and whether the value of lvs-float-ip is 127.0.0.1
Level Risky
Result:
The NetStar O&M component has been deployed and needs to be upgraded along with the NCE upgrade. Current version: 21.1.610.B009
Suggestion:
Please refer to the installation and upgrade guidance for the corresponding version of NetStar O&M to install the components.
To determine whether the NetStar O&M microservice exists, run the following Linux command: find /opt/oss/envs -name Product-NetStarOMService | grep -v grep | wc -l. If the microservice exists, the component is deployed on NCE, and the user is alerted that the component needs to be upgraded along with the NCE upgrade.
Level Risky
Result:
The unused size of the '${oss_root}/rtsp' directory exceeds the threshold (5 GB).
Suggestion:
Run the following command as the ossadm user on the node 0 (generally, OMP-01):
bash /${opt.dir}/${oss.dir}/manager/apps/SMPAgentService/pyscript/os/clear_rtsp.sh (replace $ with the actual path)
Result:
Suggestion:
If NE management processes have duplicate MO IDs, they fail to start. The tool performs the check as follows:
1. Execute the following SQL statement to query the database names of all NE management processes:
select OWNER from SYS.DB_TABLES where OWNER LIKE UPPER('nesvc_v8transDB_%') AND UPPER(TABLE_NAME) = UPPER('tTENE');
2. Perform the following operations for each NE management database. Execute the following SQL statement to query duplicate MO IDs. If results are displayed, a "No go" message is reported.
select cID, count(*) from MitOIDTableMap group by cID having count(cID)>1 order by cID;
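The duplicate MO ID query above can be tried out on sample data. The following is a minimal sketch, not the tool's implementation: it uses an in-memory SQLite database as a stand-in for the NE management database (an assumption), and the sample rows and the helper name find_duplicate_mo_ids are made up for illustration. Only the SQL statement is taken from the report.

```python
import sqlite3


def find_duplicate_mo_ids(conn):
    """Return (cID, count) rows for MO IDs that occur more than once."""
    # SQL statement copied from the check description.
    query = (
        "select cID, count(*) from MitOIDTableMap "
        "group by cID having count(cID)>1 order by cID;"
    )
    return conn.execute(query).fetchall()


# Hypothetical stand-in data: SQLite replaces the real database here.
conn = sqlite3.connect(":memory:")
conn.execute("create table MitOIDTableMap (cID integer)")
conn.executemany(
    "insert into MitOIDTableMap values (?)",
    [(1,), (2,), (2,), (3,), (3,), (3,)],
)

duplicates = find_duplicate_mo_ids(conn)
# Any returned row corresponds to the "No go" condition described above.
print("No go" if duplicates else "Normal", duplicates)
```

An empty result list corresponds to the passing case.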
Result:
Suggestion:
Check whether any driver packages failed to be upgraded by calling the RepoMgrService interface.
Level Normal
Result:
In the databases of NemgrTransService and NemgrV8transService, check for duplicate data of transport NEs, for example, NEs, ports, and shelves.
Level Normal
Result:
If NE management processes have duplicate MO IDs, they fail to start. The tool performs the check as follows:
1. Execute the following SQL statement to query the database names of all NE management processes:
select OWNER from SYS.DB_TABLES where OWNER LIKE UPPER('nemgr_transDB_%') AND UPPER(TABLE_NAME) = UPPER('tTENE');
2. Perform the following operations for each NE management database. Execute the following SQL statement to query duplicate MO IDs. If results are displayed, a "No go" message is reported.
select cID, count(*) from MitOIDTableMap group by cID having count(cID)>1 order by cID;
Level Normal
Result:
If the ason_sdh database has duplicate MO IDs, the NmlAsonSdhService process cannot be started.
Level Normal
Result:
Check whether the SSL certificates used by all gateway NEs that use SSL connections in the tTEEthGne table are missing on the NCE server
Level Normal
Result:
Check whether any field is missing in the ClassToTableMap table. If yes, the upgrade will fail.
Level Normal
Result:
If the ason_otn database has duplicate MO IDs, the NmlAsonOtnService process cannot be started.
Level Normal
Result:
Level Normal
Result:
If nesvc_v8transDB_1.tTENE has duplicate cNEIDs, some NE database backups will fail. The tool performs the check as follows:
1. Execute the following SQL statement to query duplicate cNEID. If results are displayed, a "No go" message is reported.
select cNEID, count(cNEID) from nesvc_v8transDB_1.tTENE group by cNEID having count(cNEID)>1;
Level Normal
Result:
Query all SDH services and check whether there are services that lack information such as routes. If no, the check item passes the check. Otherwise, the check item fails the check.
(1161) Check whether unique IDs of lower-layer SDH trails of trunk link trails exist in the SDH trail table.
Level Normal
Result:
Check whether the IDs stored in the cUniqueID field in the tTNMOEthSrv table exist in the tSDHTrail SDH trail table.
Level Normal
Result:
Check the alarms related to license resource exhaustion and display the alarm serial numbers.
Level Normal
Result:
Use SQL to check whether the electrical cross-connection data of the NE is normal.
(1165) Check whether the electrical cross-connection data of the VRP NE is normal.
Level Normal
Result:
Use SQL to check whether the electrical cross-connection data of the VRP NE is normal.
(6030) Check whether the XMLAgent certificate type matches the certificate file.
Level Normal
Result:
Check whether the value of HTTPSCertType in the XMLAgent configuration file certificate.ini matches the values of HTTPSkeyStore and HTTPStrustStore, and whether the certificate file exists.
Level Normal
Result:
(7005) Check whether the other user has the read permission on files in the /etc directory
Level Normal
Result:
Check whether the "other" user has the read permission for the following files. If the user does not have the read permission, a "No Go" message is reported. If the "/etc/hosts.YaST2save" file does not exist, the file is not checked.
/etc/hosts
/etc/hostname
/etc/networks
/etc/hosts.YaST2save
/etc/resolv.conf
(7022) Check whether multiple default routes are configured for the OS
Level Normal
Result:
Level Normal
Result:
1. A check is added for temporary routes before upgrade to prevent route loss after the server restarts. If temporary routes exist, a "No Go" message is reported.
2. A check is added for temporary routes during routine check and after upgrade to prevent route loss after the server restarts. If temporary routes exist, a "No Go" message is reported.
Level Normal
Result:
Run the 'df -i' command to check whether the partition inode usage exceeds the system limit.
If the usage is greater than or equal to 60%, a risk is reported. If the usage is greater than or equal to 80%, a problem is reported.
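The two inode thresholds can be illustrated as follows. This is a minimal sketch that assumes the usual six-column layout of 'df -i' output (Filesystem, Inodes, IUsed, IFree, IUse%, Mounted on); the function names and the sample output are hypothetical and are not taken from the tool.

```python
def parse_df_i(output):
    """Map mount point -> inode usage percentage from `df -i` text output."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Some file systems report "-" instead of a percentage; skip them.
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def classify_inode_usage(use_percent):
    """Apply the thresholds from the check description."""
    if use_percent >= 80:
        return "problem"
    if use_percent >= 60:
        return "risk"
    return "normal"


# Hypothetical sample output, for illustration only.
SAMPLE = """Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 1000 650 350 65% /
/dev/sda2 1000 900 100 90% /opt
tmpfs 0 0 0 - /run
"""

for mount, percent in parse_df_i(SAMPLE).items():
    print(mount, percent, classify_inode_usage(percent))
```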
Level Normal
Result:
If the "/etc/passwd" file does not meet the product requirements, some problems may occur. The specific check method is as follows:
If the "/etc/passwd" file does not exist, a "No Go" message is reported. Otherwise, perform the following check.
Run the "ls -l /etc/passwd" command to check whether the file permission and owner are valid.
If the read and write permissions for the "/etc/passwd" file are not -rw-r--r-- or the owner is not root, a "Risky" message is reported.
Check whether each line (except the comment line) in the "/etc/passwd" file contains 7 fields (including 6 English colons). If not, the format is invalid and a "Risky" message is reported.
If a full-width (Chinese) colon (:) is contained, the format is invalid and a "Risky" message is reported.
If the "/etc/passwd" file contains duplicate login names (the first field in each line) or a login name has an invalid format, a "No Go" message is reported.
If the "/etc/shadow" file exists, read the user list from the file. If some login names read from the "/etc/passwd" file are not contained in the "/etc/shadow" file, a "No Go" message is reported.
If the "/etc/shadow" file does not exist, a "No Go" message is reported.
(7028) Check whether the directories of the ossadm, ossuser, and dbuser users exist in the /home directory.
Level Normal
Result:
Check whether the ossadm, ossuser, and dbuser user directories exist in the /home directory. If the directories do not exist, the upgrade fails, the system becomes abnormal, and a "No Go" message is reported.
Level Normal
Result:
1. If "/" is an independent partition and the remaining space of "/" is less than 5 GB, a "No go" message is reported.
Impact: The disk space needs to be cleared in a timely manner. If the disk usage reaches 100%, the process cannot write files. As a result, an error is reported on the service I/O.
2. If "/opt" is an independent partition, estimate the remaining space using the following formula: max (20, Total space x 0.2 – 20) GB. If the actual remaining space is less than 5 GB, a "No go" message is reported. If the estimated remaining space is 20 GB and the actual remaining space is less than 20 GB, a "Risky" message is reported. If the estimated remaining space is greater than 20 GB and the actual remaining space is less than the estimated value, a "No go" message is reported.
3. If "/var" is an independent partition and the remaining space of "/var" is less than or equal to 2 GB or the usage of "/var" is greater than 80%, a "No go" message is reported.
Impact: The disk space needs to be cleared in a timely manner. If the disk usage reaches 100%, the process cannot write files. As a result, an error is reported on the service I/O.
4. If "/tmp" is an independent partition and the remaining space of "/tmp" is less than or equal to 2 GB, a "No go" message is reported.
Impact: The disk space needs to be cleared in a timely manner. If the disk usage reaches 100%, the process cannot write files. As a result, an error is reported on the service I/O.
5. The upgrade of the management plane depends on the available space of the /home/ossadm directory. If the available space is less than 250 MB, a "No Go" message is reported.
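The "/opt" rule in item 2 combines a formula with three conditions, so a worked sketch may help. The code below is an illustration under stated assumptions: sizes are given in GB, the conditions are evaluated in the order they are listed, and the function names are hypothetical.

```python
def estimate_opt_reserve_gb(total_gb):
    """Estimated remaining space for /opt: max(20, total x 0.2 - 20) GB."""
    return max(20.0, total_gb * 0.2 - 20.0)


def classify_opt_space(total_gb, free_gb):
    """Classify /opt free space using the rules from item 2."""
    estimated = estimate_opt_reserve_gb(total_gb)
    if free_gb < 5:
        return "No go"
    if estimated == 20 and free_gb < 20:
        return "Risky"
    if estimated > 20 and free_gb < estimated:
        return "No go"
    return "Normal"


# Worked examples with made-up partition sizes.
# 100 GB partition: 100 x 0.2 - 20 = 0, so the estimate is the 20 GB floor.
print(classify_opt_space(total_gb=100, free_gb=10))
# 500 GB partition: 500 x 0.2 - 20 = 80 GB estimated remaining space.
print(classify_opt_space(total_gb=500, free_gb=50))
print(classify_opt_space(total_gb=500, free_gb=90))
```

For partitions of up to 200 GB the estimate stays at the 20 GB floor, so only the "Risky" branch and the 5 GB limit apply to them.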
Level Normal
Result:
If the swap partition usage of the OS is too high, NCE runs slowly or some system commands time out. You can run the "free -m | grep Swap" command to check the swap partition usage. If the swap partition usage exceeds 85%, a "No Go" message is reported.
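The swap usage percentage can be derived from the "Swap:" line of the "free -m" output. The following is a minimal sketch that assumes the usual column order (total, used, free) and treats a system without swap as 0% usage; both points, the function names, and the sample line are assumptions made for illustration.

```python
def swap_usage_percent(free_output):
    """Return swap usage in percent from `free -m` text output."""
    for line in free_output.splitlines():
        if line.startswith("Swap:"):
            _label, total, used, _free = line.split()[:4]
            total, used = int(total), int(used)
            # Assumption: no swap configured counts as 0% usage.
            return 0.0 if total == 0 else used * 100.0 / total
    raise ValueError("no Swap line found")


def classify_swap(free_output, threshold=85.0):
    """Report "No Go" when usage exceeds the 85% threshold."""
    return "No Go" if swap_usage_percent(free_output) > threshold else "Normal"


# Hypothetical sample line, for illustration only.
SAMPLE = "Swap:          8000        7200         800"
print(swap_usage_percent(SAMPLE), classify_swap(SAMPLE))
```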
(7045) Check whether the OS user has non-root privileged accounts and whether users with duplicate UIDs exist
Level Normal
Result:
1. Each OS user must have a unique UID. The user with the UID of 0 is the "root" user. If the UID of a non-root user is 0, the user will run OS commands as the "root" user, which may cause unauthorized operations. In this case, a "Risky" message is reported.
2. If OS users have duplicate UIDs, the UIDs may have been maliciously modified, which may cause unauthorized operations. In this case, a "Risky" message is reported.
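Both UID conditions can be checked in a single pass over "/etc/passwd"-style text. This is a minimal sketch, not the tool's code: the function name and the sample accounts are hypothetical.

```python
from collections import defaultdict


def check_uids(passwd_text):
    """Return (non-root names with UID 0, {uid: names} for duplicated UIDs)."""
    names_by_uid = defaultdict(list)
    for line in passwd_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # malformed line; format checks are a separate item
        names_by_uid[fields[2]].append(fields[0])
    privileged = [name for name in names_by_uid.get("0", []) if name != "root"]
    duplicates = {uid: names for uid, names in names_by_uid.items() if len(names) > 1}
    return privileged, duplicates


# Made-up accounts, for illustration only.
SAMPLE = """root:x:0:0:root:/root:/bin/bash
toor:x:0:0::/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1000:1000::/home/bob:/bin/bash
"""

privileged, duplicates = check_uids(SAMPLE)
print("Risky" if privileged or duplicates else "Normal", privileged, duplicates)
```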
Level Normal
Result:
Level Normal
Result:
1. The /var/log directory is required for printing logs. If the directory does not exist, the permission is incorrect, or the owner group is incorrect, the log printing is unavailable.
Check whether the permission on /var and /var/log is 755 and whether the owner and owner group are root:root.
2. Check the remaining space of the /var/log directory. If the remaining space is less than or equal to 1 GB, a "Risky" message is reported. If the remaining space is less than or equal to 500 MB, a "No go" message is reported.
Level Normal
Result:
Level Normal
Result:
1. Run the "df ${zenith_install}/data/" command to check whether the Zenith database partition is mounted. If the command output contains at least one record, the partition has been mounted. Otherwise, a "No go" message is reported.
2. If a mount point exists but is not configured in the "/etc/fstab" file, a "No go" message is reported.
(Note: "${zenith_install}" is the actual Zenith installation directory.)
Level Normal
Result:
Check whether the LSNR_PORT and REPL_PORT ports of the ${ZENITH_INSTALL}/data/*/cfg/zengine.ini configuration are monitored.
Take the ${ZENITH_INSTALL}/data/cloudsopdbsvr-3-0/cfg/zengine.ini file as an example:
1. View the LSNR_PORT and REPL_PORT ports configured by ${ZENITH_INSTALL}/data/cloudsopdbsvr-3-0/cfg/zengine.ini. The results are as follows:
LSNR_PORT = 32080
REPL_PORT = 26950
2. Run the ss '( sport = :32080 )' -pl -t |awk -F 'pid=' '{print $2}'|awk -F ',' '{print $1}'|grep -v '^s*$'|uniq command to query the process ID of the LSNR_PORT and REPL_PORT ports, respectively. (In the command, 32080 is the port number obtained in step 1.) The command output is as follows:
279858
3. Run the ps -ef|grep 279858 | grep -v grep |awk "{if(\$2==279858) {print \$NF}}" command. (In the command, 279858 is the process ID obtained in step 2.) The command output is as follows:
${ZENITH_INSTALL}/data/cloudsopdbsvr-3-0
If the configuration file in step 1 is not in the instance directory queried in step 3, a "No go" message is reported.
(Note: ${ZENITH_INSTALL} indicates the installation path of the Zenith database.)
Level Normal
Result:
Level Normal
Result:
Level Normal
Result:
To check whether the Zenith database process is started, perform the following steps:
1. Run the 'ls ${ZENITH_INSTALL}/data' command to check all instance names of the Zenith database on the node.
2. Run the 'ps -ef | grep zengine' command to check the started database instance process. If the instance name is not displayed in the command output, the instance is not started and a 'No go' message is reported.
(Note: ${ZENITH_INSTALL} is the actual installation directory of the Zenith database.)
Level Normal
Result:
1. Check whether the /etc/hosts file exists. If it does not exist, a "No go" message is reported.
2. Read the contents of the file /etc/hosts.
In an IPv4 environment:
If the IP address "127.0.0.1" or the "localhost" field does not exist in the file contents, a "No go" message is reported.
In an IPv6 environment:
If the file contents do not contain "::1" or the "localhost" field does not exist, a "No go" message is reported.
In a hybrid networking environment, one of the preceding conditions must be met. Otherwise, a "No go" message is reported.
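The three environment cases can be sketched as follows. This illustration reads the rule literally: the loopback address and the "localhost" name must each appear somewhere in the file, not necessarily on the same line. The mode parameter and the function names are hypothetical.

```python
def hosts_entries(hosts_text):
    """Parse hosts-file text into (address, [names]) tuples, ignoring comments."""
    entries = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        entries.append((fields[0], fields[1:]))
    return entries


def check_hosts(hosts_text, mode="ipv4"):
    """Return "Normal" or "No go" for mode "ipv4", "ipv6", or "hybrid"."""
    entries = hosts_entries(hosts_text)
    addresses = {address for address, _names in entries}
    has_localhost = any("localhost" in names for _address, names in entries)
    ipv4_ok = "127.0.0.1" in addresses and has_localhost
    ipv6_ok = "::1" in addresses and has_localhost
    if mode == "ipv4":
        ok = ipv4_ok
    elif mode == "ipv6":
        ok = ipv6_ok
    else:  # hybrid: one of the two conditions is enough
        ok = ipv4_ok or ipv6_ok
    return "Normal" if ok else "No go"


print(check_hosts("127.0.0.1 localhost\n10.0.0.182 NMS-Server\n"))
```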
(7146) Check whether the Zenith database device files are missing
Level Normal
Result:
(7147) Check whether the Zenith database user password has expired
Level Normal
Result:
(7157) Check DBAgent Status
Level Normal
Result:
Run the following command on the management node as the ossadm user: ps -ef |grep "DNFW=dbagent" |grep -v grep|wc -l. If the nodes are deployed in active/standby mode, run the command on both the active and standby nodes.
If 1 is displayed in the command output, the service is normal. If 0 is displayed in the command output, the service is abnormal and a "No Go" message is reported.
If the DBAgent service is abnormal, the following faults may occur: The database instances fail to be updated. The status of the master and slave database instances cannot be monitored. The management plane fails to be upgraded.
Level Normal
Result:
1. Check whether the IP address and subnet mask of the NIC are correctly configured.
Run the "cat $OSS_ROOT/manager/var/agent/mcagentid.conf |awk -F= '{print $2}'" command as the "ossadm" user to obtain the current node ID. Locate the NIC information in the "$OSS_ROOT/manager/etc/sysconf/nodelists.json" configuration file based on the obtained node ID. Check whether the IP address and subnet mask of the NIC are consistent with the NIC information obtained by running the "/sbin/ifconfig" command. (The floating IP address is not checked.)
2. Check whether the NIC is normal.
Run the "/sbin/ifconfig" command as the "ossadm" user to check the status of all configured NICs. If the NIC status is "UP" and "RUNNING", the NIC is normal. (The check is not required for the lo NIC.)
3. Check whether the IP address of the NIC can be pinged.
The IP addresses of all NICs are pinged except the lo NIC. It is normal if the packet loss rate of each IP address is not 100%.
Run the "ping -c 3 IPV4" command for IPV4.
Run the "ping -6 -c 3 IPV6" command for IPV6 on Euler OS.
Run the "ping6 -c 3 IPV6" command for IPV6 on SUSE OS.
If any of the preceding conditions is not met, a "No go" message is reported.
Impact: If the NIC is faulty, the communication in NCE is abnormal, affecting NCE functions. Confirm the usage of the NIC and evaluate the affected functions based on the NIC information planned during the installation.
Note: "$OSS_ROOT" is the actual installation directory of NCE.
Level Normal
Result:
If the directory contains spaces, the upgrade or patch installation may fail. Traverse the level-1 files and folders in the "$OSS_ROOT", "$OSS_ROOT/manager/apps", and "$OSS_ROOT/NCE/apps" directories. If the directory contains spaces, a "No go" message is reported.
Note: "$OSS_ROOT" is the actual installation directory of NCE.
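The traversal of level-1 entries can be sketched as follows. This is an illustration only: it interprets "contains spaces" as a space character in the name of a level-1 file or folder, and the default path /opt/oss used when the OSS_ROOT environment variable is unset is an assumption. The function name is hypothetical.

```python
import os


def level1_names_with_spaces(directory):
    """Return the sorted level-1 entry names in `directory` that contain a space."""
    try:
        return sorted(name for name in os.listdir(directory) if " " in name)
    except FileNotFoundError:
        return []  # a missing directory has nothing to report in this sketch


if __name__ == "__main__":
    # Assumption: fall back to /opt/oss when OSS_ROOT is not set.
    oss_root = os.environ.get("OSS_ROOT", "/opt/oss")
    for directory in (
        oss_root,
        os.path.join(oss_root, "manager", "apps"),
        os.path.join(oss_root, "NCE", "apps"),
    ):
        offenders = level1_names_with_spaces(directory)
        print(directory, "No go" if offenders else "Normal", offenders)
```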
Level Normal
Result:
Operations will be affected if NCE-related users are locked in the OS. Therefore, check whether the OS users are locked.
Check whether the root, ossuser, ossadm, ftpuser users exist. If the users exist, run the passwd -S user command to check the user status. If the second column is not PS (EulerOS) or P (SUSE), an error is reported.
Level Normal
Result:
Check the availability of the zsql tool. Check method: switch to the database user and run the "which zsql" command. If no result is returned, a "No Go" message is reported. If multiple records are returned, a "Risky" message is reported.
Level Normal
Result:
Check whether the database device files are placed in the NCE directory. Check method: execute the SQL statement "SELECT * FROM ADM_DATA_FILES" to obtain all device file paths. If a path is in the NCE directory, a "No Go" message is reported.
Level Normal
Result:
Run the sysctl kernel.randomize_va_space command as the root user and check whether the value of kernel.randomize_va_space is 2. If the value is 2, the check item is passed. If the value is not 2, a "No go" message is reported. Run the umask command as the ossadm user to check whether the value of umask is 0027. If the value is not 0027, a "No go" message is reported.
Level Normal
Result:
Run the bash /etc/profile command as the root user and check whether the command output contains "readonly variable". If yes, a "Risky" message is reported.
Level Normal
Result:
Level Normal
Result:
Level Normal
Result:
Check the non-commented rows (rows that do not start with '#') in the '/etc/ssh/sshd_config' file:
Make sure all 'PasswordAuthentication' lines are configured as yes. Otherwise, an error is reported.
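The rule above amounts to a scan of the non-commented lines. The following is a minimal sketch under stated assumptions: keywords are matched case-insensitively, only whitespace-separated "keyword value" lines are handled, and a file without any PasswordAuthentication line is treated as passing. The function name is hypothetical.

```python
def password_auth_ok(sshd_config_text):
    """Return True if every non-commented PasswordAuthentication line is 'yes'."""
    values = []
    for line in sshd_config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split(None, 1)
        if parts[0].lower() == "passwordauthentication":
            values.append(parts[1].strip().lower() if len(parts) > 1 else "")
    # Assumption: no matching line at all counts as passing.
    return all(value == "yes" for value in values)


# Made-up configuration text, for illustration only.
SAMPLE = """# PasswordAuthentication no
Port 22
PasswordAuthentication yes
"""
print("Normal" if password_auth_ok(SAMPLE) else "Error")
```

Because the rule covers all lines, a 'PasswordAuthentication no' inside a Match block also fails this sketch.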
Level Normal
Result:
Level Normal
Result:
Check whether the permissions on the following key system files or paths are correct:
1. If the file does not exist, an error message is reported. If the permission of the communication-related file is incorrect, an error message is reported. If the permission of the non-communication-related file is incorrect, a risk message is reported.
2. If the path or file is a link, skip the file.
3. Each entry has three values. The first value is the minimum permission, the second value is the standard permission, and the third value indicates whether the permission is fixed. (NO indicates no fixed permission, and YES indicates fixed permission.)
If the permission on the communication file path in the system is not the standard permission, an error is reported. In this case, you are advised to change the permission to the standard permission. If the permission on the path is lower than the minimum permission, a risk is reported. You are advised to change the permission to the standard permission.
4. If the permission is all '-', only the existence of the path is checked.
File2Permission = {
'/tmp': ['rwxrwxrwt', 'ug=rwx,o=rwt','NO'],
'/home': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/opt': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/var': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/var/tmp': ['rwxrwxrwt', 'ug=rwx,o=rwt','NO'],
'/etc': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/etc/passwd': ['rw-r--r--', 'u=rw,go=r','NO'],
'/etc/group': ['rw-r--r--', 'u=rw,go=r','NO'],
'/etc/shadow': ['---------', 'u=-,go=-','NO'],
'/usr': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/sbin': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
'/usr/sbin': ['r-xr-xr-x', 'u=rwx,go=rx','NO'],
'/usr/bin': ['r-xr-xr-x', 'u=rwx,go=rx','NO'],
'/bin': ['rwxr-xr-x', 'u=rwx,go=rx','NO'],
${OSS_ROOT}: ['rwxr-x---', 'u=rwx,g=rx,o=-','NO'],
'/lib': ['r-xr-xr-x', 'a=rx','NO'],
'/usr/lib': ['r-xr-xr-x', 'a=rx','NO'],
'/lib64': ['r-xr-xr-x', 'a=rx','NO'],
'/usr/lib64': ['r-xr-xr-x', 'a=rx','NO'],
'/boot': ['---------', 'u=-,go=-','NO'],
'/etc/sysconfig/network-scripts/': ['---------', 'u=-,go=-','NO']('/etc/sysconfig/network' in SUSE OS),
'/opt/oss/NCE': ['rwxr-x---', 'u=rwx,g=rx,o=-', 'NO'],
'/opt/oss/manager': ['rwxr-x---', 'u=rwx,g=rx,o=-', 'NO'],
'/opt/oss/envs': ['rwxr-x---', 'u=rwx,g=rx,o=-', 'NO'],
'/opt/oss/rtsp': ['rwxr-x---', 'u=rwx,g=rx,o=-', 'NO'],
'/home/ossadm/.ssh': ['rwx------', 'u=rwx,go=-','YES']
}
Level Normal
Result:
Check the system handle configuration. For the users in the list ['dbuser', 'ossadm', 'ossuser']: Run the su - User name -c'ulimit -n' command as the root user. If the echo value is less than 65535, report an error.
Level Normal
Result:
For users or groups that exist in the system, run the following commands one by one:
getent passwd | awk 'BEGIN{FS=":"}{ if(($3==3001)&&($1=="ossadm")) print $0}' | grep -v "^#"
getent passwd | awk 'BEGIN{FS=":"}{ if(($3==4001)&&($1=="ftpuser")) print $0}' | grep -v "^#"
getent passwd | awk 'BEGIN{FS=":"}{ if(($3==3004)&&($1=="ossuser")) print $0}' | grep -v "^#"
getent passwd | awk 'BEGIN{FS=":"}{ if(($3==3002)&&($1=="dbuser")) print $0}' | grep -v "^#"
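The four commands pair each NCE user name with a specific UID. The report does not state how their output is interpreted, so the following sketch rests on an assumption: a conflict exists when one of these users has a different UID, or when one of these UIDs belongs to a different user. The UID values are taken from the commands above; the function name and the sample accounts are hypothetical.

```python
# User name -> UID pairs taken from the getent/awk commands above.
EXPECTED_UIDS = {"ossadm": 3001, "ftpuser": 4001, "ossuser": 3004, "dbuser": 3002}


def uid_conflicts(passwd_text):
    """List conflicts between passwd entries and the expected name/UID pairs."""
    conflicts = []
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        name, uid = fields[0], int(fields[2])
        if name in EXPECTED_UIDS and uid != EXPECTED_UIDS[name]:
            conflicts.append(f"{name} has UID {uid}, expected {EXPECTED_UIDS[name]}")
        elif name not in EXPECTED_UIDS and uid in EXPECTED_UIDS.values():
            conflicts.append(f"UID {uid} is occupied by {name}")
    return conflicts


# Made-up accounts, for illustration only.
SAMPLE = """ossadm:x:3001:3001::/home/ossadm:/bin/bash
alice:x:3004:100::/home/alice:/bin/bash
dbuser:x:5000:5000::/home/dbuser:/bin/bash
"""
print(uid_conflicts(SAMPLE))
```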
Level Normal
Result:
For usernames that exist both in the list ['dbuser', 'ossadm', 'ossuser', 'root'] and in the environment (obtained by running the cat /etc/passwd command): Run the su - User name -c'echo HERE' command. If HERE is contained in the command output, the user is normal. Otherwise, an error is reported.
Level Normal
Result:
Check system services: ['crond', 'sshd', 'haveged'] (The crond service's name in SUSE is cron). Perform the following operations:
1. Run the systemctl is-active service command. If the output is not active, a problem is reported.
2. Run the systemctl is-enabled service command. If the output is not enabled, a risk is reported.
3. Run the sshd -T command to check whether the /etc/ssh/sshd_config file contains incorrect configurations. If any error information is displayed, a problem is reported.
Level Normal
Result:
Check whether the following system files exist. If the files do not exist, an error is reported.
/var/run/utmp, /var/log/wtmp, /var/log/lastlog
Level Normal
Result:
If the system user passwords have validity periods, some problems may occur after the passwords expire. The check method is as follows:
For users in both the system and the list ['dbuser', 'ossadm', 'ossuser', 'webuser','root', 'ftpuser', 'arbiter', 'omm', 'ommdba']:
1. Run the chage -l "$username" | grep -i "^password expires" | awk -F':' '{print$2}' command to check whether the passwords of the following system users have validity periods:
2. If the command output contains never, the validity period of the password is not limited. Otherwise, a risk is reported.
Level Normal
Result:
Check method:
Run the following command to check the owner groups of the dbuser user:
# groups dbuser
The following owner groups are contained: ossgroup and dbgroup
Run the following command to check the owner groups of the ossuser user:
# groups ossuser
The following owner groups are contained: ossgroup, dbgroup and sopgroup
Run the following command to check the owner groups of the ossadm user:
# groups ossadm
The following owner groups are contained: ossgroup, dbgroup, wheel and sopgroup
Otherwise, a "No go" message is reported.
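The group membership rule can be expressed as a subset test. This is a minimal sketch that assumes the "groups <user>" output has the form "user : group1 group2 ..." and that additional groups beyond the listed ones are acceptable; the function name is hypothetical.

```python
# Required owner groups per user, taken from the check method above.
REQUIRED_GROUPS = {
    "dbuser": {"ossgroup", "dbgroup"},
    "ossuser": {"ossgroup", "dbgroup", "sopgroup"},
    "ossadm": {"ossgroup", "dbgroup", "wheel", "sopgroup"},
}


def missing_groups(user, groups_output):
    """Return the sorted required groups that are absent from `groups` output."""
    text = groups_output.strip()
    if ":" in text:
        # Drop the "user :" prefix printed by `groups <user>`.
        text = text.split(":", 1)[1]
    actual = set(text.split())
    return sorted(REQUIRED_GROUPS[user] - actual)


# Made-up command output, for illustration only.
missing = missing_groups("ossadm", "ossadm : ossgroup dbgroup sopgroup")
print("No go" if missing else "Normal", missing)
```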
(7195) Check the owner and group of key file and directory
Level Normal
Result:
Level Normal
Result:
Check results are normal.
Check the IP addresses configured in /etc/resolv.conf one by one. If any IP address is unavailable, an error is reported. If the file does not exist, it is considered normal.
(7203) Check if non-NCE default components are installed in the Euler system
Level Normal
Result:
Only check the Euler system. Run the "rpm -qa" command to query the installed components and compare the installed components with the default components. If a non-NCE default component exists, a "Risky" message is reported.
Level Normal
Result:
Run the chage -l sopuser and chage -l ossadm commands to check the password expiration date of the sopuser and ossadm users, respectively. If any password has expired or will expire in less than 7 days, a "No go" message is reported. If any password will expire in less than 30 days, a "Risky" message is reported.
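The two expiry thresholds can be sketched as follows. The code assumes that the "Password expires" field of the chage -l output is either "never" or an English-locale date such as "Jun 20, 2023", and that "never" counts as normal; both are assumptions, and the function name is hypothetical.

```python
from datetime import date, datetime


def classify_password_expiry(expires_field, today):
    """Classify the 'Password expires' value taken from `chage -l` output."""
    value = expires_field.strip()
    if value.lower() == "never":
        return "Normal"  # assumption: no expiry date means nothing to report
    expires = datetime.strptime(value, "%b %d, %Y").date()
    days_left = (expires - today).days
    if days_left < 7:  # also covers passwords that have already expired
        return "No go"
    if days_left < 30:
        return "Risky"
    return "Normal"


# Example dates are made up; the reference day matches the report date.
today = date(2023, 6, 5)
for field in ("Jun 08, 2023", "Jun 25, 2023", "Dec 31, 2023", "never"):
    print(field, "->", classify_password_expiry(field, today))
```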
Level Normal
Result:
Run the /etc/profile, /home/ossadm/.bash_profile, and /home/ossadm/.profile scripts to check whether the PS1 environment variable ends with $. If the PS1 environment variable does not end with $, a "No Go" message is reported.
Level Normal
Result:
Check whether the total size of the smpmanagerservicedb database exceeds 20 GB and whether the size of a single table in the database exceeds 2 GB. If the database is too large or the database contains a large table, the backup before an upgrade takes a long time.
Perform the following operations to check whether the preceding problems exist:
1. Log in to the smpmanagerservicedb database as the readdbuser user.
2. Run the following SQL statement to check whether the total size of the database exceeds 20 GB: "select round(BYTES/power(1024,3),2) AS RESULT FROM ADM_DATA_FILES WHERE TABLESPACE_NAME='SMPMANAGERSERVICEDB';" If the size exceeds 20 GB, a "No Go" message is reported.
3. Run the following SQL statement to query tables whose size exceeds 2 GB: "select table_name,round(bytes/power(1024,3),2) from (select a.TABLE_NAME as table_name,a.BYTES+b.BYTES as bytes from (select TABLE_NAME, BYTES from ADM_TABLES where OWNER = 'SMPMANAGERSERVICEDB') a left join (select TABLE_NAME, sum(BYTES) as BYTES from ADM_INDEXES where OWNER = 'SMPMANAGERSERVICEDB' group by TABLE_NAME) b on a.TABLE_NAME = b.TABLE_NAME) where BYTES > 2*1024*1024*1024;" If some data is displayed, a "No Go" message is reported.
4. If some data is displayed in step 2 but no data is displayed in step 3, run the following SQL statement to query the top 3 large tables: "select table_name,round(bytes/power(1024,3),2) from (select a.TABLE_NAME as table_name,a.BYTES+b.BYTES as bytes from (select TABLE_NAME, BYTES from ADM_TABLES where OWNER = 'SMPMANAGERSERVICEDB') a left join (select TABLE_NAME, sum(BYTES) as BYTES from ADM_INDEXES where OWNER = 'SMPMANAGERSERVICEDB' group by TABLE_NAME) b on a.TABLE_NAME = b.TABLE_NAME) order by BYTES desc limit 3;".
Level Normal
Result:
1. Check whether the sizes of the directories used by the DFS service exceed 2 GB. If any of the directories exceeds 2 GB, a "Risky" message is reported and the directories that exceed 2 GB are listed. The directories to be checked are as follows:
/${opt.dir}/${oss.dir}/share/manager/SMPAgentService/zipResultDir/logcollect
/${opt.dir}/${oss.dir}/share/manager/SMPManagerService/zipResultDir/logcollect
/${opt.dir}/${backup.dir}/smp/logdata/logcollect
/${opt.dir}/${backup.dir}/smp/logdata/errorslicelog
Impact: If a large directory exists, the backup before the upgrade takes a long time.
2. Check the size of the /opt/zenith/data/${dbInstance} directory (excluding the data directory and archive_log directory). If the size exceeds 10 GB, a "Risky" message is reported. Check the size of the /opt/gauss/data/${dbInstance} directory (excluding the base directory and archive_log directory). If the size exceeds 10 GB, a "Risky" message is reported.
3. Check the size of the /opt/oss/share directory. If the size exceeds 30 GB, a "Risky" message is reported.
4. Check the size of subdirectories in the /opt/oss directory, excluding the envs, rtsp, log and share directories and the directories that are not backed up in the staticapp_NCE.json file. If the size of any subdirectory exceeds 10 GB, a "Risky" message is reported.
5. Check the size of the following directories. If the size of any directory exceeds 5 GB, a "Risky" message is reported.
Directory where the / partition is located
Directory where the /usr partition is located
Directory where the /var partition is located
Directory where the /boot partition is located
Directory where the /home partition is located
6. Check whether the inode usage of the /opt partition exceeds 1 million, whether the inode usage of the /usr partition exceeds 20%, and whether the inode usage of other partitions exceeds 5%. If yes, traverse all partitions (except the partitions whose file system is tmpfs/devtmpfs) to check the number and size of files. Three directory levels are traversed by default (the number of levels to be traversed can be set). If the number of files in any directory exceeds 1000 or any file size exceeds 500 MB, a "Risky" message is reported.
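A directory size threshold of this kind can be reproduced manually with a short script. The following Python sketch is an assumption about how such a check could be written, not the tool's own code: the helper names are hypothetical, and it does not handle the excluded subdirectories or the inode checks.

```python
import os

GIB = 1024 ** 3  # bytes per GB as used by the thresholds (2 GB, 5 GB, 10 GB, 30 GB)


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of regular files under path without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total


def oversized_directories(paths, limit_bytes):
    """Return the existing directories in paths whose size exceeds limit_bytes."""
    return [p for p in paths
            if os.path.isdir(p) and directory_size_bytes(p) > limit_bytes]
```

For instance, oversized_directories(["/opt/oss/share"], 30 * GIB) returns a non-empty list when the share directory would be reported as "Risky".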
Level Normal
Result:
1. Obtain the number of backup servers configured on the Backup and Restore > Configure Backup Parameters page of the management plane.
2. If more than one backup server is configured in the single-management-plane scenario, or more than two backup servers are configured in the distributed scenario, a "Warning" message is reported. If no backup server is configured, a "No Go" message is reported.
3. Check whether the system can connect to the backup server. If the connection fails, a "No Go" message is reported.
4. Check that the permission on and owner of the /opt/backup directory are 750 and root:ossgroup, respectively.
Check that the permission on and owner of the /opt/backup/backuptmp directory are 750 and ossadm:ossgroup, respectively, and that no ACL permission is configured for the directory. If the preceding directory permission and owner requirements are not met, a "No go" message is reported.
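The permission and owner requirements can be verified with a few lines of Python. This is a sketch under stated assumptions: check_directory is a hypothetical helper, and it does not inspect ACLs, which would need a separate tool such as getfacl.

```python
import grp
import os
import pwd
import stat


def _user_name(uid: int) -> str:
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)  # UID has no passwd entry


def _group_name(gid: int) -> str:
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)  # GID has no group entry


def check_directory(path, expected_mode, expected_owner=None, expected_group=None):
    """Return a list of mismatches between path and the expected mode/owner/group."""
    problems = []
    st = os.stat(path)
    actual_mode = stat.S_IMODE(st.st_mode)
    if actual_mode != expected_mode:
        problems.append(f"mode is {actual_mode:o}, expected {expected_mode:o}")
    if expected_owner is not None and _user_name(st.st_uid) != expected_owner:
        problems.append(f"owner is {_user_name(st.st_uid)}, expected {expected_owner}")
    if expected_group is not None and _group_name(st.st_gid) != expected_group:
        problems.append(f"group is {_group_name(st.st_gid)}, expected {expected_group}")
    return problems
```

A call such as check_directory("/opt/backup", 0o750, "root", "ossgroup") returns an empty list when the directory meets the requirements.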
Level Normal
Result:
Run the # cat /etc/sudoers | grep "secure_path" command to check whether the secure_path configuration item is missing from the sudoers configuration file. If it is missing, a problem is reported.
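The same test can be expressed in Python against the file content. This sketch is slightly stricter than the grep shown above, since it ignores commented-out lines; that stricter behavior and the function name has_secure_path are assumptions made for illustration.

```python
def has_secure_path(sudoers_text: str) -> bool:
    """True if an active (non-comment) Defaults line sets secure_path."""
    for line in sudoers_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out entries do not count
        if stripped.startswith("Defaults") and "secure_path" in stripped:
            return True
    return False
```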
Level Normal
Result:
The proper running of distributed NCE depends on the time consistency between nodes. The Chrony/NTP service of the system is used to ensure the time consistency. Perform the following operations to check the service status:
1. Run the "service chronyd status" command to check the status of the Chrony service. If the Chrony service is not in the running state, run the "service ntpd status" command to check the status of the NTP service. If neither the Chrony service nor the NTP service is in the running state, a "No go" message is reported.
2. If the Chrony service is enabled, run the "chronyc sources" command to view the information about the Chrony service. If the queried information starts with an asterisk (*) or plus sign (+) and the value of last in the information is less than 60s, the Chrony service is normal. Otherwise, the service is abnormal and a "No go" message is reported.
3. If the NTP service is enabled, run the "ntpq -pn" command to view the information about the NTP service. If the queried information starts with an asterisk (*) or plus sign (+) and the value of offset in the information is less than 60000, the NTP service is normal. Otherwise, the service is abnormal and a "No go" message is reported.
Impact: If the NTP service is abnormal, the time is inconsistent between NCE nodes. As a result, the system functions cannot work properly, and upgrades and patch installations will fail.
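The criteria for the "chronyc sources" and "ntpq -pn" output can be sketched as two parsers. The following Python is illustrative only and rests on several assumptions: the source state is read from the second character of the first chronyc column (the M and S columns), the "last" value is taken to be the LastRx column, one healthy source is treated as sufficient, and the ntpq offset is compared by absolute value. All function names are hypothetical.

```python
import re

_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400, "y": 31536000}


def last_rx_seconds(field):
    """Convert a LastRx field such as '23', '10m' or '2h' to seconds; None for '-'."""
    match = re.fullmatch(r"(\d+)([smhdy]?)", field)
    if not match:
        return None
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def chrony_is_healthy(sources_output, max_age_s=60):
    """True if a source marked * or + was last sampled under max_age_s seconds ago."""
    for line in sources_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or len(fields[0]) != 2:
            continue  # header, separator, or unexpected line
        if fields[0][1] not in "*+":
            continue
        age = last_rx_seconds(fields[5])
        if age is not None and age < max_age_s:
            return True
    return False


def ntp_is_healthy(peers_output, max_offset_ms=60000.0):
    """True if a peer marked * or + has an absolute offset below max_offset_ms."""
    for line in peers_output.splitlines():
        if not line or line[0] not in "*+":
            continue
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        try:
            offset = float(fields[8])  # columns: remote refid st t when poll reach delay offset jitter
        except ValueError:
            continue
        if abs(offset) < max_offset_ms:
            return True
    return False
```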
(7266) Check whether the mounting configuration of key system partitions in the /etc/fstab file is correct.
Level Normal
Result:
1. Check whether the mounting configuration of the /, /home, /opt, /usr, /var, and /boot partitions in the /etc/fstab file is correct. A risk is reported if the mounting configuration of a partition is missing or commented out, if the device path to which the partition is mounted does not exist, or if the mounting configuration of a partition is duplicated.
2. Check whether the device path mounted to each partition exists in the /etc/fstab file. If the path does not exist, a risk is reported.
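The missing, commented-out, and duplicate cases can be detected by parsing the fstab text. The Python below is a sketch, not the tool's logic: check_fstab is a hypothetical name, and the device path existence test is omitted because it depends on the live system.

```python
REQUIRED_MOUNTS = ("/", "/home", "/opt", "/usr", "/var", "/boot")


def check_fstab(fstab_text, required=REQUIRED_MOUNTS):
    """Map each problematic mount point to 'missing', 'commented out' or 'duplicate'."""
    active = {}
    commented = set()
    for raw in fstab_text.splitlines():
        line = raw.strip()
        is_comment = line.startswith("#")
        fields = line.lstrip("#").split()
        if len(fields) < 2:
            continue
        mount_point = fields[1]  # second fstab field is the mount point
        if mount_point not in required:
            continue
        if is_comment:
            commented.add(mount_point)
        else:
            active[mount_point] = active.get(mount_point, 0) + 1
    problems = {}
    for mount_point in required:
        count = active.get(mount_point, 0)
        if count > 1:
            problems[mount_point] = "duplicate"
        elif count == 0:
            problems[mount_point] = (
                "commented out" if mount_point in commented else "missing"
            )
    return problems
```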
(7267) Check the owner group of files in the Zenith database directory
Level Normal
Result:
Check all the .ini files in the ${ZENITH_INSTALL} directory and all subdirectories and files in the ${ZENITH_INSTALL}/app directory. If the owner of any of them is not dbuser or the group is not dbgroup, a "No Go" message is reported. (Note: ${ZENITH_INSTALL} is the actual installation path of Zenith.)
Level Normal
Result:
This check item checks the environment variables of the ossadm, ossuser, and dbuser users.
1. Run the following command as the ossadm and ossuser users, respectively. If an error is reported, a "No go" message is reported.
#sudo -i -u ${user} bash -c "env"
If the ${zenith_install}/app directory exists, perform the preceding check as the dbuser user as well.
2. If the ${zenith_install}/app directory exists, run the following command as the ossadm user:
#sudo -i -u dbuser bash -c "env |grep GSDB_HOME"
If the command output does not contain "GSDB_HOME=${zenith_install}/app", a "No go" message is reported.
3. As the ossadm and ossuser users, respectively, check whether the $HOME/.bash_profile and $HOME/.bashrc files contain the same environment variable with different values (except PATH).
For example, if both "A=xxx" and "A=yyy" exist, a "Risky" message is reported.
If the ${zenith_install}/app directory exists, perform the preceding check as the dbuser user as well.
Impact: If the environment variables are unavailable when a user logs in to the system, NCE application functions are abnormal. For example, an application fails to start or a function does not take effect.
(Note: ${zenith_install} indicates the installation path of the Zenith database, and ${user} indicates the actual user. Replace them with the actual values before running the commands.)
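The conflicting-variable rule (the same name assigned different values, PATH excepted) can be sketched in Python. This is an illustrative assumption about the comparison, not the tool's code: values are compared as raw text, so "xxx" and xxx with quotes would count as different, and the function names are hypothetical.

```python
import re

_ASSIGN = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def parse_assignments(text):
    """Map each variable name to the set of values assigned to it in text."""
    values = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comments
        match = _ASSIGN.match(line)
        if match:
            values.setdefault(match.group(1), set()).add(match.group(2).strip())
    return values


def conflicting_variables(*file_texts, ignore=("PATH",)):
    """Return sorted names that receive more than one distinct value across files."""
    merged = {}
    for text in file_texts:
        for name, vals in parse_assignments(text).items():
            merged.setdefault(name, set()).update(vals)
    return sorted(
        name for name, vals in merged.items()
        if name not in ignore and len(vals) > 1
    )
```

Passing the contents of .bash_profile and .bashrc for one user returns the variable names that would trigger a "Risky" message.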
Level Normal
Result:
In the installation directory of the microservices, run the "find . -xtype l" command to check whether invalid soft links exist. If the query result is empty, the check is normal. Otherwise, the check fails.
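The same search for dangling links can be written in Python where find is not convenient. This is a minimal sketch that mirrors what "find . -xtype l" reports (symbolic links whose target does not exist); the function name find_broken_symlinks is hypothetical.

```python
import os


def find_broken_symlinks(root):
    """Return the sorted paths under root that are symlinks to missing targets."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(full) and not os.path.exists(full):
                broken.append(full)
    return sorted(broken)
```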
Level Normal
Result:
Run the necessary operating system commands to check whether they exist on the system.
Level Normal
Result:
Checking the Owners and Permissions of the Directories Affecting the Upgrade
1. The following directories must not contain files or directories owned by root. ${TENANT} indicates the tenant name.
${opt.dir}/oss/manager
${opt.dir}/pub/software
${opt.dir}/oss/tmp
${opt.dir}/oss/share
${opt.dir}/oss/envs
${opt.dir}/oss/${TENANT}/etc/ssl
${opt.dir}/oss/${TENANT}/etc
3. Check the owners of the microservice installation directory, share directory, and log directory.
Detection by microservice granularity:
a) Obtain the microservice list: subdirectory of ${opt.dir}/oss/${TENANT}/apps.
b) Take the value of RUN_AS_USER in ${opt.dir}/oss/${TENANT}/apps/XX microservice/envs/env.properties as the running user of the microservice.
c) Check whether the owners of the files and directories in the installation directory and share directory are the same as the running user.
Installation directory: ${opt.dir}/oss/${TENANT}/apps/XX microservice/ (You must search for the installation directory of the microservice on the service plane with the trailing slash.)
share directory: ${opt.dir}/oss/share/${TENANT}/XX microservice
d) Check the microservice log directory ${opt.dir}/oss/log/${TENANT}/XX. The owner of the directory must be the same as the running user.
4. Database directory permission: Only dbuser is allowed. The directory cannot contain files or directories of other owners.
${opt.dir}/redis
${opt.dir}/zenith
(7281) Check the SSH trust relationship of the ossadm user between node 0 and all nodes
Level Normal
Result:
Check the SSH trust relationship of the ossadm user between node 0 (generally, OMP-01) and all nodes (including the current node). The check method is as follows:
On node 0, run the following command as the ossadm user.
ssh ${ip} -o StrictHostKeyChecking=no -n "a=b"
Level Normal
Result:
(7283) Check whether the active node of the management plane is node 0
Level Normal
Result:
Check whether the active node of the management plane is node 0 (generally, OMP-01) as follows:
1. Check whether the /opt/oss/manager/apps/OMMHAService/bin/status.sh file exists in the environment.
2. If the file exists, call the file and check whether the HA status of the environment is normal based on the returned result.
3. If the HA status is normal, check whether the /opt/oss/manager/var/share/standby_flag_file file exists on the OMP-01 node. If the file exists, the OMP-01 node is not the active node and an active/standby switchover is required. If an exception is returned, the HA status is abnormal and you need to locate the fault.
Level Normal
Result:
Check the key configuration files of the operating system: ['/etc/passwd', '/etc/shadow', '/etc/group']. Perform the following operations:
Run the lsattr <file name> command to check the file attributes. If the file has the i (immutable) attribute, a problem is reported.
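The lsattr output can be screened for the i attribute with a short parser. The Python below is a sketch that assumes the usual two-column lsattr layout (attribute flags, then the path); the function name immutable_files is hypothetical.

```python
import re


def immutable_files(lsattr_output):
    """Return the paths whose lsattr flag column contains the 'i' attribute."""
    flagged = []
    for line in lsattr_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        flags, path = parts
        # Flag columns contain only letters and dashes; this skips error messages.
        if re.fullmatch(r"[A-Za-z-]+", flags) and "i" in flags:
            flagged.append(path.strip())
    return flagged
```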
Level Normal
Result:
(7297) Check whether the user UID and user group GID are occupied.
Level Normal
Result:
1. Check whether GID 2002 of the sopgroup user group is occupied: grep ".*:2002:" /etc/group
2. Run the egrep '3008|2999|1346|1103' /etc/passwd command to check whether the user UIDs match the preset user IDs.
Check whether the command output matches "1103": "webuser", "1346": "iscript", "2999": "secuser", "3008": "sopuser".
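Matching on the UID and GID fields directly avoids false hits from the same digits appearing elsewhere in a line. The following Python sketch parses passwd and group file content; the UID-to-user mapping comes from the text above, while the function names and the interpretation of "occupied" (held by a different name) are assumptions.

```python
EXPECTED_USERS = {1103: "webuser", 1346: "iscript", 2999: "secuser", 3008: "sopuser"}


def uid_conflicts(passwd_text, expected=EXPECTED_USERS):
    """Map each preset UID to the unexpected user name that currently holds it."""
    conflicts = {}
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        uid = int(fields[2])  # third passwd field is the UID
        if uid in expected and fields[0] != expected[uid]:
            conflicts[uid] = fields[0]
    return conflicts


def gid_owner(group_text, gid):
    """Return the name of the group that uses gid, or None if the GID is free."""
    for line in group_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[2] == str(gid):
            return fields[0]
    return None
```

Here gid_owner(group_text, 2002) should return either None or "sopgroup" on a healthy system under this interpretation.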
(2198001) Check whether the IP address obtained by the southbound network card in the deployment parameters is empty and is consistent with the uniIpOrNic value stored in the database
Level Normal
Result:
1. If the IP address obtained from the deployment parameter uniIpOrNic is empty, the UniCollectAgentService microservice will be faulty and the NCE upgrade will fail.
2. If the value of the deployment parameter uniIpOrNic is inconsistent with the uniIpOrNic value stored in the database, the NCE upgrade will fail.
(2198002) Check whether lvs-float-ip in the deployment parameter is empty and whether the value of lvs-float-ip is 127.0.0.1
Level Normal
Result:
1. If the IP address obtained from the lvs-float-ip parameter is empty, the UniCollectAgentService microservice will be faulty and NCE will fail to be upgraded.
2. If the IP address obtained from the lvs-float-ip parameter is 127.0.0.1, the UniCollectAgentService microservice will be faulty when LVS is required.
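The two lvs-float-ip conditions amount to a simple value classification. This Python sketch assumes the parameter value has already been read as a string; the function name check_lvs_float_ip and the returned labels are hypothetical, and the "invalid" case is an addition not described in the report.

```python
import ipaddress

_LOOPBACK = ipaddress.ip_address("127.0.0.1")


def check_lvs_float_ip(value):
    """Classify an lvs-float-ip value as 'empty', 'invalid', 'loopback' or 'ok'."""
    if value is None or not value.strip():
        return "empty"
    try:
        address = ipaddress.ip_address(value.strip())
    except ValueError:
        return "invalid"  # not an IP address at all (extra case, not in the report)
    if address == _LOOPBACK:
        return "loopback"
    return "ok"
```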