Cisco Unified Communications Manager Serviceability and Troubleshooting
Paul Giralt - @PaulGiralt
Baha Akman - @mbakman
TECUCC-3000
Session Objectives
• Become familiarized with the various serviceability tools available in Unified CM
to assist in data gathering and analysis
• Learn how to set trace levels to provide sufficient trace data to troubleshoot
issues
• Understand what data to collect to troubleshoot various Cisco IP telephony
problems
• Use collected data to find root cause of some real-world problems

TECUCC-3000 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
What You Should Know
• Cisco Unified Communications Manager configuration and operation
• Cisco IOS® voice gateway configuration and operation
• Basic understanding of:
•Skinny Client Control Protocol (SCCP)
•Session Initiation Protocol (SIP)
•Media Gateway Control Protocol (MGCP)
•H.323
•Integrated Services Digital Network (ISDN)

Today’s Schedule
• 9:00 – 11:00
• 11:00 – 11:15 – Break
• 11:15 – 13:15
• 13:15 – 14:15 – Lunch
• 14:15 – 16:15
• 16:15 – 16:30 – Break
• 16:30 – 18:30

Agenda
• Serviceability Tools Overview
Real-Time Monitoring Tool (RTMT)
Cisco Unified Operating System GUI
Cisco Unified Operating System CLI
Cisco Serviceability Reports Archive and Cisco Unified Reporting
• Troubleshooting Methodology
Problem Description
Information Collection
• Troubleshooting Case Studies
Dropped Call
Delayed Audio Cut-Through
No One Answers the Phone
Unable to Place Calls
Call Drops After Answering
Video Encryption Not Working
• Understanding and Troubleshooting Unified CM Throttling Events
• Troubleshooting Database Replication
Cisco Unified Communications Manager Serviceability Tools Overview
Unified CM Serviceability Introduction
• Three primary serviceability interfaces into Unified CM: Real-Time Monitoring Tool (RTMT), OS Administration GUI, and OS Administration CLI
• RTMT is essential to serviceability and monitoring
  • Precanned alerts, perfmon, Trace and Log Central
• Some serviceability functionality is duplicated between the Cisco Unified OS Administration GUI, the CLI, and RTMT
  • Provides redundancy and resiliency
• Appliance model impacts
  • Access to console
  • Install and upgrades
  • Disk partitioning

Cisco Unified OS Administration CLI vs. GUI vs. RTMT

CLI (SSH2 or Console Access)
• No dependency on other services: just OS and database
• Platform status summary and details
• See all services' status and control some
• Set all platform configuration
• Diagnose hardware problems
• Component utilities, e.g. database
• Tech support commands
• DRS backup/restore
• Upgrades and option installs
• View / search / collect trace / log files
• NTP status and configuration

GUI (including Serviceability and Platform, DRS, and CU Reporting)
• Depends on Cisco Tomcat service
• Platform status summary and details
• See all services' status and control
• Service activation/deactivation
• Set some platform configuration
• IPSEC configuration
• Certificate management
• DRS backup/restore
• Upgrades and option installs
• System reports
• NTP status and configuration

RTMT
• Depends on various services
• Platform status + CCM application summary
• See all services' status
• All precanned monitoring screens
• Performance counter collection and graphing
• Alert Central
• Syslog viewer
• View / search / collect trace / log files
• Session Trace
• SIP trunk status

Unified CM Management Interfaces Overview

• SOAP/AXL
• SNMP MIB/Trap
• SSH2: Platform CLI
• HTTPS: GUI web pages
• SMTP: alert emails (AMC service)
• Remote syslog: services / syslog agent
• (S)FTP push: CDR/CMR, trace and logs, backup

Real-Time Monitoring Tool: Overview
• RTMT is the primary serviceability interface for Unified CM
• Linux or Windows-based client
  • Downloaded via CCMAdmin -> Application -> Plugins
  • Unsupported OS X version: see Paul
• Provides the following serviceability functionality
  • Monitor performance counters
    • Supports OS, Unified CM application, and Unity Connection
    • Both live and historical counter data
  • Alert Central
  • Trace and Log Central
  • Precanned screens
  • Syslog viewer
  • Device search
  • Analysis Manager
  • Session Trace (SIP)
• Depends on the following network services to function
  • Alert Manager Collector (AMC), RISDC, Database, Cisco Tomcat, Cisco CallManager Serviceability RTMT
• You can install multiple instances (change the install folder)

Real-Time Monitoring Tool: Performance Counters
• Performance counters have classes, counters, and instances per node
• Counters can be viewed in table view or in graph view
• Polling rate can be adjusted as low as 5 seconds; default is 10 seconds
• Counter descriptions can be accessed by right-clicking on them
• Profiles can be used to save the performance categories and counters created
• Custom alerts can be set up against any performance counter, given a threshold

Real-Time Monitoring Tool: Performance Counters
• Save your performance counters and other RTMT tabs in Profiles via (Ctrl+Alt+p)
• Use Categories to monitor a group of performance counters (screenshot shows DB counters)

Real-Time Monitoring Tool: Performance Counter-Based Alerting
Screenshot callouts:
• Can be changed
• Determines email destination(s)
Real-Time Monitoring Tool: Performance Counter-Based Alerting

• Once the threshold is reached at the time of the AMC polling interval (30 sec), an alert is raised
• Alert details are emailed when
  • Enable email option is checked
  • SMTP server is configured
  • Configured alert action includes email destinations
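
The three email conditions above can be read as a simple conjunction. The following is a minimal sketch of that reading, not RTMT's actual implementation; the class and field names (`AlertEmailSettings`, `enable_email`, `smtp_server`, `action_destinations`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlertEmailSettings:
    """Hypothetical container for the three settings named on the slide."""
    enable_email: bool                      # "Enable email" option on the alert
    smtp_server: Optional[str]              # configured SMTP server, if any
    action_destinations: List[str] = field(default_factory=list)  # alert action emails


def should_send_email(alert_raised: bool, settings: AlertEmailSettings) -> bool:
    """Return True only when an alert was raised and all three conditions hold."""
    return (
        alert_raised
        and settings.enable_email
        and bool(settings.smtp_server)
        and len(settings.action_destinations) > 0
    )
```

A missing SMTP server or an alert action with no destinations silently suppresses the email in this model, which is why those settings are worth checking first when expected alert emails do not arrive.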

Real-Time Monitoring Tool: Performance Counter Collection

• Right-click each category and select Start Counter(s) Logging
  • Single CSV file is created per host
• System -> Performance -> Counter Logging Configuration controls file size and count
  • Set this before starting counter logging
• RTMT must be running for collection to take place

Best Practice

Performance Counter Collection Without RTMT


• Alert Manager Collector (AMC) service collects and archives performance counters
  • By default, the AMC service collects critical performance counters
  • Polling rate is every 30 seconds by default and can go down to 15 seconds
• AMC has a primary and a failover collector
  • By default, the publisher becomes the primary collector and the failover collector is NOT configured. It should be configured for RTMT and AMC/alerting redundancy. SET THIS!
• Logs are saved under active/inactive logs cm/log/amc/PerfMonLog/
• RTMT Trace and Log Central: Collect Files or Remote Browse
  • Select service name: Cisco AMC Service PerfMonLog
Best Practice

Performance Counter Collection Without RTMT


• Cisco RIS Data Collector Troubleshooting Perfmon Data Logging
  • Enabled by default
  • Configured under RIS Data Collector service parameters on each server
  • Polling rate is every 15 seconds by default and can go down to 5 seconds
  • File size can be adjusted to cover longer periods of time in each file
  • Screenshot callout: maximize it to 100
• Logs are saved under active/inactive logs cm/log/ris/csv/ on each server
• RTMT Trace & Log Central can collect these files
  • Select service name: Cisco RIS Data Collector PerfMonLog

Best Practice
Set up a trace collection job to collect Cisco RIS Data Collector PerfMonLog
Real-Time Monitoring Tool: Performance Log Viewer
• RTMT performance log viewer can load CSV log files from
  • RISDC Perfmon Data from a remote server or saved files
• Add/remove multiple counters from a single file
• Zoom in/out
• Limitations/caveats
  • Can only view files one at a time and from one server at a time
  • No way to highlight counters (don’t add too many)

Unified CM Appliance Physical Memory and CPU Utilization via RTMT

Memory Utilization % = (Total KBytes - Free KBytes - Buffers KBytes - Cached KBytes + Shared KBytes) / (Total KBytes + Total Swap KBytes)
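
As a worked example, the memory utilization formula above can be evaluated directly from the counter values. This is a minimal sketch: the function and parameter names are illustrative, and multiplying by 100 to express the ratio as a percentage is an assumption, since the slide's formula shows only the ratio.

```python
def memory_utilization_percent(
    total_kb: float,
    free_kb: float,
    buffers_kb: float,
    cached_kb: float,
    shared_kb: float,
    total_swap_kb: float,
) -> float:
    """Evaluate the slide's memory utilization formula.

    The factor of 100 is an assumption made so the result reads as a percent.
    """
    used_kb = total_kb - free_kb - buffers_kb - cached_kb + shared_kb
    return 100.0 * used_kb / (total_kb + total_swap_kb)


# Illustrative numbers only (not taken from a real server):
# (8000 - 1000 - 500 - 2500 + 0) / (8000 + 2000) = 0.4, i.e. 40 percent
print(memory_utilization_percent(8000, 1000, 500, 2500, 0, 2000))
```

Buffers and cache are subtracted because the kernel can reclaim them, so a server with little "free" memory is not necessarily under memory pressure.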

Reference

CPU & Memory Monitoring

Counter                                VMware 7.5k OVA (2 cores)   VMware 10k OVA (4 cores)
Total CPU usage                        < 68 % Good                 < 68 % Good
(_Total -> % CPU Time)                 68-79 % Warning             68-79 % Warning
                                       > 80 % Bad                  > 80 % Bad
Process ccm (ccm -> % CPU Time)        < 44 % Good                 < 22 % Good
IOWait                                 < 20 % Good                 < 20 % Good
(_Total -> IOwait Percentage)          20-40 % Warning             20-40 % Warning
                                       > 41 % Bad                  > 41 % Bad
Process ccm VmSize (ccm -> VmSize)     < 2.1 GB Good               < 2.1 GB Good

Unified CM Alarms vs. Alerts

Alarms
• Generated by applications, services, or the platform
• Alarm library embedded in the services/applications forwards them to destinations
• Alarm definitions and severities are predefined
• Available in the Alarm Catalog
• Admin can adjust the alarm notification destinations and filter them based on severity
• !! Alarms can trigger alerts and such alerts can be logged as alarms !!

Alerts
• Only generated by the Alert Manager Collector (AMC) service, primary or secondary
• Triggers from
  • Alarm(s)
  • Perfmon counter(s) state/value
  • System performance and conditions: CPU, memory
  • Syslog messages

Flow (from the slide diagram)
• ALARM -> Local Syslog, Alternate Syslog, SNMP Trap, Remote Syslog
• Perfmon, Syslog, CPU -> ALERT -> ALARM
Reference

Alarm and Alert System Flow

[Diagram: alarm and alert system flow. Labeled components: CallManager, Common, and IM & P alarm catalogs (alarm definitions); CallManager service, Unity Connection service, and other services using the common alarm library; Alarm Configuration (user controls); event log (CiscoSyslog, AlternateSyslog); syslog daemon and syslog agent; Cisco Syslog Trap; remote syslog; RIS and AMC; Alert Central with alert actions (email, remote syslog); RTMT syslog viewer.]
Unified CM Serviceability Alarm Configuration and Definitions
• Alarm configuration
  • Alarm event level (filter)
    • Emergency -> Debug
  • Alarm destination
    • Local Syslog
    • Alternate Syslog
    • Remote Syslogs
    • SDI/SDL Trace Files
• Alarm definitions catalog
  • Provides enum definitions for reason codes, description, explanation, and, most importantly, recommended action

Error and System Messages
http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_system_message_guides_list.html
Unified CM Serviceability Alarm Destinations

• CLI: activelog syslog/CiscoSyslog or AlternateSyslog
• RTMT: Event Viewer-Application log
• Cannot send to another Unified CM server
• Can only send to one remote server prior to Unified CM 9.0
• Defaults to local7 facility; this cannot be changed

Real-Time Monitoring Tool: Alert Central
• Alerts can be triggered by:
  • Performance counter value
  • Application or system log message
• Can be sent out via email
  • Depends on primary/secondary AMC service
  • Must configure and have access to an SMTP mail server
  • Destinations can be controlled via Alert Actions
• Alerts have severities like alarms
• Alerts can be suspended or disabled per node or clusterwide
• Thresholds, alert raising interval, and severity can be adjusted
  • Set alert properties…
  • Can be reset back to default config
• RTMT alert history keeps the last 100 or the last 30 minutes' worth of alerts
  • Cisco AMC service AlertLogs

Real-Time Monitoring Tool: Alert Central Alert Email Process
• AMC service is responsible for mailing out alerts
  • !!! Remember to set AMC Failover Collector !!!
• Alert emails will be from RTMT_Admin@<domainname>
  • Can be changed using the RTMT client
  • RTMT -> System -> Tools -> Alert -> Config Email Server
• Domain name is retrieved from the platform’s domain name configuration


show network <eth0/failover> detail
DNS
Primary : 172.18.106.25 Secondary :
Options : timeout:5 attempts:2
Domain : cisco.com
Gateway : 172.18.106.1 on Ethernet bond0

• No SMTP Authentication support

Best Practice

Real-Time Monitoring Tool: Alert Central Trace Download

• CodeYellow, CoreDumpFileFound, and CriticalServiceDown alert properties have the ability to download traces upon trigger
• Trace Download Parameters allow you to download traces at the time the alert is raised and upload them to an SFTP/FTP server
• Traces collected at the time of an alert could be essential for troubleshooting

Reference

Cisco AMC Service Alert Logs


• AMC service is responsible for raising alerts
• Primary AMC server (publisher) monitors the whole cluster
• If primary is down secondary AMC server takes over
• Depends on AMC and RISDC service from all nodes

• AMC keeps track of alerts as they are raised in a CSV file


• Duration is hard coded to seven days
• Alert history can be downloaded via OS administration CLI or RTMT trace and log central
• From the primary AMC collector (defaults to publisher node)
• From the failover AMC collector when primary is down (default not set)
• Active/inactive logs cm/log/amc/AlertLog/
• RTMT trace and log collector collect files or remote browse
• Select service name: Cisco AMC Service AlertLog

• Note: if you open the AlertLog CSV file with Excel, you must convert the time stamp column (in UTC msec) to an Excel datetime stamp
• = B2/(24*60*60*1000) + DATE(1970,1,1)
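
The same conversion can be done outside Excel. The sketch below assumes, as the note above states, that the AlertLog time stamp is milliseconds since 1970-01-01 UTC; the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def alertlog_timestamp_to_utc(msec: int) -> datetime:
    """Convert an AlertLog time stamp (UTC milliseconds since 1970) to a datetime."""
    return EPOCH_UTC + timedelta(milliseconds=msec)


# 1390643415397 ms corresponds to 2014-01-25 09:50:15.397 UTC
print(alertlog_timestamp_to_utc(1390643415397).isoformat())
```

Like the Excel formula, the result is in UTC; convert to the server's local time zone separately if you need to line it up with local syslog time stamps.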

Cisco AMC Service Alert Logs
• All alerts raised by AMC AlertMgr are also logged into application logs as alarms
  • activelog syslog/CiscoSyslog
• The logged alarms have the same severity as set in the alert’s properties
• The alarms logging the alerts can be sent to remote syslog
  • Cisco AMC Service Alarm Destination Configuration

admin:file search activelog syslog/CiscoSyslog RTMT


Jan 25 04:50:15 vnt-cm1a local7 2 : 155813: vnt-cm1a.cisco.com: Jan 25 2014 09:50:15 AM.397 UTC : %UC_RTMT-2-RTMT_ALERT:
%[AlertName=CallProcessingNodeCpuPegging][AlertDetail=#012 Processor load over configured threshold for configured
duration of time . Configured high threshold is 90 %#012CiscoDRFLocal (42 percent) uses most of the CPU. #012
#015#012Processor_Info: #015#012#012#015 For processor instance 1: %CPU= 99, %User= 19, %System= 11, %Nice= 23, %Idle= 1,
%IOWait= 0, %softirq= 25, %irq= 22. #015#012#012#015 For processor instance _Total: %CPU= 92, %User= 22, %System= 18,
%Nice= 28, %Idle= 8, %IOWait= 0, %softirq= 13, %irq= 11. #015#012#012#015 For processor instance 0: %CPU= 84, %User= 26,
%System= 25, %Nice= 33, %Idle= 15, %IOWait= 0, %softirq= 0, %irq= 0. #015#012#012The alert is generated on Sat Jan 25
04:50:15 EST 2014 on node vnt-cm1a.cisco.com. #012 #015#012 Memory_Info: %Mem Used= 57, %VM Used= 62. #012#015#012
Partition_Info: #015#012Swap: %Disk Used=76. #012Active: %Disk Used=65. #012Common: %Disk Used][AppID=Cisco AMC
Service][ClusterID=][NodeID=vnt-cm1a]: RTMT Alert
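
When these alerts land on a remote syslog server, the bracketed `[Key=Value]` fields can be pulled out with a small parser. This is a minimal sketch that assumes the flat layout of the sample above; alerts whose AlertDetail itself contains bracketed fields (nested alarms) would need a more careful parser. It also assumes that `#012` and `#015` are the syslog daemon's escaped line feed and carriage return. The function name is illustrative.

```python
import re
from typing import Dict

# Matches one [Key=Value] field; the value may be empty, as in [ClusterID=].
FIELD_RE = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_rtmt_alert(line: str) -> Dict[str, str]:
    """Extract [Key=Value] fields from a flat %UC_RTMT-...-RTMT_ALERT syslog line."""
    fields = {key: value for key, value in FIELD_RE.findall(line)}
    detail = fields.get("AlertDetail")
    if detail is not None:
        # Assumption: #015 and #012 are escaped CR and LF characters.
        fields["AlertDetail"] = detail.replace("#015", "\r").replace("#012", "\n").strip()
    return fields


sample = (
    "%UC_RTMT-2-RTMT_ALERT: %[AlertName=CallProcessingNodeCpuPegging]"
    "[AlertDetail=#012 Processor load over configured threshold #012]"
    "[AppID=Cisco AMC Service][ClusterID=][NodeID=vnt-cm1a]: RTMT Alert"
)
print(parse_rtmt_alert(sample)["AlertName"])
```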

Best Practice

Cisco AMC Service Alert Logs (Cont.)


• AMC service by default sends syslog/alarms at Error level to local syslog
• Some precanned alerts’ default severity is Warning or below
  • LowAttendantConsoleServerHeartbeatRate
  • LogFileSearchStringFound
  • LowCallManagerHeartbeatRate
  • LowTFTPServerHeartbeatRate
  • MediaListExhausted
  • RouteListExhausted
• Callout: change to Informational level

Sending All Syslog Messages to Remote Servers
• Enterprise parameter for Cisco syslog agent
• Covers Platform OS Alarms or Syslogs
• Event Viewer—System log (messages)
• Can forward up to 5 Remote Syslog Servers
• Syslog Messages sent via UDP by default
• TCP Support added as of Unified CM 11.5
utils remotesyslog set protocol tcp
• Potential to duplicate alarms sent to remote syslog server
• If you have also configured remote syslog destinations via alarm configuration
• Event Viewer—Application log (CiscoSyslog/AlternateSyslog)

Reference
Sending Unified CM Alarms to Remote Servers via SNMP Traps
• Alarms that route to local syslog can be sent out via SNMP traps using CISCO-SYSLOG-MIB and the notification destinations configured under the Serviceability GUI
• Configuration steps need to be performed on all servers/nodes
1. Configure SNMP V1/V2 or V3 notification destination
2. Configure alarms to ensure local syslog is enabled and set the alarm event level to the desired level
3. Use SNMP SET to enable clogNotificationsEnabled
• Object = clogNotificationsEnabled
• OID = 1.3.6.1.4.1.9.9.41.1.1.2
• snmpset -v1 -c <write string> <host-ip> 1.3.6.1.4.1.9.9.41.1.1.2.0 i 1
4. Use SNMP SET to configure clogMaxSeverity to the desired level
• Object = clogMaxSeverity
• OID = 1.3.6.1.4.1.9.9.41.1.1.3
• snmpset -v1 -c <write string> <host-ip> 1.3.6.1.4.1.9.9.41.1.1.3.0 i <level>
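
Because steps 3 and 4 must be repeated on every node, it can help to generate the two `snmpset` command lines per host. The sketch below only builds argument lists in the form shown above and does not send anything; the host address, community string, and severity level in the usage line are placeholders, and the function names are illustrative.

```python
from typing import List

# Scalar instances (.0) of the two CISCO-SYSLOG-MIB objects named above.
CLOG_NOTIFICATIONS_ENABLED = "1.3.6.1.4.1.9.9.41.1.1.2.0"
CLOG_MAX_SEVERITY = "1.3.6.1.4.1.9.9.41.1.1.3.0"


def build_snmpset(host: str, write_community: str, oid: str, value: int) -> List[str]:
    """Build an snmpset argument list matching the form shown on the slide."""
    return ["snmpset", "-v1", "-c", write_community, host, oid, "i", str(value)]


def syslog_trap_commands(host: str, write_community: str, max_severity: int) -> List[List[str]]:
    """Return the commands for step 3 (enable notifications) and step 4 (max severity)."""
    return [
        build_snmpset(host, write_community, CLOG_NOTIFICATIONS_ENABLED, 1),
        build_snmpset(host, write_community, CLOG_MAX_SEVERITY, max_severity),
    ]


# Placeholder host, community string, and level, for illustration only.
for argv in syslog_trap_commands("192.0.2.10", "private", 5):
    print(" ".join(argv))
```

Keeping the commands as argument lists makes it straightforward to hand them to `subprocess.run` once the values have been reviewed.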

Real-Time Monitoring Tool: Sample Alerts
• SDLLinkOutofService
  SDLLinkOOS for CTI or CCM
• SyslogSeverityMatchFound (callout: Pay Attention)
  Severity 2 or above
• ServerDown
  Depends on AMC services
  Utilizes server list
• DBReplicationTableOutOfSync
  NOT enabled by default
  Enabled by DBMON service (callout: Enable This Parameter)
• DBChangeNotifyFailure
  Depends on DBMON service
  Monitors DB CN queues, collect show tech notify when detected
• SyslogStringMatchFound
  Event Viewer – Application and System Logs are searched for a given list of strings configurable within Alert Properties
• SystemVersionMismatched
  Raised during upgrades/switchover

Reference
Real-Time Monitoring Tool: SyslogSeverityMatchFound Examples
• NTP Status
•Apr 21 14:01:35 ord-pub1 local7 2 : 17: Apr 21 19:01:35.638 UTC : %CCM_RTMT-RTMT-2-RTMT-ERROR-ALERT: RTMT Alert
Name:SyslogSeverityMatchFound Detail: At Tue Apr 21 14:01:35 CDT 2009 on node ORD-PUB1, the following SyslogSeverityMatchFound events generated:
SeverityMatch - Critical ntpRunningStatus.sh: NTP server 10.12.254.33 is inactive. Verify the network to this server, that it is a NTPv4 server and is
operational. SeverityMatch - Alert sshd(pam_unix)[20038]: check pass; user unknown App ID:Cisco AMC Service Cluster ID: Node ID:ord-pub1
• Signal Congestion Entry
• Jul 5 12:03:01 cucm-pub local7 2 : 23813: cucm-pub: Jul 05 2016 04:03:01 PM.688 UTC : %UC_RTMT-2-RTMT_ALERT:
%[AlertName=SyslogSeverityMatchFound][AlertDetail= At Tue Jul 05 12:03:01 EDT 2016 on node 1.2.3.4, the following SyslogSeverityMatchFound events
generated: #012SeverityMatch : Critical#012MatchedEvent : Jul 5 12:02:29 cucm-sub1 local7 2 ccm: 6838: cucm-sub1: Jul 05 2016 16:02:29.795 UTC :
%UC_CALLMANAGER-2-SignalCongestionEntry: %[Thread=SIP Handler Thread] [AverageDelay=22] [EntryLatency=20] [ExitLatency=8]
[SampleSize=10] [TotalSignalCongestionEntry=6752][HighPriorityQueueDepth=0][NormalPriorityQueueDepth=1][LowPriorityQueueDepth=0][AppID=Cisco
CallManager][ClusterID=UCMCluster1][NodeID=cucm-sub1]: Unified CM has detected signal congestion in an internal thread and has throttled activities for
that thread#012AppID : Cisco Syslog Agent#012ClusterID : #012NodeID : cucm-sub1#012 TimeStamp : Tue Jul 05 12:02:2][AppID=Cisco AMC
Service][ClusterID=][NodeID=cucm-pub]: RTMT Alert
• Local Syslog Overload
•Apr 21 13:53:56 ord-pub1 local7 2 : 2: Apr 21 18:53:56.683 UTC : %CCM_RTMT-RTMT-2-RTMT-ERROR-ALERT: RTMT Alert
Name:SyslogSeverityMatchFound Detail: At Tue Apr 21 13:53:56 CDT 2009 on node ORD-PUB1, the following SyslogSeverityMatchFound events generated:
SeverityMatch - Alert nbslogpd[3496]: 3 messages were dropped SeverityMatch - Emergency kernel: [...network console startup...] App ID:Cisco AMC
Service Cluster ID: Node ID:ord-pub1
• Certificate Validation Expiration
•Jun 17 01:00:10 cucm-pub local7 2 : 1217: cucm-pub: Jun 17 2016 05:00:10 AM.988 UTC : %UC_RTMT-2-RTMT_ALERT:
%[AlertName=SyslogSeverityMatchFound][AlertDetail= At Fri Jun 17 01:00:10 EDT 2016 on node cucm-pub, the following SyslogSeverityMatchFound events
generated: #012SeverityMatch : Alert#012MatchedEvent : Jun 17 01:00:00 cucm-pub local7 1 : 19: cucm-pub: Jun 17 2016 05:00:00 AM.128 UTC :
%UC_CERT-1-CertValidLessthanADay: %[Message=Certificate expiration Notification. Certificate name:ecats-uc-test-exp-c-1a.vnt.cisco.com.der
Unit:CallManager-trust Type:own-ce][AppID=Cisco Certificate Monitor][ClusterID=][NodeID=cucm-pub]: Certificate is about to Expire in less than 24 hours or
has Expired #012AppID : Cisco Syslog Agent#012ClusterID : #012NodeID : cucm-pub#012 TimeStamp : Fri Jun 17 01:00:00 EDT 2016][AppID=Cisco AMC
Service][ClusterID=][NodeID=cucm-pub]: RTMT Alert

Service Manager: Feature vs. Network Services
• Feature services
  • Services that can be activated/deactivated
  • e.g., CallManager, TFTP
• Network services
  • Services that are always activated and cannot be deactivated
• ServM
  • Started by initrd and maintained by inittab
  • Cannot stop/start/restart it
  • Each service has its own restart limit
• ServM logs
  • file get activelog platform/servm_startup.log
  • file get activelog platform/log/servm*.log

[Diagram: SERVM with the services it manages: DB, CallManager, RISDC, AMC, CTFTP, IPVMS, CTIManager]

Service Manager Alarms and Alerts
Service manager has its own catalog
• See Unified CM Serviceability GUI -> Alarm -> Definitions -> System Alarm Catalog -> Service Manager Alarm Catalog

Nov 25 16:11:21 makman-vmcm1 local7 6 : 30: Nov 25 21:11:21.986 UTC : %UC_GENERIC-6-ServiceStopped: %[ServiceName=Cisco Tftp][AppID=Cisco Service Manager][ClusterID=][NodeID=makman-vmcm1]
Nov 25 16:12:17 makman-vmcm1 local7 3 : 35: Nov 25 21:12:17.173 UTC : %CCM_SERVICEMANAGER-SERVICEMANAGER-3-ServiceExceededMaxRestarts: Service exceeded maximum allowed restarts. Service Name:Cisco Tftp Reason:3 App ID:Cisco Service Manager Cluster ID: Node ID:makman-vmcm1
Nov 25 16:12:20 makman-vmcm1 local7 3 : 0: Nov 25 21:12:20.831 UTC : %CCM_RTMT-RTMT-3-RTMT-ERROR-ALERT: RTMT Alert Name:CriticalServiceDown Detail: Service status is DOWN. Cisco Tftp. The alert is generated on Sun Nov 25 16:12:20 EST 2007 on node 192.168.1.9. App ID:Cisco AMC Service Cluster ID: Node ID:makman-vmcm1

Real-Time Monitoring Tool: Trace and Log Central
• Remote browse
• Allows you to browse trace/log files for services/applications and system logs
• Can download selected files from the browse window
• Collect files
• Allows you to collect log/trace files for service/application and system logs matching the given time range
• Query wizard
• Allows you to query log/trace files for service/application and system logs given a match string and time range
• Schedule collection
• Allows you to create scheduled collection jobs for service/application log/trace files given the time range and collection interval
• Real-time trace
• View real-time data allows you to see log/trace files for service/application and system logs in real time and provides basic search functionality
• Monitor user event allows you to monitor an event in log/trace files for service/application and system logs given a monitoring time range. Upon a match, several actions can be taken, such as raise an alert, local syslog, remote syslog, or download file.
• Collect crash dump
• Allows you to collect core dump files for a given service/application and matched time range

Real-Time Monitoring Tool: Trace and Log Central -> Remote Browse
• Use to see files on server(s)
  • Service logs
  • System logs
  • Audit logs
  • Crash dump files
• Download or delete

Real-Time Monitoring Tool: Trace and Log Central -> Collect Files
• Can collect logs/traces from multiple nodes on demand
• Collection done over HTTPS and can be cancelled
• Screenshot callouts: Unified CM 10+ RTMT; DO NOT ZIP HERE

Real-Time Monitoring Tool: Trace and Log Central -> Query Wizard
• Same selection process as in Collect Files
• Can save queries for future use
• Can set call processing impact level
• Once the query completes, matching file names are displayed similar to Remote Browse
• Equivalent to platform CLI command file search

Best Practice
Real-Time Monitoring Tool: Trace and Log Central -> Schedule Collection
• Same selection process as in Collect Files
• Can choose to collect all files or only those matching a query string
• Zip files option is done on the server side
• Use job status to monitor current jobs
• Upload to SFTP/FTP servers
• “Collect files generated in the last” only applies to the first collection

Reference
Trace & Log Central -> Schedule Collection Recommendations

Publisher
• Cisco Serviceability Reporter
  • AlertReport
  • CallActivitiesReport
  • DeviceReport
  • PPRReport
  • ServerReport
  • ServiceReport

CallProcessing Subscriber
• Prog Logs
• Cisco CallManager
• CTI Manager

TFTP/MOH
• Cisco TFTP
• Cisco IP Voice Media Streaming App

All Nodes
• Cisco Database Layer Monitor
• Cisco Database Library Trace
• Cisco Database Notification Service
• Cisco Database Replicator Trace
• Cisco RIS Data Collector PerfMonLog
• Service Manager
• Event Viewer-Application Log
• Event Viewer-System Log
• SAR Logs
• Cisco Audit Logs

Real-Time Monitoring Tool: Trace and Log Central -> Schedule Collection
• If the trace collection server is down and a scheduled job fails, an error-level alarm is raised at the local server that experienced the problem
• When the collection job resumes, it will not go back and collect the trace files since the first failed job; it will only go back up to the scheduled interval

Jun 5 04:49:57 sjc-rfd-pub-1 local7 3 : 2: Jun 05 11:49:57.93 UTC : %CCM_TCT-LPMTCT-3-ScheduledCollectionError: An error occurred while executing scheduled collection. JobID:1180808534704 Reason:SFTP server 10.3.2.149 not reachable. Scheduled run #62 App ID:Cisco Trace Collection Service Cluster ID: Node ID:sjc-rfd-pub-1

Real-Time Monitoring Tool: Trace and Log Central -> Real-Time Trace -> View Real-Time Data

Real-Time Monitoring Tool: Trace and Log Central -> Real-Time Trace -> Monitor User Event

Feb 15 16:09:25 vnt-cm1a local7 4 : 402: vnt-cm1a.cisco.com: Feb 15 2017 21:09:25.302 UTC : %UC_RTMT-4-RTMT_ALERT:
%[AlertName=LogFileSearchStringFound][AlertDetail=#012 At Wed Feb 15 16:09:25 EST 2017 on node , the following LogFileSearchStringFound events generated:
#012SearchString : Cnf string encountered in file SDL002_100_000010.txt.gzo#012AppID : Cisco Trace Collection Service#012ClusterID : #012NodeID : vnt-cm1b#012
TimeStamp : Wed Feb 15 16:09:08 EST 2017][AppID=Cisco AMC Service][ClusterID=][NodeID=vnt-cm1a]: RTMT Alert
Real-Time Monitoring Tool: Monitor User Event -> Use Case Example
• Problem Statement: I have very crafty UC Admins who manage to create call
routing loops via our SIP Trunks between our SME & Leaf Clusters. How can I
detect these call routing loops and get notified?
• Solution: The Q.850 Cause Code of 25 could be used to detect such
conditions. This code is used when a SIP Call is rejected with a 483 Response
upon depleting the Max-Forwards count.
SIP/2.0 483 Too Many Hops
Via: SIP/2.0/TCP 10.122.224.65:5060;branch=z9hG4bK2638a1fb46f12
From: "Baha Akman" <sip:[email protected]>;tag=172381~7098c01f-c01f-4579-bc5b-6146a650f424-110041506
To: <sip:[email protected]>;tag=13471639
Call-ID: [email protected]
CSeq: 101 INVITE
Reason: Q.850;cause=25
We can setup a Monitor User event against the CallManager Traces to detect it
and get notified via an Alert
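
The same search string can also be checked offline against collected CallManager traces. The sketch below is an illustration rather than part of RTMT: it scans text for the `Reason: Q.850;cause=25` header used in the use case above, and the function names are hypothetical.

```python
import re
from typing import List

# Matches a SIP Reason header carrying a Q.850 cause code, e.g. "Reason: Q.850;cause=25".
Q850_REASON_RE = re.compile(r"Reason:\s*Q\.850\s*;\s*cause=(\d+)", re.IGNORECASE)


def q850_causes(text: str) -> List[int]:
    """Return every Q.850 cause code found in Reason headers within the text."""
    return [int(code) for code in Q850_REASON_RE.findall(text)]


def has_routing_loop_cause(text: str) -> bool:
    """True when a Reason header with Q.850 cause 25 appears in the text."""
    return 25 in q850_causes(text)


sip_response = """SIP/2.0 483 Too Many Hops
CSeq: 101 INVITE
Reason: Q.850;cause=25
"""
print(has_routing_loop_cause(sip_response))
```

Matching on the Reason header rather than on the 483 status line keeps the search string identical to the one configured in the Monitor User Event job.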

Real-Time Monitoring Tool: Monitor User Event -> Use Case Example

Need to create the monitoring job ONE node at a time

Real-Time Monitoring Tool: Monitor User Event -> Use Case Example

Feb 15 17:08:10 vnt-cm1a local7 4 : 407: vnt-cm1a.cisco.com: Feb 15 2017 22:08:10.962 UTC : %UC_RTMT-4-RTMT_ALERT:
%[AlertName=LogFileSearchStringFound][AlertDetail=#012 At Wed Feb 15 17:08:10 EST 2017 on node vnt-cm1b.cisco.com, the following LogFileSearchStringFound events
generated: #012SearchString : Reason: Q.850;cause=25 string encountered in file SDL002_100_000058.txt.gzo#012AppID : Cisco Trace Collection Service#012ClusterID :
#012NodeID : vnt-cm1b#012 TimeStamp : Wed Feb 15 17:07:52 EST 2017][AppID=Cisco AMC Service][ClusterID=][NodeID=vnt-cm1a]: RTMT Alert
Real-Time Monitoring Tool: Trace and Log Central
IOWait Throttling
• Customized via clusterwide RISDC service parameters
• Warning is displayed on all on-demand operations

Unified CM Serviceability Trace Configuration
• Cannot change the Trace Destinations
Each service/application has fixed destination under activelogs partition
RTMT trace and log central uses the service’s name to access trace/log files
Virtualized Unified CM disk size can be increased via ciscocm.vmware-disk-size-reallocation-1.0.cop.sgn
Required for Unified CM 8.6 and 9.1; NOT required for Unified CM 10+
• Log partition monitor service monitors the common partition where trace/log files are
placed
You can configure the following parameters in Alert Central in RTMT:
LogPartitionLowWaterMarkExceeded: disk space utilization level at which log partition monitoring stops
purging log files; valid range is 10 to 90 percent; default is 80 percent; must be configured lower than the
high watermark
LogPartitionHighWaterMarkExceeded: disk space utilization level at which log partition monitoring starts
purging log files; valid range is 15 to 95 percent; default is 90 percent
• To minimize unnecessary IO impact, avoid hitting the LogPartitionHighWaterMark
Control the maximum number of files and the maximum file size in trace configuration
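The relationship between the two watermarks can be sketched as a small state machine in Python. This is a simplified model under stated assumptions, not the Log Partition Monitor implementation: the class name, the `update` method, and the exact comparison operators are hypothetical, while the ranges and defaults are the ones given on this slide.

```python
from dataclasses import dataclass


@dataclass
class LogPartitionModel:
    """Toy model of the purge start/stop behavior (names are hypothetical)."""
    low_watermark: int = 80    # slide default: purging stops at this level
    high_watermark: int = 90   # slide default: purging starts at this level
    purging: bool = False

    def __post_init__(self):
        if not 10 <= self.low_watermark <= 90:
            raise ValueError("low watermark must be between 10 and 90 percent")
        if not 15 <= self.high_watermark <= 95:
            raise ValueError("high watermark must be between 15 and 95 percent")
        if self.low_watermark >= self.high_watermark:
            raise ValueError("low watermark must be lower than high watermark")

    def update(self, used_percent):
        """Feed a disk usage sample; return True while purging is active."""
        if used_percent > self.high_watermark:
            self.purging = True      # assumption: strictly above high starts purge
        elif used_percent <= self.low_watermark:
            self.purging = False     # assumption: at or below low stops purge
        return self.purging


if __name__ == "__main__":
    model = LogPartitionModel()
    for sample in (85, 91, 85, 80):
        print(sample, model.update(sample))
```

The gap between the two levels is what keeps purging from starting and stopping on every small change in disk usage.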

Real-Time Monitoring Tool: Syslog Viewer
• System logs
• messages log file contains OS logs, platform agents logs

• Application logs
• CiscoSyslog log file contains Alarms from most Cisco Unified CallManager Alarm Catalogs
• AlternateSyslog (8.6+) log file contains certain Unified CM Alarm Catalogs such as Phones

• Security logs
•secure log file contains security-related messages such as all login attempts to the platform and other internal
process executions at privileged level

• Nbsyslogd is nonblocking, meaning it will drop messages if the system is overloaded.
Unified CM 10.X+ utilizes rsyslogd
• Jun 8 17:38:54 azo-cm-uc syslog 1 nbslogpd[4456]: 104 messages were dropped
• Feb 16 04:02:01 vnt-cm1c syslog 6 rsyslogd-2177:imuxsock begins to drop messages from pid 16915 due to rate-limiting

File Rotation Settings
Each file can grow up to 5 MB and is rotated 4 times
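A quick way to check collected syslog files for dropped messages is to look for the two message forms shown above. The following Python sketch is an assumption-laden illustration: the function name is hypothetical and the patterns are taken only from the two sample lines on this slide, so other wordings would not be caught.

```python
import re

# Patterns derived from the two sample lines on this slide.
DROP_PATTERNS = (
    re.compile(r"\d+ messages were dropped"),
    re.compile(r"begins to drop messages from pid \d+"),
)


def find_drop_events(lines):
    """Return the lines that report dropped syslog messages."""
    return [ln for ln in lines if any(p.search(ln) for p in DROP_PATTERNS)]


if __name__ == "__main__":
    sample = [
        "Jun 8 17:38:54 azo-cm-uc syslog 1 nbslogpd[4456]: 104 messages were dropped",
        "Feb 16 04:02:01 vnt-cm1c syslog 6 rsyslogd-2177:imuxsock begins to drop "
        "messages from pid 16915 due to rate-limiting",
        "Feb 16 04:02:05 vnt-cm1c local7 6 : an unrelated line",
    ]
    for event in find_drop_events(sample):
        print(event)
```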
Real-Time Monitoring Tool: Syslog Viewer

Unified CM 10.0+

Real-Time Monitoring Tool: AuditLog Viewer


• AuditApp Logs
• Application Level Audit Logs
activelog audit/AuditApp/Audit*.log
• Enabled via Cisco Unified Serviceability > Tools > Audit Log Configuration
• Can send to Remote Syslog

• vos Logs
• Operating System Level Audit Logs
activelog audit/vos/vos-audit.log*
• Enabled via Admin CLI
utils auditd enable
• OS Level Audit Logs are also forwarded to syslog/messages file

Unified CM 10.0+

Real-Time Monitoring Tool: AuditLog Viewer


Date: 01/08/2015 10:08:54.455
UserID: CCMAdministrator
ClientAddress: 1.2.3.4
Severity: Notice
EventType: GeneralConfigurationUpdate
ResourceAccessed: CUCMAdmin
EventStatus: Success
CompulsoryEvent: No
AuditCategory: AdministrativeEvent
ComponentID: Cisco CUCM Administration
AuditDetails: record in table device with key field name = CSFMSUPHAVA updated
App ID: Cisco Tomcat
Cluster ID:
Node ID: vnt-cm1b
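When audit records are exported for review, the "Field: value" layout shown above is easy to turn into a dictionary. The Python sketch below is a minimal example that assumes exactly that viewer layout; the function name is hypothetical, and a line without a colon is treated as a continuation of the previous field.

```python
# A subset of the record shown above, in the same "Field: value" layout.
SAMPLE_RECORD = """\
Date: 01/08/2015 10:08:54.455
UserID: CCMAdministrator
EventType: GeneralConfigurationUpdate
EventStatus: Success
AuditDetails: record in table device with key field name = CSFMSUPHAVA
updated
Cluster ID:
Node ID: vnt-cm1b
"""


def parse_audit_record(text):
    """Parse 'Field: value' lines into a dict, joining wrapped lines."""
    record = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            record[last_key] = value.strip()
        elif last_key is not None:
            # No colon: assume the line continues the previous field.
            record[last_key] = (record[last_key] + " " + line).strip()
    return record


if __name__ == "__main__":
    rec = parse_audit_record(SAMPLE_RECORD)
    print(rec["UserID"], rec["EventType"], rec["EventStatus"])
```

A wrapped line that itself contains a colon would be misread as a new field, so this is only suitable for quick inspection.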

Real-Time Monitoring Tool: Device Search
• Use device search to find out last activities of
devices
•When they last registered, failed over, failed back,
unregistered

Real-Time Monitoring Tool: Device Search

• Same information about devices can be found via platform CLI commands
show risdb query or show risdb list

• A RISDB query can be saved into a file, which can be downloaded or viewed via the
“show file view” command
• Timestamp is in the RTMT client’s timezone

Unified CM 10.0+
Real-Time Monitoring Tool: Device Search
SIP Trunk Detailed Service Status

The only way to see a SIP Trunk’s Real Time Service Status per node
• Click on a Trunk running on a Node to see Detailed Status
• Status Shown per destination from a Unified CM node’s perspective
• StatusReason maps to SIPTrunkOOS Alarm Definition Reasons
• Local=2: local SIP stack is not able to create a socket connection with the remote peer
• Remote=503: "503 Service Unavailable", a standard SIP RFC error code received
• Applicable only to SIP Trunks where OPTIONS Ping is enabled
• Historical SIP Trunk Status available via CallManager Alarms
• SIPTrunkOOS, SIPTrunkISV, SIPTrunkPartiallyISV

Real-Time Monitoring Tool: Analysis Manager
Overview
• A Client Application in the Real Time Monitoring Tool (RTMT)
• Provides a Single User Interface for Troubleshooting Functions Across the
following UC Products:
Cisco Unified Communications Manager
Cisco Unified Communications Manager Business Edition
Cisco Unity Connection
Cisco Unified Presence
Cisco Unified Contact Center Express
Cisco Unified Customer Voice Portal
Cisco Unified Contact Center Enterprise
Cisco IOS Voice Gateways via ACS

Real-Time Monitoring Tool: Analysis Manager

Real-Time Monitoring Tool: Analysis Manager
Features
• Product Nodes Inventory and Grouping
• Configure Trace Settings
• Trace Collection on demand
• Scheduled Trace Collection
• Templates for setting trace levels and Trace File Repository for UCM
• Collect/View System Configuration information
• System Call Tracking
• Upload trace files to FTP/SFTP server and preferences
• Products supported:
•UCM, UCCX, UCCE/ICM, EA, CVP, CUP, UC, UCMBE

Prime Collaboration Standard with Unified CM 10.0
 Prime Collaboration Standard is included with
UCM 10.0+
 Web based no client install
– Requires Cisco AXL Web Service on Unified CM
 Deployed quickly (10-15min) as vApp
 PCA Standard can only monitor 1 Cluster at
a time
 4 Key RTMT functionality provided
– Alert Central
– Precanned Dashboards
– Performance Counter Monitoring
– Analysis Manager like Trace Collection (PCA
10.6+)

Cisco Unified OS GUI
• Displays basic OS-level information
• List cluster nodes
• Show hardware information (CPU type, installed memory, RAID controller status)
• View IP addressing and network statistics
• List installed software (including all COP files)—shows active and inactive versions
• Display system-level statistics (CPU/memory/disk utilization)
• Displays TCP/IP Port usage (IP Preferences)

Cisco Unified OS GUI
• Allows configuration of platform-level settings
• IP addressing or Hostname information
• NTP server and time configuration
• SMTP server address (used for OS-level notifications such as certificate expirations)
• Reset, restart, and switch version of the server

• Configuration of platform security settings


• Manage certificates (upload, download, generate)
• Bulk Certificate Management (Import, Export, Consolidate)
• Configure certificate monitor notifications
• Configure IPSEC policies
• Configure Single Sign On
• Upload Customized Logon Message

Cisco Unified OS GUI
• Software Installation
• Install COP files or upgrade Unified CM software
• TFTP file management
• Upload or delete files from the TFTP directory (e.g., RingList.xml)
• Device Load Management available in 11.X+
• Allows easy clean up of Device Loads that are Not In Use

• Ping from the server


• Useful for troubleshooting IP connectivity issues from the server
• Can validate IPSEC connections

Cisco Unified OS Administration CLI (Platform CLI)
Overview
• Command line interface access
SSH2 client remotely
Local keyboard/mouse console access or via KVM/DSview/ILO
Serial console access via COM1
• CLI gives wrapped and controlled interface to several OS/appliance/application functions
• Provides several “show tech” commands
• Provides low level platform/appliance health status and monitoring
• Multiple sessions can be opened at the same time via SSH2 remote connections
• Duplicates some functionality that is available in RTMT
Check services status, performance counter access, RISDB search, etc.
utils service, show perf, show risdb
• All activities are logged with auditing support
• Context-sensitive command syntax help is provided with “?”
admin:set timezone ?
Syntax:
set timezone zone
zone mandatory This is the new timezone. Enter the appropriate string
or zone index id to uniquely identify the timezone.
A list of valid timezones can be obtained via the
following CLI command: show timezone list.
Sample CLI Commands: Trace and Logs
• file list
Lists files similar to Linux “ls” command
file list activelog
file list inactivelog
file list install
file list partBsalog
file list salog
file list sftpdetails
file list tftp
• file search
Searches files for a given regular expression similar to Linux “grep” command
admin:file search activelog ?
Syntax:
file search activelog file-spec reg-exp [options]
file-spec mandatory file to view
reg-exp mandatory regular expression which is to be searched.
To include "s escape them with \.
options optional reltime days|hours|minutes timevalue
abstime hh:mm:ss mm/dd/yyyy hh:mm:ss mm/dd/yyyy
ignorecase,recurs
• file tail
Tails a given file similar to Linux “tail” command. Has regular expression support. Use ‘recent’ to tail the newest file in the directory
• file get
Uploads a file from the node where command is issued to a remote SFTP server
• file dump
Displays ("cats") a file on the screen. Run “set cli pagination off” first to get a quick dump of an entire file

Sample CLI Commands: Network
• set network
Allows admin to set IP address, DNS, domain name, MTU, PMTUD, NIC speed/duplex,
default gateway, NIC teaming, etc.
admin:set network
set network cluster publisher
set network dhcp*
set network dns*
set network domain
set network failover
set network gateway
set network hostname
set network ip*
set network ipv6*
set network max_ip_conntrack
set network mtu
set network name-service  Controls Name Service Caching Daemon
set network nic*
set network ntp option
set network pmtud
set network restore
set network status*

Sample CLI Commands: Network
• show network
Allows admin to see the following network information
admin:show network
show network all
show network cluster
show network dhcp
show network eth0 detail (MAC address)
show network failover (NIC Teaming)
show network ip_conntrack
show network ipprefs*
show network ipv6*
show network max_ip_conntrack
show network name-service*
show network route
show network status
Syntax:show network status [options]
options optional detail,listen,process,all,nodns,search stext
options are:
detail - Display additional information
listen - Display only Listening Sockets
process - Display the process ID and name of the program to
which each socket belongs
all - Display both Listening and Non-Listening Sockets
nodns - Displays Numerical Addresses without any DNS information
search stext - Search for the "stext" in the output

Sample CLI Commands: Network
• utils firewall ipv4/ipv6 list
• Shows the internal firewall rules that are in place. Each node has to authenticate into the cluster to be allowed to connect to
certain applications. After successful authentication, firewall rules are adjusted to allow the connection. Starting with Unified CM 7.X,
all ports are denied by default. If a service is not activated, its ports are not allowed.
• show open ports all/regexp
• Used to see which applications have open or established TCP/UDP ports
admin:show open ports regexp "2000"

Executing.. please wait.


ccm 31097 ccmbase 256u IPv4 43464284 TCP 10.9.30.5:2000 (LISTEN)
ccm 31097 ccmbase 260u IPv4 43464297 TCP 10.9.30.5:2000->10.9.36.204:49516 (ESTABLISHED)

• utils network capture


• Allows admin to sniff network traffic similar to Linux command “tcpdump”
• Can save to a file under activelog platform/cli/*.cap
• utils network capture-rotate
• Enhanced network capture command to allow Local File Rotation
• utils network host
•Allows admin to perform DNS name lookups including SRV records similar to Linux
command “dig”. Can specify which external server to use for lookup.

• utils network name-service hosts/services cache invalidate


• Clears Hosts or Services Entries out of the Name Caching Daemon
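The "show open ports" output shown earlier on this slide is column oriented, so saved output can be split into fields for filtering. The Python sketch below assumes the column order seen in the two sample lines (command, PID, user, descriptor, address family, device, protocol, addresses, state); the `OpenPort` type and function name are hypothetical.

```python
from typing import NamedTuple, Optional


class OpenPort(NamedTuple):
    command: str
    pid: int
    user: str
    protocol: str
    local: str
    remote: Optional[str]
    state: Optional[str]


def parse_open_port(line):
    """Parse one output line; return None for lines that are not port entries."""
    fields = line.split()
    if len(fields) < 8 or not fields[1].isdigit():
        return None
    command, pid, user, _fd, _family, _device, protocol, name = fields[:8]
    state = fields[8].strip("()") if len(fields) > 8 else None
    # Established connections are shown as local->remote.
    local, _, remote = name.partition("->")
    return OpenPort(command, int(pid), user, protocol, local, remote or None, state)


if __name__ == "__main__":
    output = [
        "Executing.. please wait.",
        "ccm 31097 ccmbase 256u IPv4 43464284 TCP 10.9.30.5:2000 (LISTEN)",
        "ccm 31097 ccmbase 260u IPv4 43464297 TCP 10.9.30.5:2000->10.9.36.204:49516 (ESTABLISHED)",
    ]
    for entry in filter(None, map(parse_open_port, output)):
        print(entry.command, entry.local, entry.remote, entry.state)
```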

Sample CLI Commands: Network
• utils network connectivity
• Used only on the subscriber nodes. Performs a Network Connectivity test between the Subscriber
node and Publisher. Utilizes Cluster Manager and ensures TCP/UDP port 8500 communication is
intact. If there is a failure the following alarm will be logged in the Event Viewer – Application Log.
May 21 13:49:50 bldr-ccm97 local7 6 : 7: May 21 19:49:50.533 UTC
: %CCM_CLUSTERMANAGER-CLUSTERMANAGER-6-CLM_ConnectivityTest: CLM
Connectivity Test Failed. Node's IP:10.94.150.99 Error description
:CLM_TEST_UNABLE_UDP_DATAGRAM App ID:Cisco Cluster Manager Cluster ID:
Node ID:bldr-ccm97

• The same connectivity test is also run automatically by the Cluster Manager Service every 3
minutes to proactively detect major intracluster communication problems.

• utils network connectivity [hostname]


• Can be run on any node against any other node. Used to check Intracluster communication.

Sample CLI Commands: Platform/OS
• show status
Shows the current platform status information such as datetime, uptime, versions, CPU, memory, and disk usage summary
• license management system remove
Removes the Local License Management System (PLM) installation, if you are utilizing a Standalone PLM
• utils vmtools refresh
Performs an interactive VMware Tools installation when the VMware Tools installation ISO is mounted
Requires a reboot after a successful install or update of VMware Tools

Sample CLI Commands: Platform/OS
• show process list
•Lists processes currently running similar to Linux “ps” command with or without details such as threads, file descriptors,
memory usage, etc. Can search for processes using process id, name or userid
• show process using-most cpu/memory
• Shows the top 5 Processes using the most CPU or memory.
• show process load
•Lists top CPU processes currently running similar to Linux “top” command. Top process sort order can be adjusted using
memory, CPU, time. noidle option can indicate which processes are waiting on IOWait

Sample CLI Commands: Platform/OS
• utils os kerneldump
•Replaces “utils netdump” functionality. Used to collect debug information in the event of a kernel
panic. In case there is catastrophic hardware failure debug information can be sent to a remote SSH
server.
Best Practice
admin:utils os kerneldump
utils os kerneldump disable
utils os kerneldump enable
utils os kerneldump ssh*
utils os kerneldump status

admin:file list install crash/* date detail


09 Jun,2012 14:22:20 <dir> 127.0.0.1-2012-06-09-14:20:03
dir count = 1, file count = 0
admin:file list install crash/127.0.0.1-2012-06-09-14:20:03/ detail
09 Jun,2012 14:20:14 70,926,328 vmcore
dir count = 0, file count = 1

Sample CLI Commands: Platform/OS
• utils system
• Utility to shut down, restart, or switch versions on the system

• utils os secure
• Utility to switch SELinux mode from enforce (default) to permissive

• utils core active/inactive analyze


• Analyzes a coredump file and records the backtrace information. Essential to pass on to TAC in the unlikely event you experience a
CoreDumpFile found alert. Use file list activelog core first to find out the core filename. Warning: running this can increase IOWait

Sample CLI Commands: Platform/OS
• utils create report security
• Collects SELinux Security related logs, including VOS audit logs

admin:utils create report security


Collecting files...
Security Diagnostic files have been collected: security-diagnostics.tar.gz
To retrieve the security-diagnostic.tar.gz, use CLI command: file get activelog syslog/security-diagnostics.tar.gz
To delete the security-diagnostic.tar.gz, use CLI command: file delete activelog syslog/security-diagnostics.tar.gz

• utils filebeat
• Allows Export of Platform Audit Logs to a LogStash Server
utils filebeat config
utils filebeat disable
utils filebeat enable
utils filebeat status
utils filebeat tls*

Sample CLI Commands: Platform/OS IOWAIT
Best Practice
• utils fior
• File IO reporting is used to periodically capture IO stats for each process.
• Polling occurs every 10 min, so data is not as granular.
• You must first enable FIOR then start it. Once enabled it will remain enabled through restarts.
admin:utils fior
utils fior disable
utils fior enable
utils fior list
utils fior start
utils fior status
utils fior stop
utils fior top
• utils iostat
• Equivalent of Linux iostat command
admin:utils iostat
Syntax:
utils iostat
interval optional (seconds) Interval between two iostat readings - mandatory if iterations is being used
iterations optional The number of iostat iterations to be performed - mandatory if interval is being used
filename optional Redirect the output to a file

Cisco Unified OS Administration CLI Tips
• Only CBC Based Ciphers are supported for outbound SFTP Connections prior to Unified CM 11.5
• SFTP Server Side “/etc/ssh/sshd_config” can be modified to allow older CBC based Ciphers
Ciphers [email protected],aes128-ctr,aes192-ctr,aes256-ctr,aes128-
[email protected],[email protected],aes256-cbc,3des-cbc

• Some operations will cause increased CPU utilization and IOWait state. Use with caution
file get, file search, utils dbreplication, etc.

• Watch out for impact of show tech commands. Read documentation first before trying them
show tech all, show tech database, show tech routeplan, etc.

• CTRL + C can break out of many commands


• Some characters are not legal. When pressed you won’t see anything on the screen
Semicolon (;) or backtick (`) or pipe (|) or ampersand (&)

• DNS reverse lookup failure or very high IOWait conditions could significantly delay CLI login times or prevent
logins
• Watch out for CSCuy82773 while logging in to the Platform CLI via the vSphere virtual machine console
• Reset CCMAdministrator/CallManager application password using CLI command
utils reset_application_ui_administrator_name
utils reset_application_ui_administrator_password
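Because the CLI silently ignores the illegal characters listed above, it can help to check a command or search string before pasting it into a session. The Python sketch below only encodes the four characters named on this slide; the constant and function names are hypothetical.

```python
# The four characters this slide lists as not legal in the platform CLI.
ILLEGAL_CLI_CHARS = frozenset(";`|&")


def find_illegal_cli_chars(text):
    """Return the distinct illegal characters found in text, sorted."""
    return sorted({ch for ch in text if ch in ILLEGAL_CLI_CHARS})


if __name__ == "__main__":
    for candidate in ("show status", "Reason: Q.850;cause=25"):
        bad = find_illegal_cli_chars(candidate)
        print(repr(candidate), "->", bad if bad else "ok")
```

One practical consequence: a search string containing a semicolon, such as the Q.850 Reason header, cannot be typed as-is at the CLI prompt.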

Best Practice

Unified CM Serviceability Reports Archive


• Data collected by primary/failover AMC service
• Reports are generated by Cisco Serviceability Reporter Service
• Should be activated on the Publisher node Only
• Reports are generated daily and each covers last 24 hours
• Accessible via Cisco Unified Serviceability > Tools > Serviceability Reports Archive
•Reports are generated at 12:30am by default. Set by Cisco
Serviceability Reporter Service Parameter RTMT Report Report Generation Time*
• Reports can be collected via RTMT or CLI
•Cisco Serviceability Reporter AlertReport, CallActivitiesReport, DeviceReport, PPRReport,
ServerReport, ServiceReport
file get activelog cm/report/rtmtreporter/* recurs
• Archive can keep up to 30 days
•Set by Cisco Serviceability Reporter Service Parameter
RTMT Report Deletion Age*

Unified CM Serviceability Reports Call Activities Report
• Call activity for the cluster
• Calls attempted
• Calls completed
• H323 gateways call activity for the cluster
• Calls attempted
• Calls completed
• MGCP gateways call activity for the cluster
• T1 CAS: calls completed
• PRI: calls completed
• FXS: calls completed
• FXO: calls completed
• MGCP gateways
• FXO: ports in service
• FXO: ports active
• FXS: ports in service
• FXS: ports active
• PRI: spans in service
• PRI: channels active
• T1 CAS: spans in service
• T1 CAS: channels active
• Trunk call activity for the cluster
• H323 trunks: calls attempted
• H323 trunks: calls completed
• SIP trunk: calls attempted
• SIP trunk: calls completed
Unified CM Serviceability Reports Alert Summary Report
• Number of alerts per severity for the cluster
• Severity—number of alerts
• Number of alerts per server
• Server—number of alerts
• Top 10 alerts in the cluster
• Alerts

Unified CM Serviceability Reports Device Statistics
Report
• Number of registered phones per server
• Servers, clusterwide
• Number of partially registered phones per server
• Servers, clusterwide
• Number of MGCP gateways registered in the cluster
• Cisco MGCP FXO gateways
• Cisco MGCP FXS gateways
• Cisco MGCP PRI gateways
• Cisco MGCP T1CAS gateways
• Number of H323 gateways in the cluster
• Number of trunks in the cluster
• H323 Trunks
• SIP Trunks

Unified CM Serviceability Reports Performance
Protection Statistics (1 of 2)
• Call activity for 172.18.106.58
• Calls attempted * hourly rate
• Calls completed * hourly rate
• Calls In progress

• Number of registered phones, MGCP gateway for 172.18.106.58


• Phones
• MGCP gateways

• System resource utilization for 172.18.106.58


• % CPU usage
• % Virtual memory usage
• % Hard disk usage of the common partition
• % Hard disk usage of the swap partition
• % Hard disk usage of the active partition
• % Hard disk usage of the inactive partition

This Report Is Generated per Server and Includes Last Seven Days of Performance Data

Unified CM Serviceability Reports Performance
Protection Statistics (2 of 2)
• Devices
• Number of IP phones 7212
• Number of Unity Connection ports 241
• Number of CTI ports 16
• Number of CTI route points 14
• Number of H323 clients 1
• Number of H323 gateways 4
• Number of MGCP gateways 12
• Number of MOH resources 3
• Number of MTP resources 12
• Number of CFB resources 14
• Dial plan
• Number of directory numbers/lines 2609
• Number of route patterns 57
• Number of translation patterns 34

Unified CM Serviceability Reports Service Statistics
Report
• Cisco CTI manager: number of open devices
• Servers
• Cisco CTI manager: number of open lines
• Servers
• Cisco TFTP: number of requests
• Server
• Cisco TFTP: number of aborted requests
• Server

Unified CM Serviceability Reports Server Report
• % CPU per server
• Servers
• % Virtual memory usage per server
• Servers
• %Hard disk usage of the common partition
per server
• Server

Unified CM Serviceability Reports Archive
Sample Alert Summary Report

Unified CM Serviceability Reports Archive
Sample Call Activities Report

Unified CM Serviceability Reports Archive
Sample Server Reports

Unified CM Serviceability Reports Archive
Sample Server Performance Protection Statistics

Cisco Unified Reporting
• Run reports from publisher node to quickly
diagnose common problems
• Reports to run before and after upgrades
Unified CM Data Inventory Summary
Unified CM Data Summary
Unified CM Cluster Overview
Unified CM Database Status
Unified CM Phones with Mismatched Load

• Unified CM data summary


• Could be used to take cluster size snapshot
• Traces to collect if there is a problem
• Cisco Unified Reporting Web Service
• Cisco Tomcat

Cisco Unified Reporting
• Samples From CM Cluster Overview Report:

Cisco Unified Reporting
• Samples From CM Device Distribution Summary Report:

Cisco Unified Reporting
• Samples From CM Database Status Report:

Troubleshooting Methodology
Problem Description
• First step: understand the problem you are troubleshooting
• Make the problem description as detailed as possible
• Stick to factual data and don’t jump to conclusions
• If multiple problems are reported, try to narrow your focus to one
problem at a time

Problem Description
Some Questions to Ask:
• What happened?
• Who did it happen to?
• When did it happen?
• What were you doing when it happened?
• What device were you using?
• What changed?
• Is it plugged in?

Problem Description
Egon: Don’t cross the streams.
Venkman: Why?
Egon: It would be bad.
Venkman: I’m fuzzy on the whole good/bad thing. What do you mean “bad”?
Egon: Try to imagine all life as you know it stopping instantaneously, and every
molecule in your body exploding at the speed of light.

Source: Ghostbusters, 1984
Problem Description
• Bad: “I was talking to someone and then the call went away”
• Good: “I received a call from Chuck Robbins at 1:52 p.m. on my DX80
TelePresence unit and about five minutes into the call, I could not hear him and
the video stopped.

When I hung up, the DX80 showed Telephony Service was unavailable for a
few seconds then came back. I was able to call Chuck back. Chuck mentioned
that he could hear me saying ‘Can you hear me?’ and could still see my video
at the time the problem happened, but I could not hear or see Chuck.”

Information Collection
• Time synchronization
• Trace configuration
• Trace collection

Time Synchronization
• Ensure all network devices and applications are using an authoritative time
source (NTP server)
• All Unified CM subscribers are synced to the clock of the publisher
• Sync the publisher to an NTP server from the Cisco Unified OS administration
GUI (Settings > NTP Servers)
admin:utils ntp status
ntpd (pid 20175) is running...

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*172.18.106.1    72.163.32.43     2 u  860 1024  377    0.571    0.111   0.089

synchronised to NTP server (172.18.106.1) at stratum 3
   time correct to within 48 ms
   polling server every 1024 s

Current time in UTC is : Wed Jun 29 17:55:13 UTC 2016
Current time in America/New_York is : Wed Jun 29 13:55:13 EDT 2016
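When the same check has to be repeated on every node, the key facts can be pulled out of the command output with a short script. The sketch below is a hypothetical helper (parse_ntp_status is not a Cisco tool) and assumes the output of utils ntp status has already been captured as text, for example from an SSH session log.

```python
import re


def parse_ntp_status(output: str) -> dict:
    """Extract the sync source, stratum and accuracy from `utils ntp status` text."""
    result = {}
    sync = re.search(r"synchronised to NTP server \(([^)]+)\) at stratum (\d+)", output)
    if sync:
        result["server"] = sync.group(1)
        result["stratum"] = int(sync.group(2))
    accuracy = re.search(r"time correct to within (\d+) ms", output)
    if accuracy:
        result["accuracy_ms"] = int(accuracy.group(1))
    return result


# Sample text copied from the output shown on this slide
sample = """ntpd (pid 20175) is running...
synchronised to NTP server (172.18.106.1) at stratum 3
   time correct to within 48 ms
   polling server every 1024 s"""
status = parse_ntp_status(sample)
print(status)
```

An empty result means the node did not report a sync source, which is worth checking before relying on trace timestamps from that node.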

Trace Configuration
• Unified CM 9.0 and later combines SDI and SDL traces into the SDL
trace files and sets the default trace level to Detailed (on new installations)
• Cisco CallManager service trace files (SDL traces) are needed for the
majority of issues
• Trace levels must be set properly before a problem occurs
• Configured from Cisco Unified Serviceability > Trace > Configuration
or by using AnalysisManager
• For pre-9.x systems, look in SDI trace files, not SDL.

Trace Configuration

1. Select the Server
2. Select Service Group
3. Select the Service on Which Trace Needs to Be Enabled

Trace Configuration
1. Press Set Default
Updates All Servers in this cluster with these settings
2. Set to Detailed

Trace Configuration
• Can Also Use the Troubleshooting Trace Settings Page in Cisco Unified
Serviceability (Trace > Troubleshooting Trace Settings)

Trace Collection
Various Ways to Collect Trace Files
• RTMT Collect Files
• RTMT Analysis Manager
• RTMT Remote Browse
• RTMT Query Wizard
• OS CLI (file get or file tail)

Troubleshooting Case Studies
Case Study 1: Dropped Call
Problem Description
• “A user’s call was dropped”
• What kind of questions would you ask to get additional data?

Case Study 1: Questions to Ask
Questions to Ask About “A User’s Call Was Dropped”
• Who was the user?
• Chuck Robbins
• What is the directory number on their phone?
• 85551001
• What is the MAC address / device name of their phone?
• SEP00270DBF5B58
• What time did the dropped call occur?
• At 11:43 a.m.

Case Study 1: Questions to Ask
Questions to Ask About “A User’s Call Was Dropped”
• Who was the user speaking on the call that was dropped (internal vs. external)?
• External—phone number (919) 555-7285
• Was the call inbound or outbound?
• Inbound
• What time was the call placed/received?
• About two minutes before the call was dropped

Case Study 1: Problem Description
Formulate a Problem Description
• Chuck Robbins received a call around 11:41 a.m. on 06/29/16 from
(919) 555-7285. He received the call on extension 85551001 on the phone
identified as SEP00270DBF5B58. About two minutes into the call, the call was
dropped.

Case Study 1: Finding the Dropped Call
How Do We Find this Call in the Trace Files?
Our Three Options Are:
• Search for everything that happened on device SEP00270DBF5B58 at the time
of the problem
• Search for calls to extension 85551001
• Search for calls from (919) 555-7285

Case Study 1: Finding the Dropped Call
• In Unified CM 9.0 and later:
  • We will be searching only through the CallManager SDL trace files
  • activelog cm/trace/ccm/sdl
• In Unified CM 8.6 and earlier:
  • We will be searching through the CallManager CCM (SDI) trace files
  • activelog cm/trace/ccm/sdi
  • We will use SDL trace files to help us correlate some of the
    information in the CCM trace files
  • activelog cm/trace/ccm/sdl

Case Study 1: SCCP Trace Data
SCCP Trace Data in a CCM Trace (9.x and later)

39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent
softKeyEvent=11(Answer) lineInstance=1 callReference=63664372.

Field Name              Description
Line Number             SDL Trace Line Number (and sub-line number)
Timestamp               Time the Event Occurred
SCCP Message Direction  StationInit = SCCP Device → Unified CM
                        StationD = Unified CM → SCCP Device
TCP Handle              Unique Identifier for a Device Registered to a Unified CM Server
SCCP Message Name       The type of message being sent/received
SCCP Message Data       Additional data related to the message
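The field layout above maps directly onto a regular expression. The following is a minimal sketch, not a Cisco tool: parse_sccp_line and the group names are hypothetical, and it assumes each trace entry sits on a single line, as it does in the raw SDL file (the slide wraps it for display).

```python
import re

# One SCCP AppInfo entry: line number | timestamp | AppInfo | direction: (handle) message data
SCCP_LINE = re.compile(
    r"^(?P<line>\d+\.\d+)\s*\|(?P<time>[\d:.]+)\s*\|AppInfo\s*\|"
    r"(?P<direction>StationInit|StationD):\s*\((?P<handle>\d+)\)\s*"
    r"(?P<message>\w+)\s*(?P<data>.*)$"
)


def parse_sccp_line(text):
    """Return the fields of an SCCP trace line as a dict, or None if it is not one."""
    match = SCCP_LINE.match(text.strip())
    return match.groupdict() if match else None


sample = (
    "39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent "
    "softKeyEvent=11(Answer) lineInstance=1 callReference=63664372."
)
fields = parse_sccp_line(sample)
print(fields["direction"], fields["handle"], fields["message"])
```

Filtering a whole file on the handle field gives roughly the same view as filtering by TCP handle in a trace viewer.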

Case Study 1: Device Name to TCP Handle
Correlating a Device Name to TCP Handle (9.x and later)
• For 9.x SDL trace, look at the correlation data for a StationInit signal:

39912282.000 |11:40:23.128 |SdlSig |SdlDataInd |wait
|StationInit(1,100,62,1) |SdlTCPConnection(1,100,14,1062195)
|1,100,14,1062195.2333^172.18.159.160^SEP00270DBF5B58 |*TraceFlagOverrode

39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent
softKeyEvent=11(Answer) lineInstance=1 callReference=63664372.
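The correlation works because the SdlDataInd signal and the AppInfo entry that follows share the same trace line number before the decimal point. A minimal sketch of that lookup, assuming each entry is one line in the raw file; find_tcp_handle is a hypothetical helper name:

```python
import re


def find_tcp_handle(lines, device_name):
    """Return the TCP handle that follows an SdlDataInd signal for device_name."""
    pending = None  # line number (before the decimal point) of the matching signal
    for line in lines:
        number = line.split("|", 1)[0].strip().split(".")[0]
        if "|SdlDataInd" in line and "^" + device_name in line:
            pending = number
            continue
        if pending is not None and number == pending:
            match = re.search(r"StationInit: \((\d+)\)", line)
            if match:
                return match.group(1)
    return None


trace = [
    "39912282.000 |11:40:23.128 |SdlSig |SdlDataInd |wait |StationInit(1,100,62,1) "
    "|SdlTCPConnection(1,100,14,1062195) "
    "|1,100,14,1062195.2333^172.18.159.160^SEP00270DBF5B58 |*TraceFlagOverrode",
    "39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent "
    "softKeyEvent=11(Answer) lineInstance=1 callReference=63664372.",
]
handle = find_tcp_handle(trace, "SEP00270DBF5B58")
print(handle)
```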

Case Study 1: Digit Analysis Results
Finding a Call in an SDL Trace
• Look for a digit analysis result:
00172610.007 |11:40:19.950 |AppInfo |Digit analysis: analysis results
00172610.008 |11:40:19.950 |AppInfo ||PretransformCallingPartyNumber=9195557285
|CallingPartyNumber=9195557285
|DialingPartition=1stLine
|DialingPattern=85551001
|FullyQualifiedCalledPartyNumber=+14085264000
|DialingPatternRegularExpression=(85551001)
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(0,0,0)
|PretransformDigitString=85551001
|PretransformTagsList=SUBSCRIBER
|PretransformPositionalMatchList=85551001
|CollectedDigits=85551001
|UnconsumedDigits=
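A digit analysis result is a run of |Key=Value pairs, so it converts cleanly into a dictionary for searching or comparison. This is a sketch under the assumption that the whole result has been copied into one string; parse_digit_analysis is a hypothetical helper.

```python
def parse_digit_analysis(block: str) -> dict:
    """Turn the |Key=Value pairs of a digit analysis result into a dict."""
    result = {}
    for part in block.split("|"):
        key, separator, value = part.strip().partition("=")
        # Skip the line number, timestamp and AppInfo columns, which have no "="
        if separator and key and " " not in key:
            result[key] = value
    return result


sample = (
    "00172610.008 |11:40:19.950 |AppInfo ||PretransformCallingPartyNumber=9195557285\n"
    "|CallingPartyNumber=9195557285\n"
    "|DialingPartition=1stLine\n"
    "|DialingPattern=85551001\n"
    "|PotentialMatches=NoPotentialMatchesExist\n"
    "|CollectedDigits=85551001\n"
    "|UnconsumedDigits="
)
analysis = parse_digit_analysis(sample)
print(analysis["CallingPartyNumber"], "->", analysis["DialingPattern"])
```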
Case Study 1: Trace Searching Tools
What Do You Use to Search Through Files?
• Platform CLI ‘file search’ command
• RTMT query wizard
• WinGREP (Windows) (http://www.wingrep.com/)
• TextWrangler (MacOS X) (Apple App Store or barebones.com)
• grep / zgrep
• Cisco Voice Log Translator (VLT)
• TranslatorX (translatorx.org)
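Where none of these tools is at hand, a few lines of script can do the same job as grep / zgrep over a folder of collected trace files. The sketch below is an assumption-laden illustration: search_traces is a hypothetical helper, the file name in the demo is made up, and it assumes the collected files are plain text or gzip-compressed text.

```python
import gzip
import pathlib
import tempfile


def search_traces(directory, needle):
    """Return (file name, line number, line) for every line containing needle."""
    hits = []
    for path in sorted(pathlib.Path(directory).glob("*")):
        if not path.is_file():
            continue
        # Compressed trace files are read through gzip, everything else as plain text
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", errors="replace") as handle:
            for number, line in enumerate(handle, 1):
                if needle in line:
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits


# Demo against a throwaway directory holding one compressed file (hypothetical name)
with tempfile.TemporaryDirectory() as tmp:
    with gzip.open(pathlib.Path(tmp) / "sample_trace.txt.gz", "wt") as handle:
        handle.write("unrelated line\n")
        handle.write("|1,100,14,1062195.2333^172.18.159.160^SEP00270DBF5B58 |*TraceFlagOverrode\n")
    demo_hits = search_traces(tmp, "SEP00270DBF5B58")
print(demo_hits)
```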

Case Study 1: Downloading Tools
• To download TranslatorX, go to https://translatorx.org and click Downloads

Case Study 1: Finding the Call
Find the TCP Handle for Chuck Robbins’ Phone SEP00270DBF5B58
• Pick any trace file from the server the phone is registered to and search for
SEP00270DBF5B58 looking for a StationInit message
• Found the following lines:
39912282.000 |11:40:23.128 |SdlSig |SdlDataInd |wait
|StationInit(1,100,62,1) |SdlTCPConnection(1,100,14,1062195)
|1,100,14,1062195.2333^172.18.159.160^SEP00270DBF5B58 |*TraceFlagOverrode

39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent
softKeyEvent=11(Answer) lineInstance=1 callReference=63664372.

• TCP handle is (0000005)

Case Study 1: SCCP Messages
39912240.001 |11:40:19.954 |AppInfo |StationD: (0000005) CallState callState=4 lineInstance=1 callReference=63664372...
39912241.001 |11:40:19.954 |AppInfo |StationD: (0000005) SelectSoftKeys instance=1 reference=63664372...
39912242.001 |11:40:19.954 |AppInfo |StationD: (0000005) DisplayPromptStatus timeOut=0 Status='?9195557285'...
39912243.001 |11:40:19.954 |AppInfo |StationD: (0000005) DisplayPriNotify timeOutValue=10 pri=5 notify='?9195557285'...
39912244.001 |11:40:19.954 |AppInfo |StationD: (0000005) CallInfo callingPartyName='' callingParty=9195557285 ...
39912245.001 |11:40:19.954 |AppInfo |StationD: (0000005) SetLamp mode=5, stim=9 stimInst=1.
39912249.001 |11:40:19.954 |AppInfo |StationD: (0000005) SetRinger ringMode=3(OutsideRing).
39912282.001 |11:40:23.128 |AppInfo |StationInit: (0000005) SoftKeyEvent softKeyEvent=11(Answer) lineInstance=1 ...
39912285.001 |11:40:23.128 |AppInfo |StationD: (0000005) SetRinger ringMode=1(RingOff).
39912286.001 |11:40:23.128 |AppInfo |StationD: (0000005) SetSpeakerMode speakermode=1(On).
39912288.001 |11:40:23.128 |AppInfo |StationD: (0000005) SetLamp mode=2, stim=9 stimInst=1.
39912292.001 |11:40:23.128 |AppInfo |StationD: (0000005) CallState callState=1 lineInstance=1 callReference=63664372...
39912295.001 |11:40:23.128 |AppInfo |StationD: (0000005) ActivateCallPlane lineInstance=1.
39912299.001 |11:40:23.129 |AppInfo |StationD: (0000005) SetRinger ringMode=1(RingOff).
39912312.001 |11:40:23.136 |AppInfo |StationD: (0000005) StopTone.
39912313.001 |11:40:23.136 |AppInfo |StationD: (0000005) CallState callState=5 lineInstance=1 callReference=63664372...
39912314.001 |11:40:23.136 |AppInfo |StationD: (0000005) SelectSoftKeys instance=1 reference=63664372...
39912315.001 |11:40:23.136 |AppInfo |StationD: (0000005) DisplayPromptStatus timeOut=0 Status='?' content='Connected'...
39912326.001 |11:40:23.153 |AppInfo |StationD: (0000005) StopTone.
39912327.002 |11:40:23.153 |AppInfo |StationD: (0000005) OpenReceiveChannel conferenceID=63664372...
39912330.001 |11:40:23.155 |AppInfo |StationD: (0000005) startMediaTransmission conferenceID=63664372...
39912331.001 |11:40:23.236 |AppInfo |StationInit: (0000005) OpenReceiveChannelAck Status=0, IpAddr=IpAddr.type:0...
39912339.001 |11:40:23.240 |AppInfo |StationD: (0000005) CallInfo callingPartyName='' callingParty=9195557285...

Case Study 1: SCCP Call States

1  Off hook      8  Hold
2  On hook       9  Call waiting
3  Ring out     10  Call transfer
4  Ring in      11  Call park
5  Connected    12  Proceed
6  Busy         13  Call remote multiline
7  Congestion   14  Invalid number
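The table above can be kept as a lookup so that callState values in trace lines are translated automatically. A minimal sketch; describe_call_state is a hypothetical helper and the state names are taken from this slide.

```python
import re

SCCP_CALL_STATES = {
    1: "Off hook", 2: "On hook", 3: "Ring out", 4: "Ring in", 5: "Connected",
    6: "Busy", 7: "Congestion", 8: "Hold", 9: "Call waiting", 10: "Call transfer",
    11: "Call park", 12: "Proceed", 13: "Call remote multiline", 14: "Invalid number",
}


def describe_call_state(trace_line):
    """Return the name of the callState value found in an SCCP CallState line."""
    match = re.search(r"callState=(\d+)", trace_line)
    if match is None:
        return None
    code = int(match.group(1))
    return SCCP_CALL_STATES.get(code, f"Unknown ({code})")


# The three CallState messages sent to the phone during call setup in this case study
states = [
    describe_call_state("StationD: (0000005) CallState callState=4 lineInstance=1"),
    describe_call_state("StationD: (0000005) CallState callState=1 lineInstance=1"),
    describe_call_state("StationD: (0000005) CallState callState=5 lineInstance=1"),
]
print(states)
```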

Case Study 1: Finding the Call
• Find all activity around 11:41 a.m. for TCP Handle (0000005)
• Once you have found a message, click on it and filter by TCP handle (Control-T)

Case Study 1: Finding the Call
New in TranslatorX 11.5
• Find all activity around 11:41 a.m. for device SEP00270DBF5B58
• Once you have found a message, click on it and filter by TCP handle (Control-T)

Case Study 1: SCCP CallInfo Message
Use Call Info Message to Find Information About This Call
39912339.001 |11:40:23.240 |AppInfo |StationD: (0000005) CallInfo callingPartyName=''
callingParty=9195557285 cgpnVoiceMailbox= alternateCallingParty= 9195557285
calledPartyName='Chuck Robbins' calledParty=85551001 cdpnVoiceMailbox=
originalCalledPartyName='Chuck Robbins' originalCalledParty=85551001 originalCdpnVoiceMailbox=
originalCdpnRedirectReason=0 lastRedirectingPartyName='Chuck Robbins' lastRedirectingParty=85551001
lastRedirectingVoiceMailbox= lastRedirectingReason=0 callType=1(InBound) lineInstance=1
callReference=63664372. version: 8570000c

• Inbound Call
• To Chuck Robbins
• Extension 85551001
• Calling Party Number is 9195557285
• At around 11:41 a.m.

Case Study 1: Searching for Calling Number
• Can also find the call by searching for 9195557285

Case Study 1: Finding Originating Device
Where Did This Call Come From?
• Look immediately above the first messages sent to the phone in relation to this
call to see if there is an inbound gateway call
• If you do not see the digit analysis results for this call in the trace file, the call
must have originated from some other node in the cluster
• For 9.x and later, look for SdlSig-O in same SDL trace file
• Pre 9.x, use the SDL trace to help you find which server in the cluster (node) the
call originated on
• NOTE: SIP Session-ID can help correlate SIP calls – more on this later

Case Study 1: Finding Originating Node
Searching SDL Trace to find Originating Node
• The first message to Chuck Robbins’ phone is at timestamp 11:40:19.954, so look
for a CcSetupReq signal from another node just before that time.

Case Study 1: SDL Trace File Definitions
SDL Signal Trace Line Example:
39912223.000 |11:40:19.952 |SdlSig-I |CcSetupReq |restart0 |LineControl(1,100,174,10)
|Cdcc(3,100,219,8) |3,100,14,23.2^172.18.106.231^* |[R:N-H:0,N:0,L:0,V:0,Z:0,D:0] CI=63664372…

Field Name                 Description
Line Number                Continuously Incremented Across Files. Related trace lines increment
                           number after decimal point.
Date and Time              Date and Time the Event Occurred
SDL Operation              Indicates if the Signal Is Local to the Server (SdlSig), Inbound from
                           Another Node in the Cluster (SdlSig-I), or Out to Another Node in the
                           Cluster (SdlSig-O). AppInfo indicates SDI trace data in the SDL trace.
SDL Signal Name            The Signal that Is Being Sent from Source Process to Destination Process
Destination Process State  Current State of the Destination Process
Destination Process        The Name and Process ID of the Destination Process
Source Process             The Name and Process ID of the Source Process
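Because the fields are pipe-delimited and always appear in this order, a signal line can be split into named fields positionally. A minimal sketch, assuming the entry is on one line in the raw file; parse_sdl_signal and the key names are hypothetical.

```python
SDL_FIELDS = [
    "line_number", "timestamp", "operation", "signal",
    "destination_state", "destination", "source",
]


def parse_sdl_signal(line):
    """Map the first seven pipe-delimited columns of an SDL signal line to names."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) < len(SDL_FIELDS) or not parts[2].startswith("SdlSig"):
        return None
    return dict(zip(SDL_FIELDS, parts))


sample = (
    "39912223.000 |11:40:19.952 |SdlSig-I |CcSetupReq |restart0 "
    "|LineControl(1,100,174,10) |Cdcc(3,100,219,8) "
    "|3,100,14,23.2^172.18.106.231^* |[R:N-H:0,N:0,L:0,V:0,Z:0,D:0] CI=63664372"
)
signal = parse_sdl_signal(sample)
print(signal["operation"], signal["signal"], signal["source"], "->", signal["destination"])
```

Selecting entries whose operation is SdlSig-I or SdlSig-O is a quick way to list every signal that crossed between nodes.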

Case Study 1: SDL Trace File Definitions
What Does Cdcc(3,100,219,8) Mean?

Field Name        Description
Node ID           Node in the cluster where this process exists
Application ID    100 = CallManager, 200 = CTIManager
Process ID        In this case 219 means Cdcc. Process IDs are assigned at runtime and may not be
                  the same from one CallManager Service restart to another.
Process Instance  The Instance ID of this Process. In this Case this Is the 8th Cdcc Process that has
                  been created on this server.
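The four numbers can be unpacked programmatically when many signals need to be grouped by node. A minimal sketch; SdlProcess and parse_process are hypothetical names, and identifiers that contain wildcards such as CTIHandler(1,200,*,*) are deliberately rejected.

```python
import re
from typing import NamedTuple

APPLICATIONS = {100: "CallManager", 200: "CTIManager"}


class SdlProcess(NamedTuple):
    name: str
    node: int
    application: int
    process: int
    instance: int


def parse_process(text: str) -> SdlProcess:
    """Split an identifier such as Cdcc(3,100,219,8) into its four numeric parts."""
    match = re.fullmatch(r"(\w+)\((\d+),(\d+),(\d+),(\d+)\)", text.strip())
    if match is None:
        raise ValueError(f"not a fully numeric SDL process identifier: {text!r}")
    name, node, application, process, instance = match.groups()
    return SdlProcess(name, int(node), int(application), int(process), int(instance))


source = parse_process("Cdcc(3,100,219,8)")
print(source.name, "on node", source.node, "in", APPLICATIONS[source.application])
```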

Case Study 1: Finding SDL Node ID
Node ID Is Found Under System > Cisco Unified CM

Case Study 1: Finding SDL Node ID
Can Run an SQL Query to Find Node ID

admin:run sql select name, description, ctiid from callmanager order by ctiid
name description ctiid
=========== ==================== =====
CM_VNT-CM1A VNT-CM1A - Publisher 1
CM_VNT-CM1B VNT-CM1B 2
CM_VNT-CM1C VNT-CM1C 3

Case Study 1: Finding Originating Node
Going Back to the SDL Trace Line
39912223.000 |11:40:19.952 |SdlSig-I |CcSetupReq |restart0
|LineControl(1,100,174,10) |Cdcc(3,100,219,8)

• Cdcc instance 8 on node 3 sent LineControl instance 10 on node 1 a
CcSetupReq signal
• This means the call originated on node 3
• Look in the SDL trace on node 3 to find the matching trace line
00172622.000 |11:40:19.951 |SdlSig-O |CcSetupReq |NA RemoteSignal
|LineControl(1,100,174,10) |Cdcc(3,100,219,8)

• See what happens just before this event

Case Study 1: Found Digit Analysis Results
CCM trace at 11:40:19.950
00172610.007 |11:40:19.950 |AppInfo |Digit analysis: analysis results
00172610.008 |11:40:19.950 |AppInfo ||PretransformCallingPartyNumber=9195557285
|CallingPartyNumber=9195557285
|DialingPartition=1stLine
|DialingPattern=85551001
|FullyQualifiedCalledPartyNumber=+14085264000
|DialingPatternRegularExpression=(85551001)
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(0,0,0)
|PretransformDigitString=85551001
|PretransformTagsList=SUBSCRIBER
|PretransformPositionalMatchList=85551001
|CollectedDigits=85551001
|UnconsumedDigits=

Case Study 1: Found Originating SETUP
• Look just before the digit analysis match and you see:
00172588.002 |11:40:19.943 |AppInfo |In Message -- H225SetupMsg -- Protocol= H225Protocol
00172588.003 |11:40:19.943 |AppInfo |Ie - H225BearerCapabilityIe -- IEData= 04 03 80 90 A3
00172588.004 |11:40:19.943 |AppInfo |Ie - H225CallingPartyIe -- IEData= 6C 0C 21 83 39 31 39 35 35 35 37 32 38 35
00172588.005 |11:40:19.943 |AppInfo |Ie - Q931CalledPartyIe -- IEData= 70 09 C1 38 35 35 35 31 30 30 31
00172588.006 |11:40:19.943 |AppInfo |Ie - H225UserUserIe -- IEData= 7E 03 00 05 20 80 06 00 08 91 4A 00 04 28 00 B5
00 00 12 40 01 3C 05 01 00 00 9E 8C 97 9A 3D 46 11 E6 B6 8D C4 7D 4F B6 1B 00 00 CD 1D 82 80 07 00 AC 12 6A E7
45 5A 11 00 9E 8D 33 C2 3D 46 11 E6 87 32 C5 E7 C3 00 17 06 80 E7 08 13 00 00 00 0C 60 13 80 0B 05 00 01 00 AC…
00172588.007 |11:40:19.943 |AppInfo |MMan_Id= 0. (iep= 0 dsl= 0 sapi= 0 ces= 0 IpAddr=e76a12ac IpPort=17754)
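IpAddr=e76a12ac holds the source IP address in hexadecimal with the bytes in reverse order.
Reading the bytes from last to first:
AC = 172
12 = 18
6A = 106
E7 = 231

The source of the SETUP is 172.18.106.231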
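The conversion above is easy to get wrong by hand. A minimal sketch of the same arithmetic, assuming the IpAddr field always holds an IPv4 address as eight hex digits in reversed byte order, as it does in this trace; decode_trace_ip is a hypothetical helper name.

```python
import ipaddress


def decode_trace_ip(hex_value: str) -> str:
    """Convert a byte-reversed hex address such as e76a12ac to dotted decimal."""
    raw = bytes.fromhex(hex_value)
    if len(raw) != 4:
        raise ValueError("expected exactly four bytes of hex")
    # Reverse the byte order, then let ipaddress format the packed address
    return str(ipaddress.IPv4Address(raw[::-1]))


gateway = decode_trace_ip("e76a12ac")
print(gateway)
```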

Case Study 1: Decoding H.225 Messages
Open the Trace Files from Node 3 in TranslatorX

Inbound H225 SETUP message from 172.18.106.231 at timestamp 06/29/2016 11:40:19.943

SETUP, pd = 8, callref = 0x000A, Message Size = 662 bytes

Bearer Capability i = 0x8090A3, ITU-T standard, Speech, Circuit mode, 64k, A-law
Calling Party Number i = '9195557285' - Plan: ISDN, Type: National,
Presentation Allowed, Network provided
Called Party Number i = '85551001' - Plan: ISDN, Type: Subscriber
User-User, i =
0x052080060008914A00042800B500001240013C050100009E8C979A3D4611E6B68DC47D4
FB61B0000CD1D82800700AC126AE7455A11009E8D33C23D4611E68732C5E7C300170680
E708130000000C6013800B05000100AC126AE75917001E400000060401004C6013801215…
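The calling and called party lines in this decode can be reproduced from the raw IEData bytes in the trace. The sketch below follows the general Q.931 number IE layout (identifier, length, type/plan octet, optional presentation octet, then ASCII digits) and is only an illustration: decode_party_number is a hypothetical helper and the lookup tables list just the common values.

```python
TYPE_OF_NUMBER = {0: "Unknown", 1: "International", 2: "National",
                  3: "Network specific", 4: "Subscriber", 6: "Abbreviated"}
NUMBERING_PLAN = {0: "Unknown", 1: "ISDN", 3: "Data", 4: "Telex",
                  8: "National", 9: "Private"}


def decode_party_number(ie_hex: str) -> dict:
    """Decode a calling (0x6C) or called (0x70) party number IE from trace hex."""
    data = bytes.fromhex(ie_hex)
    length = data[1]
    body = data[2:2 + length]
    first = body[0]
    # If the extension bit (0x80) is clear, a presentation/screening octet follows
    header_length = 1 if first & 0x80 else 2
    return {
        "type": TYPE_OF_NUMBER.get((first >> 4) & 0x07, "Reserved"),
        "plan": NUMBERING_PLAN.get(first & 0x0F, "Reserved"),
        "digits": body[header_length:].decode("ascii"),
    }


calling = decode_party_number("6C 0C 21 83 39 31 39 35 35 35 37 32 38 35")
called = decode_party_number("70 09 C1 38 35 35 35 31 30 30 31")
print(calling)
print(called)
```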

Case Study 1: Call Setup
Call Setup Signaling
PSTN (919) 555-7285 → Q.931 Setup → H.323 GW (172.18.106.231)
H.323 GW (172.18.106.231) → H.225 Setup (over the IP network) → UCM Node 3
UCM Node 3 → CcSetupReq (within the UCM cluster) → UCM Node 1
UCM Node 1 → SetRinger ringMode = 3 (OutsideRing) → IP phone

Case Study 1: Call Disconnected at Gateway
Filter the Call by Call Reference to see all messages about this call
• Inbound call originated at 11:40:19.943 and connected at 11:40:23.138
• Call was disconnected at 11:43:05.269

RELEASE_COMP, pd = 8, callref = 0x800A, Message Size = 46 bytes
Cause i = 0x80A9 - Temporary failure

• Now we know Unified CM sent a disconnect with a cause code of temporary
failure at 11:43:05.269, but why?
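The cause value is carried in the low seven bits of the second octet of the Cause information element, so 0xA9 corresponds to cause 41. A minimal sketch of that decode, assuming the two-octet form shown here (no optional recommendation octet); decode_cause is a hypothetical helper and the table lists only a handful of common Q.850 values.

```python
Q850_CAUSES = {
    1: "Unallocated number",
    16: "Normal call clearing",
    17: "User busy",
    34: "No circuit/channel available",
    41: "Temporary failure",
    47: "Resource unavailable, unspecified",
}


def decode_cause(ie_value: int):
    """Return (cause value, description) for a two-octet Cause IE such as 0x80A9."""
    octets = ie_value.to_bytes(2, "big")
    cause = octets[1] & 0x7F  # strip the extension bit from the second octet
    return cause, Q850_CAUSES.get(cause, "Not in this table")


cause = decode_cause(0x80A9)
print(cause)
```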

Case Study 1: Call Dropped on IP Phone
Go Back to the IP Phone to See What Happened From the User’s Perspective
• Unified CM sends a SelectSoftKeys and DisplayPromptStatus
message at 11:42:57.606. Click on DisplayPromptStatus to see what
the message sent to the phone was.

StationD: (0000005) DisplayPromptStatus timeOut=0 Status='�#'
content='Temporary failure' line=1 CI=63664372 ver=8570000c.

Case Study 1: Call Disconnected
Call Being Disconnected

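UCM Node 1 → DisplayPromptStatus ‘Temporary Failure’ → IP phone
UCM Node 3 → H.225 Release Complete (over the IP network) → H.323 GW
H.323 GW → Q.931 Disconnect → PSTN (919) 555-7285
UCM Node 3 ? UCM Node 1 (what happened between the two nodes in the cluster is still unknown)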

Case Study 1: SDL Link OOS
What Happened Between Node 1 and Node 3?
• Look at the CCM trace on Node 1 right before Unified CM tells the phone about
the failure at 11:42:57.606
39913061.003 |11:42:57.602 |AppInfo |SDLLinkOOS - SDL link to remote application is out of service Local
Node ID:1 Local Application ID:100 Remote Application IP Address:10.122.249.15 Remote Node ID:3 Remote
Application ID:100 Unique Link ID:1:100:3:100 App ID:Cisco CallManager Cluster ID:StandAloneCluster Node
ID:collab-ccie-cm2a

39913061.004 |11:42:57.603 |AlarmErr |AlarmClass: CallManager, AlarmName: SDLLinkOOS, AlarmSeverity:
Alert, AlarmMessage: , AlarmDescription: SDL link to remote application is out of service, AlarmParameters:
LocalNodeId:1, LocalApplicationID:100, RemoteIPAddress:10.122.249.15, RemoteNodeID:3,
RemoteApplicationID:100, LinkID:1:100:3:100, AppID:Cisco CallManager, ClusterID:StandAloneCluster,
NodeID:collab-ccie-cm2a,

Case Study 1: SDL Link OOS
What Happened Between Node 1 and Node 3?
• Look at the CCM trace on node 3 right before Unified CM sends the RELEASE
COMPLETE to the gateway at 11:43:05.269

00173311.003 |11:43:05.262 |AppInfo |SDLLinkOOS - SDL link to remote application is out of service Local
Node ID:3 Local Application ID:100 Remote Application IP Address:10.81.98.205 Remote Node ID:1 Remote
Application ID:100 Unique Link ID:3:100:1:100 App ID:Cisco CallManager Cluster ID:StandAloneCluster Node
ID:collab-ccie-cm2c

00173311.004 |11:43:05.262 |AlarmErr |AlarmClass: CallManager, AlarmName: SDLLinkOOS, AlarmSeverity:
Alert, AlarmMessage: , AlarmDescription: SDL link to remote application is out of service, AlarmParameters:
LocalNodeId:3, LocalApplicationID:100, RemoteIPAddress:10.81.98.205, RemoteNodeID:1,
RemoteApplicationID:100, LinkID:3:100:1:100, AppID:Cisco CallManager, ClusterID:StandAloneCluster,
NodeID:collab-ccie-cm2c,

Case Study 1: SDL Links
What Is an SDL Link?
• Fully meshed TCP connections between all nodes in a Unified CM cluster
• Each server establishes a TCP connection on port 8002 to every other node
with a lower node ID than itself
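The "connect to every lower node ID" rule is enough to enumerate the full mesh for any cluster size. A minimal sketch for reasoning about which links should exist; sdl_links is a hypothetical helper and the node IDs are the three used in this case study.

```python
SDL_PORT = 8002


def sdl_links(node_ids):
    """Return (initiator, target) pairs: each node connects to every lower node ID."""
    ordered = sorted(node_ids)
    return [(source, target) for source in ordered for target in ordered if target < source]


links = sdl_links([1, 2, 3])
for source, target in links:
    print(f"node {source} -> node {target} on TCP port {SDL_PORT}")
```

For n nodes this yields n(n-1)/2 links, so a three-node cluster has three SDL links and an alarm naming any one of them points at a specific pair of servers.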

Case Study 1: SDL Link OOS
Why Would an SDL Link Go Out of Service?
• Server Hardware Failure / Power / CallManager Service restart
• IP connectivity issues
  • Duplex mismatch between Unified CM Server NIC and switch
  • Router or switch failure between Unified CM nodes / Routing problems
  • Cabling issues
  • Network congestion / Errors / Packet Loss
• CallManager Service blocked from processing signals on SDL Link
  • Overloaded Unified CM Node
  • High CPU due to other process on the system
  • High disk I/O / SAN Failure
  • Low memory (causing memory to swap to/from disk)
  • Hypervisor Host overloaded / Hypervisor blocking VM

Case Study 1: Proactive Alerts
Leverage Syslog / RTMT Alerts to receive Alerts / Alarms
• Alerts generated in Syslog on Node 1:
11:42:57.604 |SyslogSeverityMatchFound - The configured Syslog Alarm/message severity had matched
SeverityMatch:Alert MatchedEvent:Jun 29 11:42:57 collab-ccie-cm2a local7 1 ccm: 12:
collab-ccie-cm2a.cisco.com: Jun 29 2016 15:42:57.600 UTC : %UC_CALLMANAGER-1-SDLLinkOOS:
%[LocalNodeId=1][LocalApplicationID=100][RemoteIPAddress=10.122.249.15][RemoteNodeID=3][RemoteApplicationID=100][LinkID=1:100:3:100][AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=collab-ccie-cm2a]:
SDL link to remote application is out of service App ID:Cisco Syslog Agent Cluster ID: Node ID:collab-ccie-cm2a
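The bracketed [Key=Value] pairs in the alarm make it straightforward to extract which link failed when these messages arrive at a syslog collector. A minimal sketch; parse_sdl_link_oos is a hypothetical helper and assumes the message has been received as a single line.

```python
import re


def parse_sdl_link_oos(message: str):
    """Return the [Key=Value] parameters of an SDLLinkOOS alarm, or None."""
    if "SDLLinkOOS" not in message:
        return None
    return dict(re.findall(r"\[(\w+)=([^\]]*)\]", message))


sample = (
    "%UC_CALLMANAGER-1-SDLLinkOOS: %[LocalNodeId=1][LocalApplicationID=100]"
    "[RemoteIPAddress=10.122.249.15][RemoteNodeID=3][RemoteApplicationID=100]"
    "[LinkID=1:100:3:100][AppID=Cisco CallManager][ClusterID=StandAloneCluster]"
    "[NodeID=collab-ccie-cm2a]: SDL link to remote application is out of service"
)
alarm = parse_sdl_link_oos(sample)
print(f"node {alarm['LocalNodeId']} lost its SDL link to node {alarm['RemoteNodeID']}")
```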

Case Study 1: Proactive Alerts
Leverage Syslog / RTMT Alerts to receive Alerts / Alarms
• Alerts generated in Syslog on Node 2:
11:43:03.850 |SyslogSeverityMatchFound - The configured Syslog Alarm/message severity had matched
SeverityMatch:Alert MatchedEvent:Jun 29 11:43:03 collab-ccie-cm2b local7 1 ccm: 12:
collab-ccie-cm2b.cisco.com: Jun 29 2016 15:43:03.792 UTC : %UC_CALLMANAGER-1-SDLLinkOOS:
%[LocalNodeId=2][LocalApplicationID=100][RemoteIPAddress=10.122.249.15][RemoteNodeID=3][RemoteApplicationID=100][LinkID=2:100:3:100][AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=collab-ccie-cm2b]:
SDL link to remote application is out of service App ID:Cisco Syslog Agent Cluster ID: Node ID:collab-ccie-cm2b

Case Study 1: Proactive Alerts
Leverage Syslog / RTMT Alerts to receive Alerts / Alarms
• Alerts generated in Syslog on Node 3:
11:43:05.261 |SyslogSeverityMatchFound - The configured Syslog Alarm/message severity had matched
SeverityMatch:Alert MatchedEvent:Jun 29 11:43:05 collab-ccie-cm2c local7 1 ccm: 9:
collab-ccie-cm2c.cisco.com: Jun 29 2016 15:43:05.259 UTC : %UC_CALLMANAGER-1-SDLLinkOOS:
%[LocalNodeId=3][LocalApplicationID=100][RemoteIPAddress=10.81.98.206][RemoteNodeID=2][RemoteApplicationID=100][LinkID=3:100:2:100][AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=collab-ccie-cm2c]:
SDL link to remote application is out of service App ID:Cisco Syslog Agent Cluster ID: Node ID:collab-ccie-cm2c

11:43:06.362 |SyslogSeverityMatchFound - The configured Syslog Alarm/message severity had matched
SeverityMatch:Alert MatchedEvent:Jun 29 11:43:05 collab-ccie-cm2c local7 1 ccm: 10:
collab-ccie-cm2c.cisco.com: Jun 29 2016 15:43:05.263 UTC : %UC_CALLMANAGER-1-SDLLinkOOS:
%[LocalNodeId=3][LocalApplicationID=100][RemoteIPAddress=10.81.98.205][RemoteNodeID=1][RemoteApplicationID=100][LinkID=3:100:1:100][AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=collab-ccie-cm2c]:
SDL link to remote application is out of service App ID:Cisco Syslog Agent Cluster ID: Node ID:collab-ccie-cm2c

Case Study 1: SDL Link OOS
How do you prevent the call from being dropped?
• Enable “Allow Peer to Preserve H.323 Calls”

• Enable Call Preservation & Media Inactivity detection on the IOS gateway
voice service voip
 h323
  call preserve
gateway
 timer receive-rtcp 1200

Case Study 2: Delayed Audio Cut-Through
Problem Description
• User reports that occasionally they answer their phone and are not able to hear
the first few seconds of audio after going off hook
• One user, phone SEP003094C3A22F with extension 15593, reported the
problem occurred on 4/19/15 at 1:07 p.m. (per the received calls directory on the
phone). She answered the phone and heard nothing for about 3–4 seconds.
There was no caller ID for the call.

Case Study 2: Finding the Failed Call
• Gather the CCM and SDL trace files from 1:05 p.m. to 1:10 p.m.
• We do not know the calling party number, so the only way to find the call is by
searching for what happened on SEP003094C3A22F around 1:07 p.m.
• Find the TCP handle of SEP003094C3A22F, or search for SEP003094C3A22F in
TranslatorX 11.5 or later

Case Study 2: SCCP Messages
• There appears to be a call that arrived at 1:07:18 and was answered (OffHook)
at 1:07:20

Case Study 2: Find Call Origination
Look for the Digit Analysis Match
04/19/2015 13:07:18.020 CCM|
|PretransformCallingPartyNumber=62003
|CallingPartyNumber=62003
|DialingPartition=Line1
|DialingPattern=15593
|DialingRoutePatternRegularExpression=(15593)
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(1,37,6)
|IndexOfAnalyzedPattern=0
|PretransformDigitString=15593
|PretransformTagsList=SUBSCRIBER
|PretransformPositionalMatchList=15593
|CollectedDigits=15593
|UnconsumedDigits=
|TagsList=SUBSCRIBER

Case Study 2: Find Call Origination
• Look in the SDL trace at the timestamp for the last digit analysis result
(13:07:18.020)

000005850| 15/04/19 13:07:18.020| 001| SdlSig-I | SsRedirectCallReq
| wait | Cc(1,100,13,1) | CTIHandler(1,200,*,*) |
(1,200,9,1).241-(vnt1-apps2-aa:14.84.203.20)| [NP - HP: 0, NP: 0, LP: 0, VLP: 0,
LZP: 0]SsType=33554450 SsKey=4 SsNodeId=0 SsParty=16777238
primaryPartitionSearchSpace= primaryDn=62003 secondaryPartitionSearchSpace
= secondaryDn =15593 maskedDisplayName=62003 SsOriginalRedirectReason=4
SsLastRedirectionReason=4 redirectIfSuccess=1 resetOriginalCalled=0

• This indicates a redirect from 62003 to 15593

Case Study 2: Find Call Origination
• The user said the call was from the PSTN; however, the digit analysis match shows the
call came from 62003. We need to find out what 62003 is.
• Unified CM Administration shows that 62003 is a CTI port
registered with an IPCC Express server
• Most likely the call originally came in to a CTI route point, which redirected it to
62003; 62003 then redirected the call to 15593
• We need to find the call that was originally redirected to 62003

Case Study 2: Find Call Origination
• Search backwards through trace for ‘DialingPattern=62003’ to find the digit
analysis result
04/19/2015 13:07:02.473 CCM|
|PretransformCallingPartyNumber=
|CallingPartyNumber=
|DialingPartition=Line1
|DialingPattern=62003
|DialingRoutePatternRegularExpression=(62003)
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(1,37,1)
|IndexOfAnalyzedPattern=0
|PretransformDigitString=62003
|PretransformTagsList=SUBSCRIBER
|PretransformPositionalMatchList=62003
|DisplayName=vnt1-apps2 port 03

Case Study 2: Find Call Origination
• Look in the SDL trace at the timestamp for the last digit analysis result
(13:07:02.473 )

000005906| 15/04/19 13:07:02.473| 001| SdlSig-I | SsRedirectCallReq
| wait | Cc(1,100,13,1) | CTIHandler(1,200,*,*) |
(1,200,9,1).241-(vnt1-apps2-aa:14.84.203.20)| [NP - HP: 0, NP: 0, LP: 0, VLP: 0,
LZP: 0]SsType=33554450 SsKey=4 SsNodeId=0 SsParty=16777238
primaryPartitionSearchSpace= primaryDn=15591 secondaryPartitionSearchSpace
= secondaryDn =62003 maskedDisplayName=15591 SsOriginalRedirectReason=4
SsLastRedirectionReason=4 redirectIfSuccess=1 resetOriginalCalled=0

• This indicates a redirect from 15591 to 62003
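
The redirect reading above can be automated when many SsRedirectCallReq entries have to be checked. The following is a minimal sketch, assuming (as the slides do) that primaryDn is the redirecting number and secondaryDn is the redirect target; the function name parse_redirect is hypothetical and not part of any Cisco tool.

```python
import re

def parse_redirect(trace_text: str):
    """Return (primaryDn, secondaryDn) from an SsRedirectCallReq entry, or None."""
    # SDL entries wrap across lines, so collapse all whitespace first.
    flat = " ".join(trace_text.split())
    primary = re.search(r"primaryDn\s*=\s*(\d+)", flat)
    secondary = re.search(r"secondaryDn\s*=\s*(\d+)", flat)
    if not (primary and secondary):
        return None
    return primary.group(1), secondary.group(1)

# Fields copied from the SDL trace entry shown above.
sample = """SsType=33554450 SsKey=4 SsNodeId=0 SsParty=16777238
primaryPartitionSearchSpace= primaryDn=15591 secondaryPartitionSearchSpace
= secondaryDn =62003 maskedDisplayName=15591"""

print(parse_redirect(sample))
```

Running the same function over the later 13:07:18 entry would return the 62003 to 15593 hop, so the redirect chain can be rebuilt hop by hop.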

Case Study 2: Find Call Origination
• Look to see if there is a digit analysis result for 15591
04/19/2015 13:07:02.473 CCM||PretransformCallingPartyNumber=
|CallingPartyNumber=
|DialingPartition=Line1
|DialingPattern=15591
|DialingRoutePatternRegularExpression=(15591)
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(1,75,6)
|IndexOfAnalyzedPattern=0
|PretransformDigitString=15591
|PretransformTagsList=SUBSCRIBER
|PretransformPositionalMatchList=15591
|CollectedDigits=15591
|UnconsumedDigits=
|TagsList=SUBSCRIBER
|PositionalMatchList=15591

Case Study 2: Find Call Origination
• Now find where that call originated
04/19/2015 13:07:02.458 CCM|In Message -- Pri5essSetupMsg -- Protocol= Pri5essProtocol
04/19/2015 13:07:02.458 CCM|Ie - Ni2BearerCapabilityIe -- IEData= 04 03 80 90 A2
04/19/2015 13:07:02.458 CCM|Ie - Q931ChannelIdIe -- IEData= 18 03 A1 83 8F
04/19/2015 13:07:02.458 CCM|Ie - Q931CalledPartyIe -- IEData= 70 06 A1 31 35 35 39 31

• Translated in TranslatorX
SETUP, pd = 8, callref = 0x2907
Bearer Capability i = 0x0800900A2, ITU-T standard, Speech, Circuit mode, 64k, mu-law
Channel ID i = 0x0A108308F, PRI interface, Preferred channel 15
Called Party Number i = '15591' - Plan: ISDN, Type: National
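
To show how the raw IEData bytes map to the translated output, here is a minimal sketch that decodes the Called Party Number information element by hand. It assumes the standard Q.931 layout (IE identifier 0x70, a length octet, one type/plan octet, then IA5 digits); the function name and lookup tables are illustrative only and cover just the common code points.

```python
TYPE_OF_NUMBER = {0: "Unknown", 1: "International", 2: "National",
                  3: "Network specific", 4: "Subscriber", 6: "Abbreviated"}
NUMBERING_PLAN = {0: "Unknown", 1: "ISDN", 3: "Data", 4: "Telex",
                  8: "National standard", 9: "Private"}

def decode_called_party(ie_hex: str) -> dict:
    """Decode a Q.931 Called Party Number IE given as space-separated hex."""
    data = bytes.fromhex(ie_hex)
    if data[0] != 0x70:
        raise ValueError("not a Called Party Number IE")
    length = data[1]
    body = data[2:2 + length]
    octet3 = body[0]
    type_bits = (octet3 >> 4) & 0x07   # bits 7-5: type of number
    plan_bits = octet3 & 0x0F          # bits 4-1: numbering plan
    return {
        "digits": body[1:].decode("ascii"),
        "type": TYPE_OF_NUMBER.get(type_bits, "Reserved"),
        "plan": NUMBERING_PLAN.get(plan_bits, "Reserved"),
    }

# IEData copied from the Q931CalledPartyIe trace line above.
print(decode_called_party("70 06 A1 31 35 35 39 31"))
```

The result agrees with the TranslatorX line: digits 15591, plan ISDN, type National.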

Case Study 2: Troubleshooting Summary
• So far we know:
• Call came in on MGCP gateway to CTI route point 15591
• 15591 redirected the call to 62003 at 13:07:02.473
• 62003 redirected the call to 15593 at 13:07:18.020
• 15593 answered the call at 13:07:20.942
• Now need to find out what happened to the audio stream when 15593 answered
the call

Case Study 2: SCCP and MGCP Signaling
At 13:07:18.176: StationD: 000000005 CallState callState=4 lineInstance=1 callReference=16777245
StationD: 000000005 CallInfo callingPartyName='vnt1-apps2 port 03' callingParty=62003
originalCalledParty=15593
StationD: 000000005 SetRinger ringMode=2(InsideRing)
StationD: 000000005 DisplayPromptStatus content='From 62003'

At 13:07:20.270: MGCPHandler send msg SUCCESSFULLY to: 14.84.11.4

MDCX 84 S1/DS1-0/15@vnt-3745-r105a MGCP 0.1
C: D0000000010000160000000080002907
I: 135
X: f
M: recvonly
R: D/[0-9ABCD*#]
Q: process,loop

MGCPHandler received msg from: 14.84.11.4


200 84 OK

Case Study 2: SCCP and MGCP Signaling
• At 13:07:20.286:
MGCPHandler send msg SUCCESSFULLY to: 14.84.11.4
MDCX 85 S1/DS1-0/15@vnt-3745-r105a MGCP 0.1
C: D0000000010000160000000080002907
I: 135
X: f
L: p:20, a:PCMU, s:off, t:b8
M: recvonly
R: D/[0-9ABCD*#]
Q: process,loop

v=0
o=- 309 0 IN EPN S1/DS1-0/15@vnt-3745-r105a
s=Cisco SDP 0
t=0 0
c=IN IP4 239.100.100.100
m=audio 16384 RTP/AVP 0

Case Study 2: SCCP and MGCP Signaling
• At 13:07:20.286:
StationInit: 000000005 OffHook.
StationD: 000000005 SetRinger ringMode=1(RingOff).
StationD: 000000005 CallState callState=1 lineInstance=1 callReference=16777245
StationD: 000000005 ActivateCallPlane lineInstance=1.
StationD: 000000005 StopTone.
StationD: 000000005 CallState callState=5 lineInstance=1 callReference=16777245
StationD: 000000005 CallInfo callingPartyName='vnt1-apps2 port 03' callingParty=62003
calledPartyName='' calledParty=15593
StationD: 000000005 DisplayPromptStatus content='Connected' lineInstance=1

• Then nothing happens for three seconds until the following:


13:07:23.301 CCM|MGCPHandler TransId: 85 Timeout. Retry#1
13:07:23.301 CCM|MGCPHandler received msg from: 14.84.11.4
200 85 OK

• Unified CM was waiting for the acknowledgement of MGCP message 85
(Modify Connection, MDCX)
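
The three-second gap can be confirmed directly from the timestamps. The sketch below, with hypothetical helper names, finds the MGCP timeout lines in a CCM trace excerpt and measures how long after the MDCX 85 transmission (13:07:20.286) the retry occurred. The regular expression is based only on the line format shown above and may need adjusting for other trace versions.

```python
import re
from datetime import datetime

RETRY = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) CCM\|MGCPHandler TransId: (\d+) Timeout\. Retry#(\d+)"
)

def find_mgcp_retries(trace: str):
    """Return (timestamp, transaction_id, retry_number) for each timeout line."""
    return [(ts, int(tid), int(n)) for ts, tid, n in RETRY.findall(trace)]

def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

# Lines copied from the trace excerpt above.
trace = """13:07:23.301 CCM|MGCPHandler TransId: 85 Timeout. Retry#1
13:07:23.301 CCM|MGCPHandler received msg from: 14.84.11.4
200 85 OK"""

retries = find_mgcp_retries(trace)
print(retries)
print(seconds_between("13:07:20.286", retries[0][0]))
```

A gap of about three seconds between the original send and the retry is the delay the user experiences before audio cuts through.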
Case Study 2: SCCP and MGCP Signaling
• At 13:07:23.317:

StationD: 000000005 CallInfo callingPartyName='' callingParty= calledParty=15593


StationD: 000000005 OpenReceiveChannel conferenceID=0 passThruPartyID=10000d1
millisecondPacketSize=20
compressionType=4(Media_Payload_G711Ulaw64k)
myIP: 65df540e (14.84.223.101)
StationD: 000000005 StartMediaTransmission conferenceID=0 passThruPartyID=10000d1
remoteIpAddress=40b540e(14.84.11.4) remotePortNumber=19154
milliSecondPacketSize=20
compressType=4(Media_Payload_G711Ulaw64k)
myIP: 65df540e (14.84.223.101)
StationInit: 000000005 OpenReceiveChannelAck Status=0, IpAddr=0x65df540e, Port=24866,
PartyID=16777425
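
The hexadecimal addresses in the SCCP trace can be checked against the dotted values printed next to them. This is a small sketch with a hypothetical helper name; it assumes, based on the pairs shown in the trace, that the hex value stores the address with its least significant byte first.

```python
def sccp_hex_to_ip(hex_value: str) -> str:
    """Convert a little-endian hex IP from an SCCP trace to dotted decimal."""
    raw = int(hex_value, 16).to_bytes(4, "little")
    return ".".join(str(octet) for octet in raw)

# Values copied from the StartMediaTransmission and OpenReceiveChannel lines.
print(sccp_hex_to_ip("40b540e"))     # remoteIpAddress
print(sccp_hex_to_ip("0x65df540e"))  # myIP / IpAddr in OpenReceiveChannelAck

# passThruPartyID is printed in hex, PartyID in the Ack in decimal.
print(int("10000d1", 16))
```

The same conversion shows that passThruPartyID=10000d1 and PartyID=16777425 refer to the same party, which ties the OpenReceiveChannelAck to the OpenReceiveChannel request.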

Case Study 2: SCCP and MGCP Signaling
• At 13:07:23.317:
MGCPHandler send msg SUCCESSFULLY to: 14.84.11.4
MDCX 86 S1/DS1-0/15@vnt-3745-r105a MGCP 0.1
C: D0000000010000160000000080002907
I: 135
X: f
L: p:20, a:PCMU, s:off, t:b8
M: sendrecv
R: D/[0-9ABCD*#]
S:
Q: process,loop

v=0
o=- 309 0 IN EPN S1/DS1-0/15@vnt-3745-r105a
s=Cisco SDP 0
t=0 0
c=IN IP4 14.84.223.101
m=audio 24866 RTP/AVP 0

04/19/2015 13:07:23.348 CCM|MGCPHandler received msg from: 14.84.11.4
200 86 OK
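
When comparing MDCX 85 and MDCX 86, the fields that matter are the connection mode (M:) and the media address and port in the SDP. The following is a rough sketch of a parser for the MGCP command text as it appears in the trace; the function name and the returned structure are assumptions made for illustration, and it is not a complete MGCP parser.

```python
def parse_mgcp_command(message: str) -> dict:
    """Split an MGCP command into verb, transaction ID, endpoint, parameters and SDP."""
    lines = [ln.strip() for ln in message.splitlines() if ln.strip()]
    verb, trans_id, endpoint = lines[0].split()[:3]
    params, sdp = {}, {}
    for ln in lines[1:]:
        if len(ln) > 1 and ln[1] == "=":
            # SDP line such as "c=IN IP4 14.84.223.101"
            sdp.setdefault(ln[0], []).append(ln[2:])
        elif ":" in ln:
            # MGCP parameter line such as "M: sendrecv"
            key, _, value = ln.partition(":")
            params[key.strip()] = value.strip()
    return {"verb": verb, "trans_id": int(trans_id), "endpoint": endpoint,
            "params": params, "sdp": sdp}

# MDCX 86 copied from the trace above.
mdcx86 = """MDCX 86 S1/DS1-0/15@vnt-3745-r105a MGCP 0.1
C: D0000000010000160000000080002907
I: 135
X: f
L: p:20, a:PCMU, s:off, t:b8
M: sendrecv
R: D/[0-9ABCD*#]
S:
Q: process,loop

v=0
o=- 309 0 IN EPN S1/DS1-0/15@vnt-3745-r105a
s=Cisco SDP 0
t=0 0
c=IN IP4 14.84.223.101
m=audio 24866 RTP/AVP 0"""

parsed = parse_mgcp_command(mdcx86)
print(parsed["params"]["M"], parsed["sdp"]["c"][0], parsed["sdp"]["m"][0])
```

For MDCX 86 this prints the sendrecv mode together with the phone's address and the port 24866 from the OpenReceiveChannelAck, which is the point at which two-way audio can start.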

Case Study 2: Root Cause Analysis
So Why Was the Modify Connection (MDCX) Retransmitted?
• MGCP is UDP-based and therefore handles retransmissions at the application
layer
• Each MGCP message must be acknowledged
• Unified CM waits for the acknowledgement until the “MGCP timer” expires, then
retransmits the message (the timer is configured in the CallManager service
parameters in Unified CM administration)
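
The application-layer retransmission described above can be modelled in a few lines. This is only an illustrative sketch, not Unified CM's implementation: the function, the LossyGateway test double, and the timeout and retry values are all assumptions chosen to mirror the trace, where the first acknowledgement never arrives and the retry succeeds.

```python
def send_with_retransmit(send, ack_received, timeout_s, max_retries):
    """Send a message, retransmitting on timeout; return the number of retries used."""
    for attempt in range(max_retries + 1):
        send(attempt)
        if ack_received(timeout_s):
            return attempt
    raise TimeoutError("no acknowledgement after %d retries" % max_retries)

class LossyGateway:
    """Test double that loses the first `drops` acknowledgements."""
    def __init__(self, drops):
        self.drops = drops
        self.sent = []

    def send(self, attempt):
        self.sent.append(attempt)

    def ack_received(self, timeout_s):
        # A real sender would block on the UDP socket for up to timeout_s seconds.
        if self.drops > 0:
            self.drops -= 1
            return False
        return True

gw = LossyGateway(drops=1)
retries = send_with_retransmit(gw.send, gw.ack_received, timeout_s=3.0, max_retries=2)
print(retries, gw.sent)
```

Because the sender does nothing else for that transaction while it waits, every lost packet costs a full timer interval, which is why packet loss shows up to the user as delayed audio.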

Case Study 2: Root Cause Analysis
So Why Was the MDCX Retransmitted?
• Nearly all cases of MGCP message retransmissions are due to packet loss in
the network
• MGCP gateway did not send an acknowledgement to the MGCP message

Case Study 2: Root Cause Analysis
Next Steps?
• Trace the network path between Unified CM and gateway to see if there are any
errors
• If the problem is occurring regularly, place sniffers on the network to find out
where the packet loss is
• Enable ‘debug mgcp packet’ on the gateway to ensure the gateway is sending
the MGCP message

Case Study 3: No One Answers the Phone
Problem Description
• A user reports that every time they call a specific phone number, no one
answers the call, but if they call from their cell phone, the call is answered
immediately every time.
• Calling phone is extension 89919236.
• Called number is 1 (877) 288-8362

Case Study 3: No One Answers the Phone
Collect Traces
• Problem is reproducible, so generate a test call and then collect traces.

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Drag and drop the folder of collected traces into TranslatorX

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Try to find call in Call List

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Search for called party number

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Disable Filters
• Select the INVITE
• Filter by SIP Call ID (control/command – S)

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
03/29/2010 10:36:41.497 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to 172.18.159.231:[5060]:
INVITE sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:36:41 GMT
Call-ID: [email protected]
Supported: timer,resource-priority,replaces
Min-SE: 1800
User-Agent: Cisco-CUCM11.5
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
CSeq: 101 INVITE
Expires: 180
Allow-Events: presence, kpml
Supported: X-cisco-srtp-fallback
Supported: Geolocation
Call-Info: <sip:172.18.106.59:5060>;method="NOTIFY;Event=telephone-event;Duration=500"
Cisco-Guid: 2081204224-3137452793-0000000466-0996807340
Session-Expires: 1800
P-Asserted-Identity: "Test User 1" <sip:[email protected]>
Contact: <sip:[email protected]:5060>;video;audio
Max-Forwards: 69
Content-Length: 0
Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Where did the call originate? Try searching for the calling party number

Case Study 3: No One Answers the Phone
Use TranslatorX to Analyze Traces
• Select the INVITE
• Create New Filter (control/command-N)
• Filter by IP Address (control/command – I)
• Re-enable Filters

Case Study 3: No One Answers the Phone
INVITE from IP Phone w/ SDP
03/29/2010 10:36:33.771 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682
index 2321 with 1717 bytes:
INVITE sip:[email protected];user=phone SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>
Call-ID: [email protected]
Max-Forwards: 70
Date: Mon, 29 Mar 2010 14:36:33 GMT
CSeq: 101 INVITE
User-Agent: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=tls>
Expires: 180
Accept: application/sdp
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE,INFO
Remote-Party-ID: "Test User 1" <sip:[email protected]>;party=calling;id-type=subscriber;privacy=off;screen=yes
Supported: replaces,join,sdp-anat,norefersub,extended-refer,X-cisco-callinfo,X-cisco-serviceuri,X-cisco-escapecodes,X-cisco-service-control,X-cisco-srtp-fallback,X-cisco-monrec,X-cisco-config,X-cisco-sis-5.0.0,X-cisco-xsi-9.0.1
Allow-Events: kpml,dialog
Content-Length: 632
Content-Type: application/sdp
Content-Disposition: session;handling=optional
Case Study 3: No One Answers the Phone
v=0
o=Cisco-SIPUA 26964 0 IN IP4 172.18.159.152
s=SIP Call
t=0 0
m=audio 29254 RTP/SAVP 0 8 18 102 9 116 124 101
c=IN IP4 172.18.159.152
a=crypto:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:102 L16/16000
a=rtpmap:9 G722/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:124 ISAC/16000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
m=video 25466 RTP/AVP 97
c=IN IP4 172.18.159.152
b=TIAS:1000000
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42801E
a=recvonly
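
The offer above is easier to read once it is reduced to one row per media line. The sketch below is a simplified SDP summarizer with a hypothetical function name, not a full parser; it records the transport profile, the codecs from the rtpmap attributes, and the direction of each media stream. Note that RTP/SAVP is the secure RTP profile, while RTP/AVP is plain RTP.

```python
DIRECTIONS = {"a=sendrecv", "a=recvonly", "a=sendonly", "a=inactive"}

def summarize_sdp(sdp: str) -> list:
    """Return one dict per m= line with transport, codecs and direction."""
    media = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            kind, port, proto, *payloads = line[2:].split()
            media.append({"kind": kind, "port": int(port), "proto": proto,
                          "payloads": payloads, "codecs": {}, "direction": None})
        elif line.startswith("a=rtpmap:") and media:
            payload_type, name = line[len("a=rtpmap:"):].split(None, 1)
            media[-1]["codecs"][payload_type] = name
        elif line in DIRECTIONS and media:
            media[-1]["direction"] = line[2:]
    return media

# Shortened copy of the SDP offer above (crypto line and some codecs omitted).
offer = """v=0
o=Cisco-SIPUA 26964 0 IN IP4 172.18.159.152
s=SIP Call
t=0 0
m=audio 29254 RTP/SAVP 0 8 18 102 9 116 124 101
c=IN IP4 172.18.159.152
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=sendrecv
m=video 25466 RTP/AVP 97
c=IN IP4 172.18.159.152
a=rtpmap:97 H264/90000
a=recvonly"""

for stream in summarize_sdp(offer):
    print(stream["kind"], stream["proto"], stream["direction"], stream["codecs"])
```

In this offer the audio stream is offered over the secure profile and the video stream over plain RTP in receive-only mode.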
Case Study 3: No One Answers the Phone
Unified CM Sends a 100 Trying
03/29/2010 10:36:33.773 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 100 Trying
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>
Date: Mon, 29 Mar 2010 14:36:33 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: presence
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends a REFER to Play Outside Dialtone
03/29/2010 10:36:33.780 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
REFER sip:[email protected]:51682 SIP/2.0
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK151511c5f04bf
From: <sip:[email protected]>;tag=2144536187
To: <sip:[email protected]>
Call-ID: [email protected]
CSeq: 101 REFER
Max-Forwards: 70
Contact: <sip:[email protected]:5061;transport=tls>
User-Agent: Cisco-CUCM11.5
Expires: 0
Refer-To: cid:[email protected]
Content-Id: <[email protected]>
Require: norefersub
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: <sip:[email protected]>
Content-Length: 409
Case Study 3: No One Answers the Phone
<x-cisco-remotecc-request>
<playtonereq>
<dialogid>
<callid>[email protected]</callid>
<localtag>97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510542</localtag>
<remotetag>00260bd9669e07147bcb3aac-3cda8f0c</remotetag>
</dialogid>
<tonetype>DtOutsideDialTone</tonetype>
<direction>user</direction>
</playtonereq>
</x-cisco-remotecc-request>

Case Study 3: No One Answers the Phone
Unified CM Sends a SUBSCRIBE for KPML
03/29/2010 10:36:33.781 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682 index 2321
SUBSCRIBE sip:[email protected]:51682 SIP/2.0
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK1515232b4e84f
From: <sip:[email protected]>;tag=1976165806
To: <sip:[email protected]>
Call-ID: [email protected]
CSeq: 101 SUBSCRIBE
Date: Mon, 29 Mar 2010 14:36:33 GMT
User-Agent: Cisco-CUCM11.5
Event: kpml; [email protected]; from-tag=00260bd9669e07147bcb3aac-3cda8f0c
Expires: 7200
Contact: <sip:[email protected]:5061;transport=tls>
Accept: application/kpml-response+xml
Max-Forwards: 70
Content-Type: application/kpml-request+xml
Content-Length: 424
<?xml version="1.0" encoding="UTF-8" ?>
<kpml-request xmlns="urn:ietf:params:xml:ns:kpml-request" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:ietf:params:xml:ns:kpml-request kpml-request.xsd" version="1.0">
<pattern criticaldigittimer="1000" extradigittimer="500" interdigittimer="10000" persist="persist">
<regex tag="Backspace OK">[x#*+]|bs</regex>
</pattern>
</kpml-request>

Case Study 3: No One Answers the Phone
Phone Sends 200 OK for the REFER and SUBSCRIBE
03/29/2010 10:36:33.802 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682 index 2321 with 453 bytes:
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK151511c5f04bf
From: <sip:[email protected]>;tag=2144536187
To: <sip:[email protected]>;tag=00260bd9669e07167c743311-343ee3af
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:33 GMT
CSeq: 101 REFER
Server: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=TLS>
Content-Length: 0

03/29/2010 10:36:33.843 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682 index 2321 with 465 bytes:
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK1515232b4e84f
From: <sip:[email protected]>;tag=1976165806
To: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:33 GMT
CSeq: 101 SUBSCRIBE
Server: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=TLS>
Expires: 7200
Content-Length: 0

Case Study 3: No One Answers the Phone
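IP Phone (172.18.159.152), Unified CM (172.18.106.59), SIP Gateway (172.18.159.231)

IP Phone -> Unified CM: INVITE
Unified CM -> IP Phone: 100 Trying
Unified CM -> IP Phone: REFER
Unified CM -> IP Phone: SUBSCRIBE
IP Phone -> Unified CM: 200 OK (REFER)
IP Phone -> Unified CM: 200 OK (SUBSCRIBE)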
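
A call flow like the one summarized here can be rebuilt from the trace text itself. The following sketch, with a hypothetical function name, pairs each "Incoming/Outgoing SIP ... message" header with the SIP start line that follows it. The header pattern is derived only from the trace lines shown in this case study, and the request URI in the sample is a placeholder rather than a value from the trace.

```python
import re

HEADER = re.compile(
    r"(Incoming|Outgoing) SIP (?:TCP|UDP) message (?:size \d+ )?(?:from|to) "
    r"(\d{1,3}(?:\.\d{1,3}){3})"
)

def sip_ladder(trace: str) -> list:
    """Return one 'Unified CM <-/-> peer: message' row per SIP message."""
    rows, pending = [], None
    for line in trace.splitlines():
        line = line.strip()
        match = HEADER.search(line)
        if match:
            pending = (match.group(1), match.group(2))
            continue
        is_response = line.startswith("SIP/2.0 ")
        is_request = line.endswith(" SIP/2.0")
        if pending and (is_response or is_request):
            direction, peer = pending
            summary = line[len("SIP/2.0 "):] if is_response else line.split()[0]
            arrow = "<-" if direction == "Incoming" else "->"
            rows.append(f"Unified CM {arrow} {peer}: {summary}")
            pending = None
    return rows

trace = """03/29/2010 10:36:33.773 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682 index 2321
SIP/2.0 100 Trying
CSeq: 101 INVITE
03/29/2010 10:36:33.780 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682 index 2321
REFER sip:placeholder@172.18.159.152:51682 SIP/2.0
03/29/2010 10:36:33.802 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682 index 2321 with 453 bytes:
SIP/2.0 200 OK
CSeq: 101 REFER"""

for row in sip_ladder(trace):
    print(row)
```

TranslatorX does this far more completely; the sketch is only useful for quick checks when the tool is not at hand.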

Case Study 3: No One Answers the Phone
User Dials a ‘1’
03/29/2010 10:36:34.350 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682
index 2321 with 896 bytes:
NOTIFY sip:[email protected]:5061 SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1cd529ba
To: <sip:[email protected]>;tag=1976165806
From: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:33 GMT
CSeq: 1001 NOTIFY
Event: kpml
Subscription-State: active; expires=7200
Max-Forwards: 70
Contact: <sip:[email protected]:51682;transport=TLS>
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
Content-Length: 209
Content-Type: application/kpml-response+xml
Content-Disposition: session;handling=required
<?xml version="1.0" encoding="UTF-8"?>
<kpml-response xmlns="urn:ietf:params:xml:ns:kpml-response" version="1.0" code="200" text="OK" suppressed="false"
forced_flush="false" digits="1" tag="Backspace OK"/>
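
Each dialed digit arrives in the digits attribute of a kpml-response body. As a minimal sketch (the helper name is hypothetical), the digit can be pulled out with the standard XML parser, and the digits from successive NOTIFY messages can be joined to reconstruct what the user dialed.

```python
import xml.etree.ElementTree as ET

def kpml_digits(body: str) -> str:
    """Return the digits attribute from a kpml-response document."""
    # Parse as bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(body.encode("utf-8"))
    return root.attrib.get("digits", "")

def kpml_body(digit: str) -> str:
    """Build a body shaped like the NOTIFY payloads in the trace."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kpml-response xmlns="urn:ietf:params:xml:ns:kpml-response" version="1.0" '
        'code="200" text="OK" suppressed="false" forced_flush="false" '
        f'digits="{digit}" tag="Backspace OK"/>'
    )

# First two NOTIFY bodies from this case study carry the digits 1 and 8.
dialed = "".join(kpml_digits(kpml_body(d)) for d in ["1", "8"])
print(dialed)
```

Collecting the digits this way makes it easy to compare what the phone reported with the dd value in the digit analysis result.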

Case Study 3: No One Answers the Phone
Unified CM Replies to NOTIFY With a 200 OK
03/29/2010 10:36:34.352 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1cd529ba
From: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
To: <sip:[email protected]>;tag=1976165806
Date: Mon, 29 Mar 2010 14:36:34 GMT
Call-ID: [email protected]
CSeq: 1001 NOTIFY
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends a REFER to Disable Outside Dialtone
03/29/2010 10:36:34.353 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682 index
2321
REFER sip:[email protected]:51682 SIP/2.0
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK151536ea86ab0
From: <sip:[email protected]>;tag=1574166193
To: <sip:[email protected]>
Call-ID: [email protected]
CSeq: 101 REFER
Max-Forwards: 70
Contact: <sip:[email protected]:5061;transport=tls>
User-Agent: Cisco-CUCM11.5
Expires: 0
Refer-To: cid:[email protected]
Content-Id: <[email protected]>
Require: norefersub
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: <sip:[email protected]>
Content-Length: 401

Case Study 3: No One Answers the Phone
<x-cisco-remotecc-request>
<playtonereq>
<dialogid>
<callid>[email protected]</callid>
<localtag>97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510542</localtag>
<remotetag>00260bd9669e07147bcb3aac-3cda8f0c</remotetag>
</dialogid>
<tonetype>Dt_NoTone</tonetype>
<direction>user</direction>
</playtonereq>
</x-cisco-remotecc-request>

Case Study 3: No One Answers the Phone
Phone Replies With 200 OK to REFER
03/29/2010 10:36:34.402 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from
172.18.159.152 on port 51682 index 2321 with 453 bytes:
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK151536ea86ab0
From: <sip:[email protected]>;tag=1574166193
To: <sip:[email protected]>;tag=00260bd9669e07184b08b96b-796ab86f
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:33 GMT
CSeq: 101 REFER
Server: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=TLS>
Content-Length: 0

Case Study 3: No One Answers the Phone
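IP Phone (172.18.159.152), Unified CM (172.18.106.59), SIP Gateway (172.18.159.231)

IP Phone -> Unified CM: INVITE
Unified CM -> IP Phone: 100 Trying
Unified CM -> IP Phone: REFER
Unified CM -> IP Phone: SUBSCRIBE
IP Phone -> Unified CM: 200 OK (REFER)
IP Phone -> Unified CM: 200 OK (SUBSCRIBE)
IP Phone -> Unified CM: NOTIFY
Unified CM -> IP Phone: 200 OK (NOTIFY)
Unified CM -> IP Phone: REFER
IP Phone -> Unified CM: 200 OK (REFER)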

Case Study 3: No One Answers the Phone
User Dials an ‘8’
03/29/2010 10:36:34.944 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682
index 2321 with 896 bytes:
NOTIFY sip:[email protected]:5061 SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK647d03c1
To: <sip:[email protected]>;tag=1976165806
From: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:34 GMT
CSeq: 1002 NOTIFY
Event: kpml
Subscription-State: active; expires=7195
Max-Forwards: 70
Contact: <sip:[email protected]:51682;transport=TLS>
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
Content-Length: 209
Content-Type: application/kpml-response+xml
Content-Disposition: session;handling=required
<?xml version="1.0" encoding="UTF-8"?>
<kpml-response xmlns="urn:ietf:params:xml:ns:kpml-response" version="1.0" code="200" text="OK" suppressed="false"
forced_flush="false" digits="8" tag="Backspace OK"/>

Case Study 3: No One Answers the Phone
Unified CM Replies to NOTIFY With a 200 OK
03/29/2010 10:36:34.352 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1cd529ba
From: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
To: <sip:[email protected]>;tag=1976165806
Date: Mon, 29 Mar 2010 14:36:34 GMT
Call-ID: [email protected]
CSeq: 1001 NOTIFY
Content-Length: 0

Case Study 3: No One Answers the Phone
User Dials Remaining Digits

Case Study 3: No One Answers the Phone
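IP Phone (172.18.159.152), Unified CM (172.18.106.59), SIP Gateway (172.18.159.231)

IP Phone -> Unified CM: INVITE
Unified CM -> IP Phone: 100 Trying
Unified CM -> IP Phone: REFER
Unified CM -> IP Phone: SUBSCRIBE
IP Phone -> Unified CM: 200 OK (REFER)
IP Phone -> Unified CM: 200 OK (SUBSCRIBE)
IP Phone -> Unified CM: NOTIFY
Unified CM -> IP Phone: 200 OK (NOTIFY)
Unified CM -> IP Phone: REFER
IP Phone -> Unified CM: 200 OK (REFER)
IP Phone -> Unified CM: NOTIFY
Unified CM -> IP Phone: 200 OK (NOTIFY)
NOTIFY / 200 OK exchange repeats 10 times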

Case Study 3: No One Answers the Phone
Unified CM Unsubscribes From KPML
03/29/2010 10:36:41.490 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682
index 2321
SUBSCRIBE sip:[email protected]:51682;transport=TLS SIP/2.0
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK1515a5e1d5a4c
From: <sip:[email protected]>;tag=1976165806
To: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
Call-ID: [email protected]
CSeq: 102 SUBSCRIBE
Date: Mon, 29 Mar 2010 14:36:41 GMT
User-Agent: Cisco-CUCM11.5
Event: kpml; [email protected]; from-tag=00260bd9669e07147bcb3aac-3cda8f0c
Expires: 0
Contact: <sip:[email protected]:5061;transport=tls>
Max-Forwards: 70
Content-Length: 0

Case Study 3: No One Answers the Phone
Digit Analysis Match
10:36:41.486 |Digit analysis: match(pi="2", fqcn="+19194769236", cn="89919236",plv="5", pss="1stLine:RTP_AbbrDial:Cisco:US Local:US
RTP Local:US Long Distance:US International:VMPilotPartition", TodFilteredPss="1stLine:RTP_AbbrDial:Cisco:US Local:US RTP
Local:US Long Distance:US International:VMPilotPartition", dd="918772888362",dac="1")
10:36:41.486 |Digit analysis: analysis results
10:36:41.486 ||PretransformCallingPartyNumber=+19194769236
|CallingPartyNumber=+19194769236
|DialingPartition=GDP_GlobalE164_PSTN
|DialingPattern=\+1.[2-9]XX[2-9]XXXXXX
|FullyQualifiedCalledPartyNumber=+18772888362
|DialingPatternRegularExpression=(+1)([2-9][0-9][0-9][2-9][0-9][0-9][0-9][0-9][0-9][0-9])
|DialingWhere=
|PatternType=Enterprise
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(0,0,0)
|PretransformDigitString=+18772888362
|PretransformTagsList=ACCESS-CODE:SUBSCRIBER
|PretransformPositionalMatchList=+1:8772888362
|CollectedDigits=+18772888362
|UnconsumedDigits=
|TagsList=ACCESS-CODE:SUBSCRIBER
|PositionalMatchList=+1:8772888362
|VoiceMailbox=
|VoiceMailCallingSearchSpace=1stLine:RTP_AbbrDial
Case Study 3: No One Answers the Phone
Digit Analysis Match
|VoiceMailPilotNumber=89944444
|RouteBlockFlag=RouteThisPattern
|RouteBlockCause=0
|AlertingName=
|UnicodeDisplayName=
|DisplayNameLocale=1
|OverlapSendingFlagEnabled=0
|WithTags=
|WithValues=
|CallingPartyNumberPi=NotSelected
|ConnectedPartyNumberPi=NotSelected
|CallingPartyNamePi=NotSelected
|ConnectedPartyNamePi=NotSelected
|CallManagerDeviceType=NoDeviceType
|PatternPrecedenceLevel=Routine
|CallableEndPointName=[23146446-6606-7227-3882-75d07dd6fdef]
|PatternNodeId=[9badd465-d20a-5bc7-1077-8edee47e8caf]
|AARNeighborhood=[]
|AARDestinationMask=[]
|AARKeepCallHistory=true
|AARVoiceMailEnabled=false
|NetworkLocation=OffNet
Case Study 3: No One Answers the Phone
Digit Analysis Match
|Calling Party Number Type=Cisco Unified CallManager
|Calling Party Numbering Plan=Cisco Unified CallManager
|Called Party Number Type=Cisco Unified CallManager
|Called Party Numbering Plan=Cisco Unified CallManager
|ProvideOutsideDialtone=false
|AllowDeviceOverride=false
|AlternateMatches=
{
|Partition=US Long Distance
{
<
|Pattern=9.1[2-9]XX[2-9]XXXXXX
|PatternType=Translation
|TranslationPartition=[a6bd708e-ac4d-ae55-3134-b90b987e5ad9]
|CallManagerDeviceType=NoDeviceType
|PatternPrecedenceLevel=PlDefault
|PatternRouteClass=RouteClassDefault
|RouteNextHopByCgpn=false
>
}
}
Case Study 3: No One Answers the Phone
Digit Analysis Match
|TranslationPatternDetails=
|PretransformCallingPartyNumber=89919236
|CallingPartyNumber=+19194769236
|DialingPartition=US Local
|DialingPattern=9.1877[2-9]XXXXXX
|FullyQualifiedCalledPartyNumber=918772888362
|DialingPatternRegularExpression=(9)(1877[2-9][0-9][0-9][0-9][0-9][0-9][0-9])
|DialingWhere=
|PatternType=Translation
|PotentialMatches=NoPotentialMatchesExist
|DialingSdlProcessId=(0,0,0)
|PretransformDigitString=918772888362
|PretransformTagsList=ACCESS-CODE:SUBSCRIBER
|PretransformPositionalMatchList=9:18772888362
|CollectedDigits=+18772888362
|UnconsumedDigits=
|TagsList=SUBSCRIBER
|PositionalMatchList=18772888362
|VoiceMailbox=
|VoiceMailCallingSearchSpace=
|VoiceMailPilotNumber=
|RouteBlockFlag=RouteThisPattern
|RouteBlockCause=1
Case Study 3: No One Answers the Phone
Digit Analysis Match
|UnicodeDisplayName=
|DisplayNameLocale=1
|OverlapSendingFlagEnabled=0
|WithTags=
|WithValues=
|CallingPartyNumberPi=NotSelected
|ConnectedPartyNumberPi=NotSelected
|CallingPartyNamePi=NotSelected
|ConnectedPartyNamePi=NotSelected
|CallManagerDeviceType=NoDeviceType
|PatternPrecedenceLevel=Routine
|CallableEndPointName=[bb6f140a-5fd4-179a-2cad-2a1d5eacca7e]
|PatternNodeId=[bb6f140a-5fd4-179a-2cad-2a1d5eacca7e]
|AARNeighborhood=[]
|AARDestinationMask=[]
|AARKeepCallHistory=true
|AARVoiceMailEnabled=false
|NetworkLocation=OnNet
|ProvideOutsideDialtone=true
|AllowDeviceOverride=false
|AlternateMatches=
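
The DialingPatternRegularExpression and PositionalMatchList fields in the digit analysis results show how a pattern is expanded and how the dot splits the dialed string. The sketch below reproduces that behavior for the two patterns in this trace. It is a simplified, hypothetical converter: it handles X, !, bracketed ranges, the dot separator and backslash escapes only, and it is not how Unified CM implements digit analysis.

```python
import re

def pattern_to_regex(pattern: str) -> str:
    """Convert a Unified CM style dial pattern to an anchored regex with one group per segment."""
    groups, current, i = [], "", 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            current += re.escape(pattern[i + 1])  # escaped literal such as \+
            i += 2
            continue
        if ch == ".":
            groups.append(current)                # the dot starts a new segment
            current = ""
        elif ch == "X":
            current += "[0-9]"
        elif ch == "!":
            current += "[0-9]+"
        elif ch == "[":
            end = pattern.index("]", i)
            current += pattern[i:end + 1]         # keep ranges such as [2-9] as-is
            i = end
        else:
            current += re.escape(ch)
        i += 1
    groups.append(current)
    return "^" + "".join(f"({g})" for g in groups) + "$"

def positional_match(pattern: str, digits: str):
    """Return the per-segment match, or None if the digits do not match."""
    match = re.match(pattern_to_regex(pattern), digits)
    return match.groups() if match else None

# Patterns and digit strings copied from the digit analysis results above.
print(positional_match(r"9.1877[2-9]XXXXXX", "918772888362"))
print(positional_match(r"\+1.[2-9]XX[2-9]XXXXXX", "+18772888362"))
```

The two results correspond to the PretransformPositionalMatchList of the translation pattern (9:18772888362) and the PositionalMatchList of the final route pattern (+1:8772888362).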

Case Study 3: No One Answers the Phone
Route List Match
RouteListControl::idle_CcSetupReq - RouteList(UDP LRG - Cisco GK), numberSetup=3 numberMember=1 vmEnabled=0
RoutePlanServer::getRouteList() - RouteListName(23146446-6606-7227-3882-75d07dd6fdef), fRealLocalRouteGroup(16512c76-e145-8101-9977-952696a53137)
RoutePlanServer::getRouteGroup: standardLocalRG = 00000000-1111-0000-0000-000000000000, input routeGP =00000000-1111-0000-0000-000000000000
RoutePlanServer::getRouteGroup: LRG flag = 1, lRouteGroupName = 00000000-1111-0000-0000-000000000000
RoutePlanServer::getRouteGroup: standardLocalRG = 00000000-1111-0000-0000-000000000000, input routeGP =16512c76-e145-8101-9977-952696a53137
RoutePlanServer::getRouteGroup: mDeviceInfoList size =678
RoutePlanServer::getRouteGroup: standardLocalRG = 00000000-1111-0000-0000-000000000000, input routeGP =2bdffebe-b414-489b-906a-44d16dce30c3
RoutePlanServer::getRouteGroup: LRG flag = 0, lRouteGroupName = 2bdffebe-b414-489b-906a-44d16dce30c3
RoutePlanServer::getRouteGroup: mDeviceInfoList size =678
RouteList - RouteGroup count=''2''
RouteListCdrc::algorithmCategorization -- CDRC_SERIAL_DISTRIBUTION type=2
RoutePlanServer::updateStartingIndex - RouteGroupName(16512c76-e145-8101-9977-952696a53137)

Case Study 3: No One Answers the Phone
Finding the Route Group Names
admin:run sql select name,pkid from routegroup where pkid = '16512c76-e145-8101-9977-952696a53137'
name             pkid
================ ====================================
vnt-3945-gw1-sip 16512c76-e145-8101-9977-952696a53137

admin:run sql select name,pkid from routegroup where pkid = '2bdffebe-b414-489b-906a-44d16dce30c3'
name   pkid
====== ====================================
RTP-GK 2bdffebe-b414-489b-906a-44d16dce30c3

Case Study 3: No One Answers the Phone
Unified CM Sends an INVITE to the PSTN Gateway
03/29/2010 10:36:41.497 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to 172.18.159.231:[5060]:
INVITE sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:36:41 GMT
Call-ID: [email protected]
Supported: timer,resource-priority,replaces
Min-SE: 1800
User-Agent: Cisco-CUCM11.5
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
CSeq: 101 INVITE
Expires: 180
Allow-Events: presence, kpml
Supported: X-cisco-srtp-fallback
Supported: Geolocation
Call-Info: <sip:172.18.106.59:5060>;method="NOTIFY;Event=telephone-event;Duration=500"
Cisco-Guid: 2081204224-3137452793-0000000466-0996807340
Session-Expires: 1800
P-Asserted-Identity: "Test User 1" <sip:[email protected]>
Contact: <sip:[email protected]:5060>;video;audio
Max-Forwards: 69
Content-Length: 0
Case Study 3: No One Answers the Phone
IP Phone Unified CM SIP Gateway
(172.18.159.152) (172.18.106.59) (172.18.159.231)

INVITE
100 Trying
REFER
SUBSCRIBE
200 OK (REFER)
200 OK (SUBSCRIBE)
NOTIFY
200 OK (NOTIFY)
REFER
200 OK (REFER)
NOTIFY
200 OK (NOTIFY)
NOTIFY / 200 OK
Repeats 10 Times
SUBSCRIBE
INVITE

Case Study 3: No One Answers the Phone
Gateway Replies With a 100 Trying
03/29/2010 10:36:41.500 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 424 from
172.18.159.231:[5060]:
SIP/2.0 100 Trying
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:37:23 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: telephone-event
Server: Cisco-SIPGateway/IOS-12.x
Content-Length: 0

Case Study 3: No One Answers the Phone
Phone Replies With 200 OK for the SUBSCRIBE
03/29/2010 10:36:41.534 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from
172.18.159.152 on port 51682 index 2321 with 462 bytes:
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.106.59:5061;branch=z9hG4bK1515a5e1d5a4c
From: <sip:[email protected]>;tag=1976165806
To: <sip:[email protected]>;tag=00260bd9669e07177ee0d51d-14f56f89
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:36:41 GMT
CSeq: 102 SUBSCRIBE
Server: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=TLS>
Expires: 0
Content-Length: 0

Case Study 3: No One Answers the Phone
Gateway Replies With a 183 Session Progress W/ SDP
03/29/2010 10:36:42.324 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 1568 from
172.18.159.231:[5060]:
SIP/2.0 183 Session Progress
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>;tag=DE1EFF8-0
Date: Mon, 29 Mar 2010 14:37:23 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER
Allow-Events: telephone-event
Remote-Party-ID: <sip:[email protected]>;party=called;screen=no;privacy=off
Contact: <sip:[email protected]:5060>
Supported: sdp-anat
Server: Cisco-SIPGateway/IOS-12.x
Content-Type: multipart/mixed;boundary=uniqueBoundary
Mime-Version: 1.0
Content-Length: 788
--uniqueBoundary

Content-Type: application/sdp
Content-Disposition: session;handling=required
v=0
o=CiscoSystemsSIP-GW-UserAgent 0 7954 IN IP4 172.18.159.231
s=SIP Call
c=IN IP4 172.18.159.231
t=0 0
m=audio 27980 RTP/AVP 0 8 116 18 100 101
c=IN IP4 172.18.159.231
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:100 X-NSE/8000
a=fmtp:100 192-194
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
--uniqueBoundary
Content-Type: application/x-q931
Content-Disposition: signal;handling=optional
Content-Length: 11
Case Study 3: No One Answers the Phone
Unified CM Sends a 180 Ringing to the IP Phone
03/29/2010 10:36:42.330 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682
index 2321
SIP/2.0 180 Ringing
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510542
Date: Mon, 29 Mar 2010 14:36:33 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
Allow-Events: presence
Contact: <sip:[email protected]:5061;transport=tls>
Call-Info: <urn:x-cisco-remotecc:callinfo>; security= NotAuthenticated; orientation= to; ui-state= ringout; gci= 2-305505; call-instance= 1
Send-Info: conference
Remote-Party-ID: <sip:[email protected]>;party=called;screen=no;privacy=off
Content-Length: 0

Case Study 3: No One Answers the Phone
IP Phone Unified CM SIP Gateway
(172.18.159.152) (172.18.106.59) (172.18.159.231)

INVITE
100 Trying
REFER
SUBSCRIBE
200 OK (REFER)
200 OK (SUBSCRIBE)
NOTIFY
200 OK (NOTIFY)
REFER
200 OK (REFER)
NOTIFY
200 OK (NOTIFY)
NOTIFY / 200 OK
Repeats 10 Times
SUBSCRIBE
200 OK (SUBSCRIBE) INVITE
100 Trying
183 Session Progress
180 Ringing

Case Study 3: No One Answers the Phone
• Phone Keeps Ringing
• Timestamps Jump from 10:36:42 to 10:37:32
• No SIP Signaling for 50 seconds
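
Gaps like this one can be spotted programmatically instead of by eye. The sketch below is illustrative only: it assumes each trace line of interest starts with a `MM/DD/YYYY HH:MM:SS.mmm` timestamp followed by `|`, as in the SIP trace lines on these slides, and the function name `find_gaps` and the 10-second threshold are arbitrary choices.

```python
from datetime import datetime

TS_FORMAT = "%m/%d/%Y %H:%M:%S.%f"

def find_gaps(lines, threshold_s=10.0):
    """Return (previous, current, seconds) for consecutive timestamps at least threshold_s apart."""
    stamps = []
    for line in lines:
        head = line.split("|", 1)[0].strip()
        try:
            stamps.append(datetime.strptime(head, TS_FORMAT))
        except ValueError:
            continue  # not a timestamped trace line (SIP header, SDP, blank line)
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta >= threshold_s:
            gaps.append((prev, cur, delta))
    return gaps

trace = [
    "03/29/2010 10:36:42.330 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message",
    "SIP/2.0 180 Ringing",
    "03/29/2010 10:37:32.931 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message",
    "03/29/2010 10:37:32.934 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message",
]
for prev, cur, seconds in find_gaps(trace):
    print(f"{seconds:.3f} s of silence between {prev.time()} and {cur.time()}")
```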

Case Study 3: No One Answers the Phone
IP Phone Sends a NOTIFY
03/29/2010 10:37:32.931 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682 index
2321 with 1015 bytes:
NOTIFY sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK13e00d69
To: <sip:[email protected]>
From: <sip:[email protected]>;tag=00260bd9669e0719795cb162-12870e0b
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:37:32 GMT
CSeq: 4 NOTIFY
Event: dialog
Subscription-State: active
Max-Forwards: 70
Contact: <sip:[email protected]:51682;transport=TLS>
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
Content-Length: 366
Content-Type: application/dialog-info+xml
Content-Disposition: session;handling=required
<?xml version="1.0" encoding="UTF-8" ?>
<dialog-info xmlns:call="urn:x-cisco:parmams:xml:ns:dialog-info:dialog:callinfo-dialog" version="1" state="partial" entity="sip:[email protected]">
<dialog id="22" call-id="[email protected]" local-tag="00260bd9669e07147bcb3aac-3cda8f0c"><state>terminated</state></dialog></dialog-info>
Case Study 3: No One Answers the Phone
Unified CM Replies With 200 OK for the NOTIFY
03/29/2010 10:37:32.934 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK13e00d69
From: <sip:[email protected]>;tag=00260bd9669e0719795cb162-12870e0b
To: <sip:[email protected]>;tag=322772766
Date: Mon, 29 Mar 2010 14:37:32 GMT
Call-ID: [email protected]
CSeq: 4 NOTIFY
Content-Length: 0

Case Study 3: No One Answers the Phone
Phone Sends a CANCEL
03/29/2010 10:37:32.934 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from
172.18.159.152 on port 51682 index 2321 with 422 bytes:
CANCEL sip:[email protected];user=phone SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>
Call-ID: [email protected]
Max-Forwards: 70
Date: Mon, 29 Mar 2010 14:37:32 GMT
CSeq: 101 CANCEL
User-Agent: Cisco-CP9951/9.0.1
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends a 200 OK for the CANCEL
03/29/2010 10:37:32.935 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 200 OK
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>
Date: Mon, 29 Mar 2010 14:37:32 GMT
Call-ID: [email protected]
CSeq: 101 CANCEL
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends CANCEL to Gateway
03/29/2010 10:37:32.938 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to
172.18.159.231:[5060]:
CANCEL sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:36:41 GMT
Call-ID: [email protected]
CSeq: 101 CANCEL
Max-Forwards: 70
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends 487 in response to INVITE
03/29/2010 10:37:32.939 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682
index 2321
SIP/2.0 487 Request Cancelled
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510542
Date: Mon, 29 Mar 2010 14:37:32 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: presence
Content-Length: 0

Case Study 3: No One Answers the Phone
Gateway Sends 200 OK for CANCEL to Unified CM
03/29/2010 10:37:32.940 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 354 from
172.18.159.231:[5060]:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:38:15 GMT
Call-ID: [email protected]
CSeq: 101 CANCEL
Content-Length: 0

Case Study 3: No One Answers the Phone
Gateway Sends 487 in response to INVITE
03/29/2010 10:37:32.941 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 473 from
172.18.159.231:[5060]:
SIP/2.0 487 Request Cancelled
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>;tag=DE1EFF8-0
Date: Mon, 29 Mar 2010 14:38:15 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: telephone-event
Server: Cisco-SIPGateway/IOS-12.x
Reason: Q.850;cause=16
Content-Length: 0
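
The `Reason` header in this 487 carries the Q.850 cause reported by the gateway. As a rough illustration, the sketch below splits such a header value and looks the cause up in a deliberately partial table; the function name `parse_reason` is hypothetical, and only a handful of well-known Q.850 cause values are included.

```python
# Partial table of ITU-T Q.850 cause values; extend as needed.
Q850_CAUSES = {
    1: "Unallocated (unassigned) number",
    16: "Normal call clearing",
    17: "User busy",
    18: "No user responding",
    19: "No answer from user (user alerted)",
    31: "Normal, unspecified",
}

def parse_reason(header_value):
    """Split a Reason header value such as 'Q.850;cause=16' into (protocol, cause, text)."""
    parts = [part.strip() for part in header_value.split(";")]
    protocol = parts[0]
    params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
    cause = int(params["cause"]) if "cause" in params else None
    return protocol, cause, Q850_CAUSES.get(cause, "not in this partial table")

print(parse_reason("Q.850;cause=16"))
```

Cause 16 here reflects the caller hanging up, which is consistent with the CANCEL that Unified CM relayed from the phone.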

Case Study 3: No One Answers the Phone
Unified CM Sends ACK to Gateway
03/29/2010 10:37:32.943 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to
172.18.159.231:[5060]:
ACK sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>;tag=DE1EFF8-0
Date: Mon, 29 Mar 2010 14:36:41 GMT
Call-ID: [email protected]
Max-Forwards: 70
CSeq: 101 ACK
Allow-Events: presence, kpml
Content-Length: 0

Case Study 3: No One Answers the Phone
IP Phone Sends ACK to Unified CM
03/29/2010 10:37:32.947 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from
172.18.159.152 on port 51682 index 2321 with 416 bytes:
ACK sip:[email protected];user=phone SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK1636ab61
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e07147bcb3aac-3cda8f0c
To: <sip:[email protected];user=phone>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510542
Call-ID: [email protected]
Date: Mon, 29 Mar 2010 14:37:32 GMT
CSeq: 101 ACK
Content-Length: 0

Case Study 3: No One Answers the Phone
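Message flow between IP Phone (172.18.159.152), Unified CM (172.18.106.59), and SIP Gateway (172.18.159.231):

IP Phone -> Unified CM: NOTIFY
Unified CM -> IP Phone: 200 OK (NOTIFY)
IP Phone -> Unified CM: CANCEL
Unified CM -> IP Phone: 200 OK (CANCEL)
Unified CM -> SIP Gateway: CANCEL
Unified CM -> IP Phone: 487 Request Cancelled
SIP Gateway -> Unified CM: 200 OK (CANCEL)
SIP Gateway -> Unified CM: 487 Request Cancelled
Unified CM -> SIP Gateway: ACK
IP Phone -> Unified CM: ACK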

Case Study 3: No One Answers the Phone
Debugging Calls in IOS
• Enable Q.931 ISDN Debugs:
debug isdn q931
• Enable SIP Debugs:
debug ccsip messages

Case Study 3: No One Answers the Phone
INVITE From Unified CM to Gateway
*Mar 29 14:37:23.635: //-1/xxxxxxxxxxxx/SIP/Msg/ccsipDisplayMsg:
Received:
INVITE sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:36:41 GMT
Call-ID: [email protected]
Supported: timer,resource-priority,replaces
Min-SE: 1800
User-Agent: Cisco-CUCM11.5
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
CSeq: 101 INVITE
Expires: 180
Allow-Events: presence, kpml
Supported: X-cisco-srtp-fallback
Supported: Geolocation
Call-Info: <sip:172.18.106.59:5060>;method="NOTIFY;Event=telephone-event;Duration=500"
Cisco-Guid: 2081204224-3137452793-0000000466-0996807340
Session-Expires: 1800
P-Asserted-Identity: "Test User 1" <sip:[email protected]>
Contact: <sip:[email protected]:5060>;video;audio
Max-Forwards: 69
Content-Length: 0
Case Study 3: No One Answers the Phone
ISDN SETUP Message
*Mar 29 14:37:23.639: ISDN Se0/0/0:23 Q931: TX -> SETUP pd = 8 callref = 0x008B
Bearer Capability i = 0x8090A2
Standard = CCITT
Transfer Capability = Speech
Transfer Mode = Circuit
Transfer Rate = 64 kbit/s
Channel ID i = 0xA98381
Exclusive, Channel 1
Calling Party Number i = 0x2181, '9194769236'
Plan:ISDN, Type:National
Called Party Number i = 0x80, '18772888362'
Plan:Unknown, Type:Unknown

*Mar 29 14:37:23.667: ISDN Se0/0/0:23 Q931: RX <- CALL_PROC pd = 8 callref = 0x808B
Channel ID i = 0xA98381
Exclusive, Channel 1

*Mar 29 14:37:24.463: ISDN Se0/0/0:23 Q931: RX <- PROGRESS pd = 8 callref = 0x808B
Progress Ind i = 0x8281 - Call not end-to-end ISDN, may have in-band info

Case Study 3: No One Answers the Phone
Gateway Sends 183 in Response to ISDN PROGRESS Message
*Mar 29 14:37:24.463: //-1/xxxxxxxxxxxx/SIP/Msg/ccsipDisplayMsg:
Sent:
SIP/2.0 183 Session Progress
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1515b3154665
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510543
To: <sip:[email protected]>;tag=DE1EFF8-0
Date: Mon, 29 Mar 2010 14:37:23 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER
Allow-Events: telephone-event
Remote-Party-ID: <sip:[email protected]>;party=called;screen=no;privacy=off
Contact: <sip:[email protected]:5060>
Supported: sdp-anat
Server: Cisco-SIPGateway/IOS-12.x
Content-Type: multipart/mixed;boundary=uniqueBoundary
Mime-Version: 1.0
Content-Length: 788

--uniqueBoundary
Content-Type: application/sdp
Case Study 3: No One Answers the Phone
• How do we get the gateway to cut through audio on the PROGRESS message?
• RFC 3262: Reliability of Provisional Responses in the Session Initiation Protocol (SIP)
• Provides a way to acknowledge the 183 Session Progress message – PRACK
• Unified CM SIP Profile Setting “SIP Rel1XX Options”
• Disabled
• Send PRACK for all 1xx Messages
• Send PRACK if 1xx Contains SDP
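
The RFC 3262 mechanics behind these options can be sketched in a few lines. This is an illustration of the protocol rule only, not of Unified CM internals: a provisional response above 100 that carries `Require: 100rel` and an `RSeq` header must be acknowledged with a PRACK whose `RAck` header is the RSeq value followed by the CSeq of the response. The helper names `parse_headers` and `rack_for` are hypothetical.

```python
def parse_headers(message):
    """Collect SIP headers (lower-cased names) from the lines after the start line."""
    headers = {}
    for line in message.splitlines()[1:]:
        if not line.strip():
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers

def rack_for(response):
    """Return the RAck value for a reliable provisional response, else None."""
    status = int(response.splitlines()[0].split()[1])
    headers = parse_headers(response)
    required = [
        token.strip().lower()
        for value in headers.get("require", [])
        for token in value.split(",")
    ]
    if not 101 <= status <= 199:
        return None  # 100 Trying and final responses are never PRACKed
    if "100rel" not in required or "rseq" not in headers:
        return None  # ordinary (unreliable) provisional response
    return f"{headers['rseq'][0]} {headers['cseq'][0]}"

reliable_183 = "\n".join([
    "SIP/2.0 183 Session Progress",
    "CSeq: 101 INVITE",
    "Require: 100rel",
    "RSeq: 42",
    "Content-Length: 0",
])
plain_183 = "\n".join([
    "SIP/2.0 183 Session Progress",
    "CSeq: 101 INVITE",
    "Content-Length: 0",
])
print(rack_for(reliable_183))
print(rack_for(plain_183))
```

The first 183 in this case study had no `Require: 100rel` because the original INVITE did not advertise `Supported: 100rel`, so there was nothing for Unified CM to acknowledge.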

Case Study 3: No One Answers the Phone
IP Phone Sends INVITE When User Presses “Redial”
03/29/2010 10:38:47.085 |//SIP/SIPTcp/wait_SdlReadRsp: Incoming SIP TCP message from 172.18.159.152 on port 51682 index
2321 with 1717 bytes:
INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK3d7f770b
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e071b177eda32-75cc7dfe
To: <sip:[email protected]>
Call-ID: [email protected]
Max-Forwards: 70
Date: Mon, 29 Mar 2010 14:38:46 GMT
CSeq: 101 INVITE
User-Agent: Cisco-CP9951/9.0.1
Contact: <sip:[email protected]:51682;transport=tls>
Expires: 180
Accept: application/sdp
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE,INFO
Remote-Party-ID: "Test User 1" <sip:[email protected]>;party=calling;id-type=subscriber;privacy=off;screen=yes
Supported: replaces,join,sdp-anat,norefersub,extended-refer,X-cisco-callinfo,X-cisco-serviceuri,X-cisco-escapecodes,X-cisco-service-control,X-cisco-srtp-fallback,X-cisco-monrec,X-cisco-config,X-cisco-sis-5.0.0,X-cisco-xsi-9.0.1
Allow-Events: kpml,dialog
Content-Length: 632
Content-Type: application/sdp
Content-Disposition: session;handling=optional
v=0
o=Cisco-SIPUA 21482 0 IN IP4 172.18.159.152
s=SIP Call
t=0 0
m=audio 30308 RTP/SAVP 0 8 18 102 9 116 124 101
c=IN IP4 172.18.159.152
a=crypto:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:102 L16/16000
a=rtpmap:9 G722/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:124 ISAC/16000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
m=video 26760 RTP/AVP 97
c=IN IP4 172.18.159.152
b=TIAS:1000000
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=42801E
a=recvonly

Case Study 3: No One Answers the Phone
Unified CM Sends a 100 Trying to the Phone
03/29/2010 10:38:47.088 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to
172.18.159.152 on port 51682 index 2321
SIP/2.0 100 Trying
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK3d7f770b
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e071b177eda32-75cc7dfe
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:38:47 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: presence
Content-Length: 0

Case Study 3: No One Answers the Phone
Unified CM Sends an INVITE to the PSTN Gateway
03/29/2010 10:38:47.102 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to 172.18.159.231:[5060]:
INVITE sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK151894fb5e17
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510549
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:38:47 GMT
Call-ID: [email protected]
Supported: 100rel,timer,resource-priority,replaces
Min-SE: 1800
User-Agent: Cisco-CUCM11.5
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
CSeq: 101 INVITE
Expires: 180
Allow-Events: presence, kpml
Supported: X-cisco-srtp-fallback
Supported: Geolocation
Call-Info: <sip:172.18.106.59:5060>;method="NOTIFY;Event=telephone-event;Duration=500"
Cisco-Guid: 3341204224-3137452919-0000000467-0996807340
Session-Expires: 1800
P-Asserted-Identity: "Test User 1" <sip:[email protected]>
Contact: <sip:[email protected]:5060>;video;audio
Max-Forwards: 69
Content-Length: 0

Case Study 3: No One Answers the Phone
Gateway Replies With a 100 Trying
03/29/2010 10:38:47.107 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 424 from
172.18.159.231:[5060]:
SIP/2.0 100 Trying
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK151894fb5e17
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510549
To: <sip:[email protected]>
Date: Mon, 29 Mar 2010 14:39:29 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow-Events: telephone-event
Server: Cisco-SIPGateway/IOS-12.x
Content-Length: 0

Case Study 3: No One Answers the Phone
Gateway sends 183 Session Progress to Unified CM
03/29/2010 10:38:47.972 |//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 1601 from 172.18.159.231:[5060]:
SIP/2.0 183 Session Progress
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK151894fb5e17
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510549
To: <sip:[email protected]>;tag=DE3DAC4-1E12
Date: Mon, 29 Mar 2010 14:39:29 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Require: 100rel
RSeq: 42
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY, INFO, REGISTER
Allow-Events: telephone-event
Remote-Party-ID: <sip:[email protected]>;party=called;screen=no;privacy=off
Contact: <sip:[email protected]:5060>
Supported: sdp-anat
Server: Cisco-SIPGateway/IOS-12.x
Content-Type: multipart/mixed;boundary=uniqueBoundary
Mime-Version: 1.0
Content-Length: 791
--uniqueBoundary

Content-Type: application/sdp
Content-Disposition: session;handling=required
v=0
o=CiscoSystemsSIP-GW-UserAgent 1896 8548 IN IP4 172.18.159.231
s=SIP Call
c=IN IP4 172.18.159.231
t=0 0
m=audio 17784 RTP/AVP 0 8 116 18 100 101
c=IN IP4 172.18.159.231
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:100 X-NSE/8000
a=fmtp:100 192-194
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
--uniqueBoundary
Content-Type: application/x-q931
Content-Disposition: signal;handling=optional
Content-Length: 11
Case Study 3: No One Answers the Phone
Unified CM Sends PRACK to Gateway with SDP
03/29/2010 10:38:47.983 |//SIP/SIPUdp/wait_SdlSPISignal: Outgoing SIP UDP message to 172.18.159.231:[5060]:
PRACK sip:[email protected]:5060 SIP/2.0
Via: SIP/2.0/UDP 172.18.106.59:5060;branch=z9hG4bK1518c3e52a9ef
From: "Test User 1" <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510549
To: <sip:[email protected]>;tag=DE3DAC4-1E12
Date: Mon, 29 Mar 2010 14:38:47 GMT
Call-ID: [email protected]
CSeq: 102 PRACK
RAck: 42 101 INVITE
Allow-Events: presence, kpml
Max-Forwards: 70
Content-Type: application/sdp
Content-Length: 215
v=0
o=CiscoSystemsCCM-SIP 2000 1 IN IP4 172.18.106.59
s=SIP Call
c=IN IP4 172.18.159.152
t=0 0
m=audio 30308 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=ptime:20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
Case Study 3: No One Answers the Phone
Unified CM Sends 183 with SDP to IP Phone
03/29/2010 10:38:47.989 |//SIP/SIPTcp/wait_SdlSPISignal: Outgoing SIP TCP message to 172.18.159.152 on port 51682
index 2321
SIP/2.0 183 Session Progress
Via: SIP/2.0/TLS 172.18.159.152:51682;branch=z9hG4bK3d7f770b
From: "Test User 1" <sip:[email protected]>;tag=00260bd9669e071b177eda32-75cc7dfe
To: <sip:[email protected]>;tag=97903bc0-a3de-4a15-ba27-44c81fe3adcd-45510548
Date: Mon, 29 Mar 2010 14:38:47 GMT
Call-ID: [email protected]
CSeq: 101 INVITE
Allow: INVITE, OPTIONS, INFO, BYE, CANCEL, ACK, PRACK, UPDATE, REFER, SUBSCRIBE, NOTIFY
Allow-Events: presence
Contact: <sip:[email protected]:5061;transport=tls>
Call-Info: <urn:x-cisco-remotecc:callinfo>; security= NotAuthenticated; orientation= to; gci= 2-305508; call-instance= 1
Send-Info: conference
Remote-Party-ID: <sip:[email protected]>;party=called;screen=yes;privacy=off
Content-Type: application/sdp

Content-Length: 633
v=0
o=CiscoSystemsCCM-SIP 2000 1 IN IP4 172.18.106.59
s=SIP Call
t=0 0
m=audio 17784 RTP/AVP 0 101
c=IN IP4 172.18.159.231
a=rtpmap:0 PCMU/8000
a=ptime:20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
m=video 0 RTP/AVP 31 34 96 97
c=IN IP4 0.0.0.0
a=rtpmap:31 H261/90000
a=fmtp:31 MAXBR=128
a=rtpmap:34 H263/90000
a=fmtp:34 BPP=12092;F=1
a=rtpmap:96 H263-1998/90000
a=fmtp:96 BPP=27745;F=1;I=1;J=1;T=1;K=1;P=2,4
a=rtpmap:97 H264/90000
a=fmtp:97 sprop-interleaving-depth=21838;sprop-deint-buf-req=1801858876;sprop-max-don-diff=1701606770;max-fs=1767992687;max-br=164213620;deint-buf-cap=1715224179
a=inactive
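
To confirm where each media stream in an SDP body points, it helps to reduce the body to one line per `m=` section. The following is a minimal sketch, assuming plain SDP text as shown above; `summarize_sdp` is a hypothetical helper, and it only tracks the port, the connection address, and the direction attribute.

```python
def summarize_sdp(sdp):
    """Return one dict per m= line with its port, connection address, and direction."""
    session_address = None
    media = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            address = line.split()[-1]
            if media:
                media[-1]["address"] = address  # media-level c= overrides session level
            else:
                session_address = address
        elif line.startswith("m="):
            kind, port = line[2:].split()[:2]
            media.append({
                "kind": kind,
                "port": int(port.split("/")[0]),
                "address": session_address,
                "direction": "sendrecv",  # default when no direction attribute is present
            })
        elif media and line in ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
            media[-1]["direction"] = line[2:]
    return media

sdp_to_phone = """v=0
o=CiscoSystemsCCM-SIP 2000 1 IN IP4 172.18.106.59
s=SIP Call
t=0 0
m=audio 17784 RTP/AVP 0 101
c=IN IP4 172.18.159.231
a=rtpmap:0 PCMU/8000
m=video 0 RTP/AVP 31 34 96 97
c=IN IP4 0.0.0.0
a=inactive
"""
for stream in summarize_sdp(sdp_to_phone):
    print(stream)
```

For the 183 sent to the phone, this shows audio directed at the gateway (172.18.159.231, port 17784) and the video stream declined with port 0, so the phone can play the in-band progress tones from the PSTN.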
Case Study 4: Unable to Place Calls
Problem Description
• Some users report getting a message saying “We're sorry. It is not necessary to
dial a 1 when calling this number. Will you please hang up and try your call
again”
• User who reported the issue indicates they did not dial a 1 – they dial
9 637 5411.
• User reports the problem is reproducible – every time they call that number the
problem happens.

Case Study 4: Unable to Place Calls
Problem Description
• Reproduce the problem at 2:10 p.m. on 4/16/10
• Search for the call in AnalysisManager

Case Study 4: Unable to Place Calls
• No calls found…

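• Why? The call never connected, so it has zero duration. The CDR Log Calls with Zero Duration Flag service parameter must be set to “True” for such calls to be logged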

Case Study 4: Unable to Place Calls
Problem Description
• Reproduce the problem at 2:19 p.m. on 4/16/10
• Search again for the call in AnalysisManager

Case Study 4: Unable to Place Calls
• Analysis Result

• Normal Call Clearing


• No Connect Time

Case Study 4: Unable to Place Calls
• Click on “Record Details”

Case Study 4: Unable to Place Calls
• Use Dialed Number Analyzer:
• https://publisher_ip_address:8443/dna
• Make sure the DNA service is activated on the publisher:

Case Study 4: Unable to Place Calls
• Search for the phone

Case Study 4: Unable to Place Calls
• Use show dialplan number to see call routing in IOS
ciscolive-gw1#show dialplan number 918656375411 timeout
Macro Exp.: 918656375411

VoiceEncapPeer901
peer type = voice, system default peer = FALSE,
information type = voice, description = `',
tag = 901, destination-pattern = `9T',
voice reg type = 0, corresponding tag = 0,
allow watch = FALSE
answer-address = `', preference=0,
-- snip --
session-target = `', voice-port = `0/0/0:23',
direct-inward-dial = enabled,
digit_strip = enabled,
Case Study 4: Unable to Place Calls
• Check the dial peer configuration
knv3-1c-127-rtr1#sh run | beg dial-peer voice 901
dial-peer voice 901 pots
translation-profile incoming KNV_DID
destination-pattern 9T
incoming called-number .
direct-inward-dial
port 0/0/0:23
!

Case Study 4: Unable to Place Calls
• Run debug isdn q931 to see the outgoing call
Apr 20 10:07:25.791: ISDN Se0/0/0:23 Q931: TX -> SETUP pd = 8 callref = 0x0111
Bearer Capability i = 0x8090A2
Standard = CCITT
Transfer Capability = Speech
Transfer Mode = Circuit
Transfer Rate = 64 kbit/s
Channel ID i = 0xA98397
Exclusive, Channel 23
Calling Party Number i = 0x2181, '9199915644'
Plan:ISDN, Type:National
Called Party Number i = 0x80, '18656375411'
Plan:Unknown, Type:Unknown
Apr 20 10:07:26.303: ISDN Se0/0/0:23 Q931: RX <- CALL_PROC pd = 8 callref = 0x8111
Channel ID i = 0xA98397
Exclusive, Channel 23
Apr 20 10:07:26.623: ISDN Se0/0/0:23 Q931: RX <- PROGRESS pd = 8 callref = 0x8111
Cause i = 0x829F - Normal, unspecified
Progress Ind i = 0x8288 - In-band info or appropriate now available
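
The two-octet `Cause` and `Progress Ind` values in this debug can be decoded by hand: in both information elements the low four bits of the first octet give the location and the low seven bits of the second octet give the value. The sketch below does that for the values seen in these traces; the function name `decode_two_octet_ie` is hypothetical, and the lookup tables are partial.

```python
# Partial tables based on ITU-T Q.931/Q.850; extend as needed.
LOCATIONS = {
    0: "User",
    1: "Private network serving the local user",
    2: "Public network serving the local user",
    3: "Transit network",
    4: "Public network serving the remote user",
}
PROGRESS_DESCRIPTIONS = {
    1: "Call is not end-to-end ISDN; in-band information may be available",
    2: "Destination address is non-ISDN",
    3: "Origination address is non-ISDN",
    8: "In-band information or appropriate pattern now available",
}
CAUSES = {16: "Normal call clearing", 31: "Normal, unspecified"}

def decode_two_octet_ie(hex_value):
    """Return (location, value) from a two-octet IE such as '0x8288' or '0x829F'."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    location = raw[0] & 0x0F  # low nibble of octet 3
    value = raw[1] & 0x7F     # octet 4 without the extension bit
    return location, value

location, progress = decode_two_octet_ie("0x8288")
print(LOCATIONS[location], "/", PROGRESS_DESCRIPTIONS[progress])
location, cause = decode_two_octet_ie("0x829F")
print(LOCATIONS[location], "/", CAUSES[cause])
```

A PROGRESS message that carries both a cause and progress indicator 8 typically means the network is playing an announcement in-band, which matches the recorded message the user hears.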
Case Study 4: Unable to Place Calls
• Need to remove the 1 for local calls
• Transform either in Unified CM or on the gateway
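
The digit manipulation itself is simple, wherever it is configured. The Python sketch below only models the intended result and is not Unified CM or IOS configuration: it assumes that 865 is the local area code (inferred from the called number in the debug above) and that local numbers should be sent as 10 digits. The name `strip_national_prefix` is hypothetical.

```python
def strip_national_prefix(called_number, local_area_codes):
    """Drop the leading 1 from an 11-digit number whose area code is local."""
    if (
        len(called_number) == 11
        and called_number.startswith("1")
        and called_number[1:4] in local_area_codes
    ):
        return called_number[1:]
    return called_number

LOCAL_AREA_CODES = {"865"}  # assumption: taken from the called number in the SETUP

print(strip_national_prefix("18656375411", LOCAL_AREA_CODES))  # local: 1 removed
print(strip_national_prefix("18772888362", LOCAL_AREA_CODES))  # not local: unchanged
```

In practice the same rule would be expressed as a called party transformation in Unified CM or as a translation rule on the gateway dial peer.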

Case Study 5: Call Drops after Answering
Problem Description
• When a user (89915644) dials another user (89915724), the call drops
immediately after being answered.
• User reports the problem is reproducible – every time they call that number the
problem happens.

Case Study 5: Call Drops after Answering
Collect Traces
• Problem is reproducible, so generate a test call and then collect traces.

Case Study 5: Call Drops after Answering
Use TranslatorX to Analyze Traces
• Drag and drop the folder of collected traces into TranslatorX

Case Study 5: Call Drops after Answering
Use TranslatorX to Analyze Traces
• Can double-click a call to see CDR details

Case Study 5: Call Drops after Answering
Use TranslatorX to Analyze Traces
• Open Call List Window
• Select the problem call and click the “Generate Filter” button

Case Study 5: Call Drops after Answering
Use TranslatorX to Analyze Traces
• Can look at the Filters that were automatically generated

Case Study 5: Call Drops after Answering

• Click the ‘Generate Diagram’ button to generate a Message Sequence Diagram

Case Study 5: Call Drops after Answering

• Can rename headers to make viewing the diagram easier

Case Study 5: Call Drops after Answering
BYE sip:[email protected]:5061;transport=tls SIP/2.0
Via: SIP/2.0/TLS 10.116.123.197:49876;branch=z9hG4bK56c1b160
From: <sip:[email protected]>;tag=ac7e8ab699c82f40271ffd3d-0b82db6e
To: "Paul Giralt" <sip:[email protected]>;tag=45642980~0d0d25d7-4931-4a07-83c6-b82e2c213ca7-46361365
Call-ID: [email protected]
Max-Forwards: 70
Session-ID: 21a6af9e00105000a000ac7e8ab699c8;remote=629c3da900105000a000881dfc610185
Date: Tue, 28 Jun 2016 02:53:22 GMT
CSeq: 101 BYE
User-Agent: Cisco-CP8865/11.5.1
Content-Length: 0
RTP-RxStat: Dur=0,Pkt=0,Oct=0,LatePkt=0,LostPkt=0,AvgJit=0,VQMetrics="CCR=0.0000;ICR=0.0000;ICRmx=0.0000;CS=0;SCS=0;Ver=0.90;VoRxCodec=G.722 64k;CID=6;VoPktSizeMs=0;VoPktLost=0;VoPktDis=0;VoOneWayDelayMs=0"
RTP-TxStat: Dur=0,Pkt=0,Oct=0
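
The RTP-RxStat and RTP-TxStat headers in the BYE above show zero packets in both directions, which suggests the 8865 hung up before any media flowed. A minimal Python sketch of that check, assuming the header values have been copied out of the trace as plain strings; `parse_rtp_stat` is an illustrative helper, not part of TranslatorX or Unified CM:

```python
def parse_rtp_stat(value):
    """Parse the leading comma-separated counters of an RTP-RxStat/TxStat value.

    Parsing stops at the quoted VQMetrics blob, which uses ';' separators.
    """
    head = value.split(',VQMetrics=', 1)[0]
    stats = {}
    for pair in head.split(','):
        key, _, raw = pair.partition('=')
        stats[key.strip()] = float(raw)
    return stats


rx = parse_rtp_stat('Dur=0,Pkt=0,Oct=0,LatePkt=0,LostPkt=0,AvgJit=0,VQMetrics="CCR=0.0000"')
tx = parse_rtp_stat('Dur=0,Pkt=0,Oct=0')

# No packets in either direction: the call was torn down before media flowed.
no_media = rx['Pkt'] == 0 and tx['Pkt'] == 0
print(no_media)
```

A zero packet count points the investigation at signaling or media setup on the phone rather than at network loss.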

Case Study 5: Call Drops after Answering
• Why did phone send a BYE?

Case Study 5: Call Drops after Answering
• Generate a Problem Report

Case Study 5: Call Drops after Answering
• Retrieve Problem Report from Phone
• Must have Web access enabled on Unified CM configuration page for phone.
• Download Logs from phone web page
• Keep this TAR file handy if you need to open a TAC case

Case Study 5: Call Drops after Answering
• Problem Report contains various pieces of diagnostic information
• logcat file contains most recent logging information
• Open logcat in text editor or try opening in TranslatorX

Case Study 5: Call Drops after Answering
• Open logcat file in TranslatorX and filter by the SIP Call-ID from the BYE we saw come from the 8865
• Try to look for errors that might have triggered the BYE
• Double-click the BYE to find the BYE in the actual trace file

Case Study 6: Video Encryption Not Working
Problem Description
• Video call from a Cisco DX70 to a Cisco TelePresence Server via TelePresence Conductor is not being encrypted
• Problem is easily reproducible
• Calls are destined to the TP Server at extension 80029999

Case Study 6: Video Encryption Not Working
Leverage Session Trace feature in RTMT

Case Study 6: Video Encryption Not Working
Session Trace Features
• Session trace only traces SIP sessions in detail
• Can show full SIP messages
• Uses correlation tags to include all call legs related to the call selected

Case Study 6: Video Encryption Not Working
Click on INVITE from DX70

Case Study 6: Video Encryption Not Working
INVITE from DX70

Case Study 6: Video Encryption Not Working
Audio m-line in SDP contained in INVITE from DX70

Case Study 6: Video Encryption Not Working
Now look at 200 OK from Conductor

Case Study 6: Video Encryption Not Working
200 OK from Conductor

Case Study 6: Video Encryption Not Working
Audio m-line in SDP contained in 200 OK from Conductor

Case Study 6: Video Encryption Not Working
Look at ACK from UCM to Conductor

Case Study 6: Video Encryption Not Working
ACK from UCM to Conductor

Case Study 6: Video Encryption Not Working
Audio m-line in SDP contained in ACK from UCM to Conductor

Case Study 6: Video Encryption Not Working
Look at 200 OK from UCM to DX70

Case Study 6: Video Encryption Not Working
200 OK from UCM to DX70

Case Study 6: Video Encryption Not Working
Audio m-line in SDP contained in 200 OK from UCM to DX70

Case Study 6: Video Encryption Not Working
Look carefully at audio and video m-lines
• SDP from Phone to UCM (Offer) w/ Crypto attributes:
m=audio 31646 RTP/SAVP 108 9 124 0 8 116 18 101
m=video 19724 RTP/SAVP 100 126 97
• SDP from Conductor to UCM (Offer) w/ Crypto attributes:
m=audio 52040 RTP/AVP 107 113 108 109 110 96 116 117 118 98 100 102 9 104 105 101 0 8 15 18
m=video 53638 RTP/AVP 126 97 99 34 31
• SDP from UCM to Conductor (Answer):
m=audio 31646 RTP/AVP 108 101
m=video 19724 RTP/AVP 126
• SDP from UCM to Phone (Answer):
m=audio 52040 RTP/AVP 108 101
m=video 53638 RTP/AVP 126
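
The comparison above comes down to the transport profile on each m-line: the phone offers RTP/SAVP but is answered with RTP/AVP. A minimal Python sketch that flags this downgrade, assuming the SDP bodies have been copied out of Session Trace as text; `media_profiles` is an illustrative helper, not an RTMT feature:

```python
def media_profiles(sdp):
    """Return {media_type: transport_profile} for each m-line in an SDP body."""
    profiles = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith('m='):
            media, _port, proto = line[2:].split()[:3]
            profiles[media] = proto
    return profiles


phone_offer = """m=audio 31646 RTP/SAVP 108 9 124 0 8 116 18 101
m=video 19724 RTP/SAVP 100 126 97"""
answer_to_phone = """m=audio 52040 RTP/AVP 108 101
m=video 53638 RTP/AVP 126"""

offered = media_profiles(phone_offer)
answered = media_profiles(answer_to_phone)

# Media lines offered as SRTP (RTP/SAVP) but answered as plain RTP (RTP/AVP).
downgraded = [m for m in offered
              if offered[m] == 'RTP/SAVP' and answered.get(m) == 'RTP/AVP']
print(downgraded)
```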

Case Study 6: Video Encryption Not Working
Root Cause Analysis
• Root cause is an incompatibility between how UCM / endpoints and Conductor / TelePresence Server negotiate best-effort encryption
• Must enable the cisco-telepresence-conductor-interop Normalization Script
• Converts AVP w/ Crypto to SAVP w/ x-cisco-srtp-fallback

Troubleshooting Live Demo
Understanding and Troubleshooting Throttling Events
CallManager built-in Monitoring & Throttling
ProcMon
• Internal thread that monitors itself and other threads
• Runs every 2 seconds
• Could trigger %UC_CALLMANAGER-2-TimerThreadSlowed alarm
• Could intentionally abort CallManager service as a last resort
CodeYellow
• Triggered against SDL Router thread congestion only
• Throttles main call processing thread
• All new calls rejected on the node in CodeYellow
SignalCongestionEntry
• Triggered against other CallManager sub threads
• Such as SIP Handler thread
• Could trigger %UC_CALLMANAGER-2-SignalCongestionEntry

Understanding ProcMon TimerThreadSlowed
• ProcMon SDL Router Thread Verification expects to run every 2 seconds
• > 1 sec delay: TimerThreadSlowed alarm is raised as a forewarning to throttling (CodeYellow / Signal Congestion) in seconds
• 3 X Max Router Latency 60 secs → Intentional Abort
• Usually induced due to IOWait conditions
• Could be correlated to CallManager RISDC Performance Counter
• \\cucm\System\IOServiceTime

====================================
backtrace – CUCM
===================================
#0 0xf774f430 in __kernel_vsyscall ()
#1 0xf691a871 in raise () from /lib/libc.so.6
#2 0xf691c14a in abort () from /lib/libc.so.6
#3 0x083a008e in IntentionalAbort () at ProcessCMProcMon.cpp:88
#4 CMProcMon::verifySdlRouterServices () at ProcessCMProcMon.cpp:748
#5 0x083a04da in CMProcMon::callManagerMonitorThread (cmProcMon=0xe3a7ff70) at
#6 0xf6c3d398 in ACE_OS_Thread_Adapter::invoke (this=0xd93687d8) at OS_Thread_Adapter.cpp:103
#7 0xf6bfd491 in ace_thread_adapter (args=0xd93687d8) at Base_Thread_Adapter.cpp:126
#8 0xf68d1bc9 in start_thread () from /lib/libpthread.so.0
#9 0xf69d2c9e in clone () from

Feb 10 10:15:33 cucm-sub5 local7 2 ccm: 14: cucm-sub5.domain.com : Feb 10 2017 15:15:33.193 UTC : %UC_CALLMANAGER-2-TimerThreadSlowed:
%[AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=cucm-sub5]: Timer thread has slowed beyond acceptable limits

Investigating TimerThreadSlowed Events
Feb 10 10:15:33 cucm-sub5 local7 2 ccm: 14: cucm-sub5.domain.com : Feb 10 2017 15:15:33.193 UTC : %UC_CALLMANAGER-2-TimerThreadSlowed:
%[AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=cucm-sub5]: Timer thread has slowed beyond acceptable limits

Evidence first seen in RisDC Perfmon Logs
- Counter IOServiceTime

Investigating TimerThreadSlowed Events
Feb 10 10:15:33 cucm-sub5 local7 2 ccm: 14: cucm-sub5.domain.com : Feb 10 2017 15:15:33.193 UTC : %UC_CALLMANAGER-2-TimerThreadSlowed:
%[AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=cucm-sub5]: Timer thread has slowed beyond acceptable limits

Next evidence seen in VMware VM Performance Counters
- Highest Latency
- Read Latency
- Write Latency

Investigating TimerThreadSlowed Events
Feb 10 10:15:33 cucm-sub5 local7 2 ccm: 14: cucm-sub5.domain.com : Feb 10 2017 15:15:33.193 UTC : %UC_CALLMANAGER-2-TimerThreadSlowed:
%[AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=cucm-sub5]: Timer thread has slowed beyond acceptable limits

Finally, evidence can be seen in your SAN’s Performance Counters
- Average Latency

Investigating TimerThreadSlowed Events
Cisco UCS C-Series
BE6/7k

Unified CM on UCS Storage IO Requirements
TRC C-Series
• IOPS pre spec’d out with RAID 5 and number of disks required
• ✔ Good BBU = Write Back Cache Mode
• ✖ Bad BBU = Write Through Mode
B-Series w/ SAN
• Host Level Kernel Disk Command Latency Requirement < 4ms
• Physical Device Command Latency Requirement < 20ms
• Min IOPS Required

Unified CM IOPS Requirements (http://bit.ly/1hx7YrY)
BHCA 10k – 100K = 35 – 150 IOPS
Software Upgrades 800-1200 IOPS
Continuous CDR loading to CAR 300 IOPS
Trace collection 100 IOPS
Backups 50 IOPS
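
The figures above can be combined into a rough storage budget. The Python sketch below assumes that overlapping activities simply add their IOPS, which is a simplification made for illustration and not a Cisco sizing rule; the dictionary keys and the `peak_iops` helper are hypothetical names:

```python
# IOPS figures (low, high) taken from the list above.
IOPS = {
    'call_processing': (35, 150),     # BHCA 10k - 100k
    'software_upgrade': (800, 1200),
    'cdr_to_car': (300, 300),
    'trace_collection': (100, 100),
    'backup': (50, 50),
}


def peak_iops(activities):
    """Worst-case IOPS if the given activities run at the same time (assumed additive)."""
    return sum(IOPS[name][1] for name in activities)


busy_hour_with_traces = peak_iops(['call_processing', 'cdr_to_car', 'trace_collection'])
print(busy_hour_with_traces)
```

Under that assumption, collecting traces during a busy hour with continuous CDR loading needs several times the IOPS of call processing alone.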

C220 / 240 M3S or M4 Tested Reference Configurations with Super Cap

Understanding CallManager Code Yellow
• New call requests are throttled if the expected delay to handle signals is very high
All interfaces, SIP, SCCP, MGCP, CTI, H.323 clients + trunks
Call reject reason code will be 42 (SWITCHING EQUIPMENT CONGESTION)
• New calls originating on other nodes are still allowed
ICT calls, PSTN gateways, IP to IP, etc. (incoming calls)
• The depth of SDL queues in conjunction with the sample size is used to calculate average expected delay to process a signal
[Diagram labels: CUBE, INVITE, 503, SDL, In Code Yellow]
[Service parameter screenshot labels: ms, %, min; callout: This is Code Red, CallManager Services Restarts]

Code-Yellow Entry/Exit
• Entry criteria
Once a node exceeds the code yellow entry latency (20 ms by Default)
it enters code yellow
• Rejected new calls should reduce system load and average expected delay
should drop
• IP phones attempting to get dial tone will get reorder and display a message
saying “too much traffic, try again later”
• Exit criteria
Once delay drops below code yellow entry latency * code yellow exit latency calculation
(example 20 * .4 = 8 ms to exit) the node exits code yellow
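
The entry and exit criteria above form a hysteresis loop: enter above 20 ms, leave only below 8 ms. A minimal Python sketch of that behaviour, assuming the expected delay is sampled as a plain number; the class and parameter names are illustrative and do not reflect CallManager internals:

```python
class CodeYellowState:
    """Entry/exit hysteresis as described above (defaults: 20 ms entry, 40% exit)."""

    def __init__(self, entry_latency_ms=20, exit_percent=40):
        self.entry_ms = entry_latency_ms
        self.exit_ms = entry_latency_ms * exit_percent / 100   # 20 * 40% = 8 ms
        self.in_code_yellow = False

    def update(self, expected_delay_ms):
        """Feed one expected-delay sample and return the resulting state."""
        if not self.in_code_yellow and expected_delay_ms > self.entry_ms:
            self.in_code_yellow = True
        elif self.in_code_yellow and expected_delay_ms < self.exit_ms:
            self.in_code_yellow = False
        return self.in_code_yellow


node = CodeYellowState()
history = [node.update(d) for d in (5, 214, 15, 9, 7, 0)]
print(history)
```

The samples 15 and 9 are below the entry latency but the node stays in Code Yellow, because the exit threshold is lower than the entry threshold.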

CallManager Code-Yellow Alarms and Alerts
Alarms
Dec 8 14:57:15 sjc-rfd-sub-1 local7 3 : 2244: Dec 08 22:57:15.641 UTC : %CCM_CALLMANAGER-CALLMANAGER-3-CodeYellowEntry: CodeYellowEntry Expected Average Delay:214 Entry Latency:20 Exit Latency:8 Sample Size:10 Total Code Yellow Entry:1 High Priority Queue Depth:0 Normal Priority Queue Depth:0 Low Priority Queue Depth:1285 App ID:Cisco CallManager Cluster ID:SJC-RFD Node ID:sjc-rfd-sub-1
Dec 8 14:57:23 sjc-rfd-sub-1 local7 3 : 2245: Dec 08 22:57:23.721 UTC : %CCM_CALLMANAGER-CALLMANAGER-3-CodeYellowExit: CodeYellowExit Expected Average Delay:0 Entry Latency:20 Exit Latency:8 Sample Size:10 Time Spent in Code Yellow:8 Number of Calls Rejected Due to Call Throttling:238 Total Code Yellow Exit:1 High Priority Queue Depth:0 Normal Priority Queue Depth:0 Low Priority Queue Depth:0 App ID:Cisco CallManager Cluster ID:SJC-RFD Node ID:sjc-rfd-sub-1

Alert
Dec 8 14:57:29 sjc-rfd-pub-1 local7 3 : 106: Dec 08 22:57:29.33 UTC : %CCM_RTMT-RTMT-3-RTMT-ERROR-ALERT: RTMT Alert Name:CodeYellow Detail: From Fri Dec 07 11:04:39 PST 2007 to Sat Dec 08 14:57:28 PST 2007 on node sjc-rfd-sub-1, there are 1 CodeYellowEntry alarm(s) and 0 CodeYellowExit alarm(s) received. On Sat Dec 08 14:57:15 PST 2007, the last CodeYellowEntry alarm generated: CodeYellowEntry AverageDelay : 214 EntryLatency : 20 ExitLatency : 8 SampleSize : 10 TotalCodeYellowEntry : 1 HighPriorityQueueDepth : 0 NodeID : sjc-rfd-sub-1 App ID:Cisco AMC Service Cluster ID: Node ID:sjc-rfd-pub-1

Sample CodeYellowEntry Reasons: High IOWait
1. Due to tracing or disk fragmentation
Excessive # of trace files
Disk spacers are overwritten due to core files or other application traces
2. Due to swap activity
System is running out of memory
Memory leak condition
3. Due to other processes starving ccm out of CPU Resources
Trace collection or trace searching for events
4. Due to hard disk or raid array failure
Array accelerator is disabled due to battery failure

Sample CodeYellowEntry Reasons: CCM Runs Out of CPU
1. CCM process runs out of CPU
• SDLRouter thread which runs most of call processing is single threaded
• For example on a 4 Core Server, the SDLRouter thread can only utilize 25% of total CPU
• Inspect Proglogs along with RISDC performance data
2. Other process starves ccm out of CPU
• Inspect RISDC performance data process %CPU usage
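
The single-thread limit above is simple arithmetic. The Python sketch below assumes the thread's CPU figure is expressed as a share of total host CPU, as in the 4-core example; how a given RISDC counter is scaled should be confirmed before relying on it. Both function names are illustrative:

```python
def single_thread_ceiling(cores):
    """Largest share of total CPU (%) that one thread can consume on `cores` cores."""
    return 100.0 / cores


def thread_saturation(total_cpu_percent, cores):
    """Fraction of one core in use, given a thread's share of total CPU (%)."""
    return total_cpu_percent / single_thread_ceiling(cores)


print(single_thread_ceiling(4))     # 25.0
print(thread_saturation(24.0, 4))
```

A thread showing 24% of total CPU on a 4-core server is therefore almost out of headroom, even though the host as a whole looks mostly idle.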

admin:file dump activelog cm/trace/ccm/Proglogs/ccm001_100_008.ProgLog


ProgramStarted - 01/24/2017-08:40:28.823
ProgramInfo
Internal Name : (null)
Parent Process ID : 1
Process ID : 28710
Program Name : ccm
Thread Monitor Log
………
01/24/2017-08:42:12.288 | Started | 23080 | SDLRouter
………

CodeYellowEntry Due to SDLRouter Thread Out of CPU
• Due to tracing or disk fragmentation: ruled out
• IOWait is nominal
• No trace collection
• Disk fragmentation is nominal
• Application logs (CiscoSyslog/CallManager) are inspected to find out exactly when CodeYellowEntry occurred
Feb 11 12:27:58 vnt-cm1b local7 2 ccm: 85855: vnt-cm1b.cisco.com: Feb 11 2017 17:27:58.325 UTC : %UC_CALLMANAGER-2-CodeYellowEntry: %[AverageDelay=83][EntryLatency=20][ExitLatency=8][SampleSize=10][TotalCodeYellowEntry=16][HighPriorityQueueDepth=0][NormalPriorityQueueDepth=0][LowPriorityQueueDepth=1112][AppID=Cisco CallManager][ClusterID=VNT-CM1A-Cluster][NodeID=vnt-cm1b]: Unified CM has entered Code Yellow state
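
Alarms in this bracketed `[Key=Value]` format are easy to pull apart when scanning many syslog files for entry times. A minimal Python sketch, assuming the alarm text has been joined onto one line; `parse_alarm` and its regular expressions are illustrative and only cover the `%UC_CALLMANAGER` format shown here:

```python
import re


def parse_alarm(line):
    """Extract the alarm name and its [Key=Value] fields from a Unified CM syslog line."""
    name = re.search(r'%UC_CALLMANAGER-\d-(\w+):', line)
    fields = dict(re.findall(r'\[(\w+)=([^\]]*)\]', line))
    return (name.group(1) if name else None), fields


line = ('%UC_CALLMANAGER-2-CodeYellowEntry: %[AverageDelay=83][EntryLatency=20]'
        '[ExitLatency=8][SampleSize=10][LowPriorityQueueDepth=1112]'
        '[AppID=Cisco CallManager][NodeID=vnt-cm1b]: Unified CM has entered Code Yellow state')

alarm, fields = parse_alarm(line)
# How far the average delay is above the entry latency.
over_threshold = int(fields['AverageDelay']) / int(fields['EntryLatency'])
print(alarm, fields['NodeID'], over_threshold)
```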

CodeYellowEntry Due to SDLRouter Thread Out of CPU
• Proglogs inspected

01/24/2017-08:42:12.288 | Started | 23080 | SDLRouter

• RISDC performance data inspected closer


• Thread counter class

CodeYellowEntry Due to SDLRouter Thread Out of CPU
• CPU spike correlates to the CodeYellow entry time
• ✔ Thread(ccm_23080) matches the SDLRouter thread’s PID from the last ProgLog

Understanding CallManager SignalCongestion
• Very Similar to Code Yellow Entry / Exit Criteria
• Same Service Parameters are used
• Impacts SIP Signaling Only processed via SIP Handler Thread
New Calls & Options Pings are rejected with 503 Service Unavailable, Q.850 Cause Code = 42
• The depth of SDL queues in conjunction with the sample size is used to calculate average
expected delay to process a signal within the SIP Handler Thread
71623913.000 |15:02:58.567 |AppInfo |CMProcMon - TotalDelay = 653 for SIP Handler Thread

SignalCongestion Alarms and Alerts
Alarms
Feb 11 17:31:34 vnt-cm1b local7 2 ccm: 88012: vnt-cm1b.cisco.com: Feb 11 2017 22:31:34.484 UTC : %UC_CALLMANAGER-2-SignalCongestionEntry: %[Thread=SIP Handler Thread][AverageDelay=2184][EntryLatency=20][ExitLatency=8][SampleSize=10][TotalSignalCongestionEntry=9][HighPriorityQueueDepth=2][NormalPriorityQueueDepth=0][LowPriorityQueueDepth=0][AppID=Cisco CallManager][ClusterID=VNT-CM1A-Cluster][NodeID=vnt-cm1b]: Unified CM has detected signal congestion in an internal thread and has throttled activities for that thread
Feb 11 17:31:38 vnt-cm1b local7 5 ccm: 88014: vnt-cm1b.cisco.com: Feb 11 2017 22:31:38.496 UTC : %UC_CALLMANAGER-5-SignalCongestionExit: %[Thread=SIP Handler Thread][AverageDelay=0][EntryLatency=20][ExitLatency=8][SampleSize=10][TimeSpentInSignalCongestion=4][NumberOfCallsRejected=15054][TotalSignalCongestionExit=9][HighPriorityQueueDepth=0][NormalPriorityQueueDepth=0][LowPriorityQueueDepth=0][AppID=Cisco CallManager][ClusterID=VNT-CM1A-Cluster][NodeID=vnt-cm1b]: Unified CM has exited throttling caused by a previous signal congestion condition

Alert
Feb 11 17:31:44 vnt-cm1a local7 2 : 92: vnt-cm1a.cisco.com: Feb 11 2017 22:31:44.565 UTC : %UC_RTMT-2-RTMT_ALERT: %[AlertName=SyslogSeverityMatchFound][AlertDetail= At Sat Feb 11 17:31:34 EST 2017 on node vnt-cm1b.cisco.com, the following SyslogSeverityMatchFound events generated: #012SeverityMatch : Critical#012MatchedEvent : Feb 11 17:31:34 vnt-cm1b local7 2 ccm: 87037: vnt-cm1b.cisco.com: Feb 11 2017 22:31:34.568 UTC : %UC_CALLMANAGER-2-SignalCongestionEntry: %[Thread=SIP Handler Thread][AverageDelay=2184][EntryLatency=20][ExitLatency=8][SampleSize=10][TotalSignalCongestionEntry=9][HighPriorityQueueDepth=2][NormalPriorityQueueDepth=0][LowPriorityQueueDepth=0][AppID=Cisco CallManager][ClusterID=VNT-CM1A-Cluster][NodeID=vnt-cm1b]: Unified CM has detected signal congestion in an internal thread and has throttled activities for that thread #012AppID : Cisco Syslog Agent#012ClusterID : #012NodeID : vnt-cm1b#012 TimeStamp : Sat Feb 11 17:31:34][AppID=Cisco AMC Service][ClusterID=][NodeID=vnt-cm1a]: RTMT Alert

SignalCongestion Entry Due to VM CPU Contention
• Proglogs inspected

01/24/2017-08:42:12.562 | Started | 23089 | SdlThreadedProcess: SIPHandler(2,100,80,1)

• RISDC performance data inspected closer


• Thread counter class

SignalCongestion Entry Due to VM CPU Contention

• CPU spike correlates to the SignalCongestion entry time
• ✔ Thread(ccm_23089) matches the SIP Handler thread’s PID from the last ProgLog

Unified CM on UCS Performance Monitoring w/ vSphere
CPU
• vSphere Client Refresh Rate = 20 sec
• 35 msec / 20000 msec = 0.175 %
• Above 5% is BAD ☠️
• Ready Time is the amount of time a VM wants to run but could not because physical CPU resources could not be scheduled
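
The ready-time percentage above is the ready value in milliseconds divided by the length of the sample interval. A minimal Python sketch of the same arithmetic, assuming the 20-second refresh interval stated on the slide; `cpu_ready_percent` is an illustrative name, not a vSphere API:

```python
def cpu_ready_percent(ready_ms, interval_s=20):
    """Convert a CPU Ready value (ms) into a percentage of the sample interval."""
    return ready_ms / (interval_s * 1000) * 100


pct = cpu_ready_percent(35)
print(round(pct, 3))   # 0.175
print(pct > 5)         # False
```

A different chart interval changes the denominator, so the interval has to be passed in explicitly when the value does not come from the 20-second view.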
UC on UCS Specs-based 3rd Party Infrastructure
vCenter Performance Statistics Level

Virtualized Unified CM Performance Reservations Memory & CPU
• Consider increasing under shared environments (B-Series) with VMware DRS enabled, as long as all CPUs in the cluster have the same clock speed.
• Match to ESXi hosts’ CPU speed X vCPU required for Unified CM
• DO NOT modify OVA Reservations
• Hypervisor (ESXi) Swapping BAD ☠️

Reference
Unified CM on UCS Performance Monitoring w/ vSphere
VMware Support Log Collection

Troubleshooting Database Replication
Database Replication Setup and Status Monitoring With RTMT
• Key performance counters to monitor for replication status
  \Number of Replicates Created and State of Replication(ReplicateCount)\Replicate_State
  • Look for 2 (Good) on all nodes
  \Number of Replicates Created and State of Replication(ReplicateCount)\Replicates_Created
  • All nodes should have the same replicates created number as the publisher node
• DBReplicationFailure alert
  • AMC monitors Replicate_State Counter
  • Raised when counter is at
    • 3 – Replication Data Transfer is bad in the cluster
    • 4 – Replication setup did not succeed
  • By default it will raise one alert every 60 min from each Node
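
The two checks above (state 2 on every node, and the same Replicates_Created value as the publisher) can be applied to counter values exported from RTMT. A minimal Python sketch, assuming the counters have already been collected into a dictionary; the function name and the sample values are made up for illustration:

```python
GOOD = 2  # Replicate_State value to look for on all nodes


def replication_problems(counters, publisher):
    """counters: {node: (Replicate_State, Replicates_Created)} as read from RTMT."""
    expected = counters[publisher][1]
    problems = []
    for node, (state, created) in sorted(counters.items()):
        if state != GOOD:
            problems.append(f'{node}: Replicate_State={state}')
        if created != expected:
            problems.append(f'{node}: Replicates_Created={created}, publisher has {expected}')
    return problems


sample = {'pub': (2, 425), 'sub1': (2, 425), 'sub2': (3, 425)}  # made-up values
print(replication_problems(sample, 'pub'))
```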

Reference

Replicate_State How Does It Work?
• Every 1.5 min, DBMON updates a single local table named “replicationdynamic” with its node ID and a timestamp.
• This replicationdynamic table is replicated across the cluster. After updating the local replicationdynamic table, each node also checks for other nodes’ updates and their timestamps.
• If any node that completed replication setup fails to update this table for 1800 sec (30 min), DBMON will detect this and that node will change Replicate_State to 3
• Because each node checks all other nodes that have completed replication setup, you may see all nodes report 3 around the same time while one node shows 0 or 4
• DBMON traces
• MaintenanceTask::displayRealTimeReplicationCounter
admin:file search activelog cm/trace/dbl/sdi/dbmon*.txt MaintenanceTask::displayRealTimeReplicationCounter
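
The timestamp check described above can be sketched as follows. This is an illustration of the 1800-second rule only: it assumes the detecting node is the one that reports state 3, and the table layout, function names and sample timestamps are hypothetical rather than DBMON's actual implementation:

```python
STALE_AFTER_S = 1800   # 30 min without an update marks a node as bad


def stale_nodes(last_update, now):
    """Return nodes whose replicationdynamic timestamp is older than the threshold.

    last_update: {node_id: unix timestamp of that node's latest row}
    """
    return sorted(n for n, ts in last_update.items() if now - ts > STALE_AFTER_S)


def replicate_state(local_node, last_update, now):
    """3 if any other node that completed setup has gone stale, else 2."""
    others = [n for n in stale_nodes(last_update, now) if n != local_node]
    return 3 if others else 2


now = 10_000
rows = {'pub': now - 90, 'sub1': now - 60, 'sub2': now - 2000}   # hypothetical
print(stale_nodes(rows, now), replicate_state('pub', rows, now))
```

Because every healthy node runs the same check against the same replicated table, one stale node makes all the others report 3 at about the same time.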

Reference
Database Replication Setup and Status Monitoring With RTMT
• DB replication spaces to monitor
\Enterprise Replication DBSpace Monitors(DSN=ccm;: NodeName = bennet.primatech.cisco.com)\ERDbSpace_Used
\Enterprise Replication DBSpace Monitors(DSN=ccm;: NodeName = bennet.primatech.cisco.com)\ERSBDbSpace_Used
• When a subscriber node is down or disconnected, changes to the DB that need to be replicated will be queued in DB spaces
• If these spaces fill up, the database will go into blocked:DDR state
• To avoid this, we will start removing unreachable servers from the Continuous Data Replication (CDR) network
• Jan 4 16:10:49 bennet local7 3 : 0: Jan 04 21:10:48.233 UTC : %CCM_DB_LAYER-DB-3-IDSReplicationFailure: Combined alarm for emergency and error situations. IDS Replication has failed Event Class ID:-1 Event class message:ersb dbpace has exceeded 90 percent replicate to server g_claire_ccm has been removed Event Specific Message:Check network connection and/or IDS status on server g_claire_ccm . Otherwise contact technical support App ID:Cisco Database Layer Monitor Cluster ID:PrimatechCluster Node ID:bennet

Database CLI Commands
• utils dbreplication status
  • Runs a background script to check database replication setup. This utility compares each node’s tables to publisher’s. Output goes into a file like “activelog cm/trace/dbl/sdi/ReplicationStatus.113133.out”
• utils dbreplication repair all/nodename
  • Runs a background script to repair replication setup on a given nodename or all nodes. Nodename = hostname
• utils dbreplication reset all/nodename
  • Runs a background script to reset and setup replication on a given nodename or all nodes. Nodename = hostname
• utils dbreplication stop
  • Stops all dbreplication setup/repair/reset processes. Could take long on publisher
• utils dbreplication dropadmindb
  • This command is used to drop the Informix syscdr database on any server in the cluster
  • It should only be run if replication reset or cluster reset fails and replication cannot be restarted

Database CLI Commands
• utils dbreplication setrepltimeout (increase on large scale clusters)
  • Sets the timeout to start automatic DB replication setup after the first subscriber node contacts the publisher after a switchover following an upgrade or after a fresh install.
  • Defaults to 5 minutes
  • Preserved across reboots & upgrades
  • Remember to return it back to 5 min default pre Unified CM 10.X
  • Remember to set it prior to starting your Upgrade
  • Unified CM 10.X + Replication Setup Timeout is more intelligent
• show tech repltimeout
  • Shows the current Database Replication Setup Timeout value
• utils create report database (collect the output when you experience database problems)
  • Generates a detailed database status and replication report

Database CLI Commands
Best Practice: Collect the output from ALL NODES when you experience Database Replication Problems
• utils dbreplication runtimestate
  • What replication setup is doing, its progress, and error indication
  • Checks for TCP, RPC, and IDS-ER connectivity between nodes
  • Checks if actual data is being replicated between nodes
  • Compares DB version and tables across nodes
• utils dbreplication quickaudit
  • This command is a quick alternative, but not a replacement, to the existing “utils dbreplication status” command.
  • It executes some smart counts on selected dynamic tables to determine if a node’s DB is out of sync.
• utils dbreplication repairtable
  • This command can resync a single table if it is out of sync.
• utils dbreplication stop all
  • Stops all dbreplication setup/repair/reset processes on all nodes. When executed on the Publisher, it will work on all nodes as long as the Database Replicator Service (DBLRPC) is functional.

Database CLI Commands
• utils dbreplication forcedatasyncsub
• Use when utils dbreplication repair or reset fails to successfully complete.
• Should be preceded with utils dbreplication stop all on Publisher
• All local data on the subscriber will be overwritten with the data currently on the Publisher
• Could take a significant amount of time depending on Clustering Over Wan delay, bandwidth and # of
subscribers
• Subscriber(s) must be rebooted after completion of force data sync
• Automated Database Replication setup will start after reboot

Introduced in Unified CM 9.1
Best Practice
Database CLI Commands
• utils dbreplication rebuild [nodename | nodename1, nodename2,.., nodenameN | all ]
• This command will run a combination of the following commands on the specified servers
utils dbreplication stop
utils dbreplication dropadmindb or dropadmindbforce
utils dbreplication reset

Introduced in Unified CM 9.1

Database CLI Commands


• utils dbreplication setprocess
• This command will increase the parallel processing thread count of certain DB Replication Setup Tasks.
• Maximum Thread count we can set is 40
• Significant improvements to DB Replication Setup time in Large Clusters with Clustering Over WAN Delay
• Setting larger PROCESS option may consume more system resources especially in Large Clusters with
little to NO Delay in between Cluster nodes
• If Set prior to Upgrade the setting is persistent just like “utils dbreplication setrepltimeout”

Database CLI Commands
• show tech notify
• Show DB change notify subscription details.
• Collect the output if you receive DBChangeNotifyFailure Alert in addition to DBMON Traces

admin:show tech notify


-------------------- show tech notify --------------------
Database Change Notify Monitor
Msg I 0/2 P 8902 DB 0
0 I 0 P 6 H 6 T 6 S 5 DbTraceMon
1 I 0 P 0 H 0 T 0 S 2 DbIPsecMon
2 I 0 P 0 H 0 T 0 S 5 SERVICE_TOMCAT[127.0.0.1]:32798
3 I 0 P 11 H 11 T 11 S 2 LpmTool
4 I 0 P 0 H 0 T 0 S 2 License Manager Trace[127.0.0.1]:32931
5 I 0 P 0 H 0 T 0 S 2 DRF Local Trace[127.0.0.1]:32937
6 I 0 P 0 H 0 T 0 S 2 DRF Master Trace[127.0.0.1]:32940
284 I 0 P 0 H 0 T 0 S 58 ccm:Client_PID=30282[10.9.40.8]:36959

MSG I <inuse count/max inuse has ever been> P <processed> DB <count in DB>
<client index> I <inuse count/not consumed> H <head ptr> T <tail ptr> S <tables subscribed> <client name>
• run sql sql_statement
• Run a given SQL statement against the LOCAL database. The SQL statement cannot include any stored procedures.
• Example of run sql against a different DB (sysmaster):
run sql select sum(seg_blkfree) as blkfree, sum(seg_blkused) as blkused from sysmaster:syssegments
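
The per-client lines of the show tech notify output above follow a fixed layout (index, I, P, H, T, S, client name), so they are easy to check with a script. The following is a minimal Python sketch, not a Cisco tool: the names NotifyClient, parse_notify_clients and clients_with_backlog are hypothetical, and treating a non-zero I value as "messages not yet consumed" is an assumption based on the legend above.

```python
import re
from typing import NamedTuple


class NotifyClient(NamedTuple):
    index: int      # client index
    in_use: int     # I: in-use count (not consumed, per the legend)
    processed: int  # P: processed count
    head: int       # H: head pointer
    tail: int       # T: tail pointer
    tables: int     # S: number of tables subscribed
    name: str       # client name


# Matches lines such as "3 I 0 P 11 H 11 T 11 S 2 LpmTool".
CLIENT_RE = re.compile(
    r"^\s*(\d+)\s+I\s+(\d+)\s+P\s+(\d+)\s+H\s+(\d+)\s+T\s+(\d+)\s+S\s+(\d+)\s+(.+?)\s*$"
)


def parse_notify_clients(output: str) -> list:
    """Return one NotifyClient per client line; other lines are skipped."""
    clients = []
    for line in output.splitlines():
        m = CLIENT_RE.match(line)
        if m:
            numbers = [int(g) for g in m.groups()[:6]]
            clients.append(NotifyClient(*numbers, m.group(7)))
    return clients


def clients_with_backlog(clients: list) -> list:
    """Assumption: a non-zero in-use count means unconsumed notifications."""
    return [c for c in clients if c.in_use > 0]


SAMPLE = """\
Msg I 0/2 P 8902 DB 0
0 I 0 P 6 H 6 T 6 S 5 DbTraceMon
3 I 0 P 11 H 11 T 11 S 2 LpmTool
284 I 0 P 0 H 0 T 0 S 58 ccm:Client_PID=30282[10.9.40.8]:36959
"""

if __name__ == "__main__":
    for client in parse_notify_clients(SAMPLE):
        print(client.index, client.name, "subscribed tables:", client.tables)
```

A non-empty result from clients_with_backlog is only a hint to look closer at that client; it does not replace collecting the DBMON traces.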

Database Replication Setup and Status Monitoring
With CLI
admin:utils dbreplication runtimestate

DB and Replication Services: ALL RUNNING

Cluster Replication State: Replication status command started at: 2009-06-28-16-10


Replication status command in PROGRESS 135 tables checked out of 425
Processing Table: typeqsigvariant with 3 records

Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2009_06_28_16_10_14.out' to see the details

DB Version: ccm9_1_2_11900_12
Repltimeout set to: 300s
PROCESS option set to: 1

Cluster Detailed View from cl-cucm9-pub (3 Servers):

                             PING          CDR Server      REPL.  DBver&  REPL.  REPLICATION SETUP
SERVER-NAME    IP ADDRESS    (msec)  RPC?  (ID) & STATUS   QUEUE  TABLES  LOOP?  (RTMT) & details
-------------  ------------  ------  ----  --------------  -----  ------  -----  -------------------
cl-cucm9-pub   172.16.27.77  0.086   Yes   (2) Connected   0      match   Yes    (2) PUB Setup Completed
cl-cucm9-sub1  172.16.27.78  0.329   Yes   (3) Connected   0      match   Yes    (2) Setup Completed
cl-cucm9-sub2  172.16.27.79  0.427   Yes   (4) Connected   0      match   Yes    (2) Setup Completed
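
When the cluster has many nodes, the per-node rows of the detailed view above can be checked with a script instead of by eye. The sketch below is illustrative only: parse_runtimestate and unhealthy_nodes are hypothetical names, and the health criteria (RPC "Yes", CDR status "Connected", queue 0, tables "match", RTMT state 2) are assumptions modelled on the healthy output shown above.

```python
import re

# One row per node: name, IP, ping, RPC?, (CDR id) status, queue,
# tables, loop?, (RTMT state) details.
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<ping>\S+)\s+(?P<rpc>\S+)\s+"
    r"\((?P<cdr_id>\d+)\)\s+(?P<status>\S+)\s+(?P<queue>\d+)\s+(?P<tables>\S+)\s+"
    r"(?P<loop>\S+)\s+\((?P<rtmt>\d+)\)\s+(?P<details>.+?)\s*$"
)


def parse_runtimestate(output: str) -> list:
    """Return a dict per node row; lines that do not look like rows are skipped."""
    rows = []
    for line in output.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows.append(m.groupdict())
    return rows


def unhealthy_nodes(rows: list) -> list:
    """Names of nodes that differ from the healthy pattern (assumed criteria)."""
    bad = []
    for r in rows:
        healthy = (
            r["rpc"] == "Yes"
            and r["status"] == "Connected"
            and r["queue"] == "0"
            and r["tables"] == "match"
            and r["rtmt"] == "2"
        )
        if not healthy:
            bad.append(r["name"])
    return bad


SAMPLE = """\
cl-cucm9-pub 172.16.27.77 0.086 Yes (2) Connected 0 match Yes (2) PUB Setup Completed
cl-cucm9-sub1 172.16.27.78 0.329 Yes (3) Connected 0 match Yes (2) Setup Completed
cl-cucm9-sub2 172.16.27.79 0.427 Yes (4) Connected 0 match Yes (2) Setup Completed
"""

if __name__ == "__main__":
    print(unhealthy_nodes(parse_runtimestate(SAMPLE)))
```

Rows that do not fit the expected column layout are silently skipped, so a node that is missing from the parsed result also deserves a manual look.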

Database Replication Setup and Status Monitoring
With CLI
admin:utils dbreplication status

-------------------- utils dbreplication status --------------------

Replication status check is now running in background.


Use command 'utils dbreplication runtimestate' to check its progress

The final output will be in file cm/trace/dbl/sdi/ReplicationStatus.2009_06_28_16_10_14.out

Please use "file view activelog cm/trace/dbl/sdi/ReplicationStatus.2009_06_28_16_10_14.out" command to see the output

• This command checks CDR (Continuous Data Replication) connectivity and compares the content of every table with the content on the publisher.
• It could take a long time (hours) to complete in large clusters; use utils dbreplication quickaudit first.
• This database status check runs in the background, and its progress can be monitored via utils dbreplication runtimestate.
• The replicationdynamic table could be out of sync all the time; this can be ignored.

Database Replication Setup and Status Monitoring
With CLI—Good Case
admin:file view activelog cm/trace/dbl/sdi/ReplicationStatus.144821.out

SERVER ID STATE STATUS QUEUE CONNECTION CHANGED


-----------------------------------------------------------------------
g_bldr_ccm4_ccm 2 Active Local 0
g_bldr_ccm5_ccm 3 Active Connected 0 Sep 6 16:27:15
-------------------------------------------------

utils dbreplication status output

To determine if replication is suspect, look for the following:


(1) Number of rows in a table do not match on all nodes.
(2) Non-zero values occur in any of the other output columns for a table
(3) ***** PLEASE IGNORE MISMATCHES IN ReplicationDynamic TABLE *****

First a summary of replication servers and their server status is provided


------ Statistics for ccmdbtemplate_bldr_ccm4_ccm_1_3_alarmusertext ------
Node Rows Extra Missing Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_bldr_ccm4_ccm 0 0 0 0 0
g_bldr_ccm5_ccm 0 0 0 0 0

• This command displays the replication status output file. The replication status on this cluster is good, because all servers are either Local or Connected, and no tables show up as suspect.
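
The three rules listed in the status output can be applied mechanically to the per-table statistics blocks. The following Python sketch is one possible way to do that; suspect_tables is a hypothetical helper, and checking only the Extra, Missing and Mismatch columns (not Processed) for non-zero values is an assumption about what rule (2) is meant to catch.

```python
import re
from collections import defaultdict

# "------ Statistics for <table> ------"
HEADER_RE = re.compile(r"^-+\s+Statistics for (\S+)\s+-+$")
# "<node> <rows> <extra> <missing> <mismatch> <processed>"
ROW_RE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$")


def suspect_tables(report: str) -> list:
    """Return table names whose statistics look suspect."""
    stats = defaultdict(list)
    table = None
    for raw in report.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            table = header.group(1)
            continue
        row = ROW_RE.match(line)
        if row and table:
            rows, extra, missing, mismatch, _processed = map(int, row.groups()[1:])
            stats[table].append((rows, extra, missing, mismatch))

    suspects = []
    for name, entries in stats.items():
        # Rule (3): mismatches in ReplicationDynamic are ignored.
        if name.lower().endswith("replicationdynamic"):
            continue
        row_counts = {e[0] for e in entries}          # rule (1)
        nonzero = any(e[1] or e[2] or e[3] for e in entries)  # rule (2)
        if len(row_counts) > 1 or nonzero:
            suspects.append(name)
    return suspects


SAMPLE = """\
------ Statistics for ccmdbtemplate_bldr_ccm4_ccm_1_3_alarmusertext ------
Node Rows Extra Missing Mismatch Processed
---------------- --------- --------- --------- --------- ---------
g_bldr_ccm4_ccm 0 0 0 0 0
g_bldr_ccm5_ccm 0 0 0 0 0
"""

if __name__ == "__main__":
    print(suspect_tables(SAMPLE))
```

An empty list for the sample corresponds to the good case described above.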

Database Replication Setup and Status Monitoring
With CLI—Servers Out of Sync
admin:file view activelog cm/trace/dbl/sdi/ReplicationStatus.144821.out

SERVER ID STATE STATUS QUEUE CONNECTION CHANGED


-----------------------------------------------------------------------
g_bldr_ccm4_ccm 2 Active Local 0
g_bldr_ccm5_ccm 3 Active Connected 0 Sep 6 16:27:15
---------- Suspect Replication Summary ----------

For table: ccmdbtemplate_bldr_ccm4_ccm_1_27_processnode replication is suspect for node(s):
g_bldr_ccm5_ccm

For table: ccmdbtemplate_bldr_ccm4_ccm_1_34_replicationdynamic replication is suspect for node(s):
g_bldr_ccm5_ccm

Note: The processnode and replicationdynamic tables are suspect. The processnode mismatch is a problem, but remember that replicationdynamic can be ignored.
Reference
Database Replication Monitoring CM Database
Report Command “cdr list server”
• The state column can have the following values:

State      Description
Active     The server is active and replicating data
Deleted    The server has been deleted; it is not capturing or delivering data, and the queues are being drained
Quiescent  The server is in the process of being defined
Suspended  Delivery of replication data to the server is suspended

Reference
Database Replication Monitoring CM Database
Report Command “cdr list server”
• The status column can have the following values:

Status      Description
Connected   The server connection is up
Connecting  The server is attempting to connect
Disconnect  The server connection is down in response to an explicit disconnect
Dropped     The server connection is down due to a network error because the server is unavailable
Error       An error has occurred (check the log and contact customer support, if necessary)
Local       This server is the local server as opposed to a remote server
Timeout     The connection is down due to an idle timeout
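
The state and status values above lend themselves to a simple lookup when reviewing "cdr list server" rows. The sketch below is a hypothetical helper (check_server_row is not a Cisco command); it assumes whitespace-separated rows in the order SERVER, ID, STATE, STATUS, as in the status output shown earlier, and treats "Active" with "Local" or "Connected" as the healthy combination.

```python
STATE_MEANING = {
    "Active": "active and replicating data",
    "Deleted": "deleted; not capturing or delivering data, queues being drained",
    "Quiescent": "in the process of being defined",
    "Suspended": "delivery of replication data is suspended",
}

STATUS_MEANING = {
    "Connected": "connection is up",
    "Connecting": "attempting to connect",
    "Disconnect": "down in response to an explicit disconnect",
    "Dropped": "down due to a network error (server unavailable)",
    "Error": "an error has occurred; check the log",
    "Local": "this is the local server",
    "Timeout": "down due to an idle timeout",
}

# Assumption: these are the only combinations considered healthy.
HEALTHY_STATUSES = {"Local", "Connected"}


def check_server_row(row: str) -> dict:
    """Interpret one row: '<server> <id> <state> <status> ...'."""
    fields = row.split()
    name, server_id, state, status = fields[0], int(fields[1]), fields[2], fields[3]
    return {
        "server": name,
        "id": server_id,
        "healthy": state == "Active" and status in HEALTHY_STATUSES,
        "state": STATE_MEANING.get(state, "unknown state"),
        "status": STATUS_MEANING.get(status, "unknown status"),
    }


if __name__ == "__main__":
    for line in (
        "g_bldr_ccm4_ccm 2 Active Local 0",
        "g_bldr_ccm5_ccm 3 Active Connected 0 Sep 6 16:27:15",
    ):
        print(check_server_row(line))
```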

Reference

Database Replication Setup Monitoring Logs


• Cisco database replicator service logs
file list activelog cm/trace/dbl/* date detail

• During the first CDR define phase of replication setup you should see logs for each node in the cluster that establishes communication with Publisher DBMON
20170124_084901_vnt-cm1b_g_5_ccm11_5_1_12900_21_dbl_repl_cdr_define.log

• During the CDR realize template followed by sync/check phase of replication setup you should see logs for the group of nodes
20170124_085553_dbl_repl_cdr_Broadcast.log

• Once replication setup is complete look for this file
20170124_085553_dbl_repl_output_Broadcast.log
In this file you will find how long the replication setup took.
You will also find the exact commands used to set up replication.
This can tell you which nodes’ replication was in fact set up in this attempt/run.
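
When the log directory listing is long, the replication setup logs can be grouped by phase from their file names. The sketch below is illustrative: classify is a hypothetical helper, the phase labels are this sketch's own wording, and reading the leading digits as a YYYYMMDD_HHMMSS timestamp is an assumption based on the example file names above.

```python
import re
from datetime import datetime

# <timestamp>_[<node label>_]dbl_repl_<kind>.log
LOG_RE = re.compile(
    r"^(\d{8}_\d{6})_(?:(.+)_)?dbl_repl_(cdr_define|cdr_Broadcast|output_Broadcast)\.log$"
)

PHASES = {
    "cdr_define": "define",
    "cdr_Broadcast": "realize/sync/check",
    "output_Broadcast": "setup summary",
}


def classify(filename: str):
    """Return (start time, phase, node label or None), or None if not recognised."""
    m = LOG_RE.match(filename)
    if m is None:
        return None
    started = datetime.strptime(m.group(1), "%Y%m%d_%H%M%S")  # assumed format
    return started, PHASES[m.group(3)], m.group(2)


if __name__ == "__main__":
    names = [
        "20170124_084901_vnt-cm1b_g_5_ccm11_5_1_12900_21_dbl_repl_cdr_define.log",
        "20170124_085553_dbl_repl_cdr_Broadcast.log",
        "20170124_085553_dbl_repl_output_Broadcast.log",
    ]
    for name in sorted(names):
        print(classify(name))
```

Sorting by the parsed start time gives a quick timeline of a setup attempt; many define-phase entries for the same node label would be worth a closer look.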

Reference

Database Replication Logs to Collect


Traces and output to collect if you suspect a DB replication problem
1. Event Viewer-Application log (all nodes)
file get activelog syslog/CiscoSyslog*
2. Cisco Database Replicator Trace (publisher only)
file get activelog cm/trace/dbl/dbl_repl*
file get activelog cm/trace/dbl/sdi/startrpc.log
file get activelog cm/trace/dbl/sdi/replication_scripts_output.log
3. Cisco Informix database service (all nodes)
file get activelog cm/log/informix/ccm.log*
4. Cisco Database Layer Monitor (all nodes)
file get activelog cm/trace/dbl/sdi/dbmon*.txt
5. Cisco Abort Transaction Spooling (all nodes)
file get activelog cm/log/informix/ats/*.*
6. Cisco Row Information Spooling (all nodes)
file get activelog cm/log/informix/ris/*.*
7. Output of “utils dbreplication runtimestate” (all nodes)
8. Output of “utils dbreplication status” (publisher only)
• Performs a comparison of all tables from all nodes in the cluster. Identifies whether any of the tables have a mismatch.
• Could take a long time (over 1 hour) with clustering over WAN or large databases
9. Output of “show tech dbstateinfo” (all nodes)
• Generates a report that details the current database status
10. Output of “show tech activesql” (all nodes)
• Generates a report on all active SQL traces
11. Output of “utils create report database” (all nodes)
• Generates a report that collects all relevant database logs/traces
• Unified CM 9.X +
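
The collection list above differs between the publisher and the other nodes, which is easy to get wrong under pressure. The following Python sketch simply encodes the list as data and prints the commands for a given node role; collection_commands is a hypothetical helper, and the command strings are copied from the list above rather than verified against any particular release.

```python
# Commands to run on every node.
ALL_NODES = [
    "file get activelog syslog/CiscoSyslog*",
    "file get activelog cm/log/informix/ccm.log*",
    "file get activelog cm/trace/dbl/sdi/dbmon*.txt",
    "file get activelog cm/log/informix/ats/*.*",
    "file get activelog cm/log/informix/ris/*.*",
    "utils dbreplication runtimestate",
    "show tech dbstateinfo",
    "show tech activesql",
    "utils create report database",  # listed as Unified CM 9.X +
]

# Additional commands to run on the publisher only.
PUBLISHER_ONLY = [
    "file get activelog cm/trace/dbl/dbl_repl*",
    "file get activelog cm/trace/dbl/sdi/startrpc.log",
    "file get activelog cm/trace/dbl/sdi/replication_scripts_output.log",
    "utils dbreplication status",  # can take over an hour on large clusters
]


def collection_commands(is_publisher: bool) -> list:
    """Return the CLI commands to run on a node of the given role."""
    commands = list(ALL_NODES)
    if is_publisher:
        commands.extend(PUBLISHER_ONLY)
    return commands


if __name__ == "__main__":
    print("Publisher:")
    for cmd in collection_commands(True):
        print("  admin:" + cmd)
    print("Subscriber:")
    for cmd in collection_commands(False):
        print("  admin:" + cmd)
```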

Database Replication Reports to Run
• Cisco Unified Reporting
• Unified CM database status
• Replication status similar to RTMT
• Replication config files check across the cluster: hosts/rhosts/sqlhosts/service
• Unified CM database replication debug
• cdr list repl
• Reports can be downloaded in XML format and sent to TAC
Database Replication Service Dependencies
Check the Following Services to Ensure They Are Still Running on All Nodes
• A Cisco DB[STARTED]
admin:show process name cmoninit detail
PID PPID TID %CPU %MEM S USER MINFL MAJFL RSS VSZ STARTED COMMAND
6292 1 - 1.7 4.6 S informix 335 127155 189876 282252 Wed Feb 13 09:57:45 2008 /usr/local/cm/bin/cmoninit
• A Cisco DB Replicator[STARTED]
admin:show process search python
root 6596 1 0 Feb13 ? 00:00:00 /usr/bin/python /usr/local/cm/bin/dblrpc
• Cisco Database Layer Monitor[STARTED]
admin:show process name dbmon detail
PID PPID TID %CPU %MEM S USER MINFL MAJFL RSS VSZ STARTED COMMAND
6674 4886 - 0.2 0.2 S database 1764 3110 11188 476012 Wed Feb 13 09:58:29 2008 /usr/local/cm/bin/dbmon
• Cluster Manager[STARTED]
admin:show process name clm detail
PID PPID TID %CPU %MEM S USER MINFL MAJFL RSS VSZ STARTED COMMAND
13436 1 - 0.0 0.1 S root 4746 1679 7012 44288 Wed May 14 11:19:28 2008 /usr/local/platform/bin/clm/clm
• Look for core files as well as TCP/IP port usage through firewalls

Database Replication Service Dependencies
• Database replication also depends on communication between all nodes in the cluster
Internal firewall rules are managed by Cluster Manager
• Look for the following messages on each node
admin:file search activelog syslog/CiscoSyslog INJECTED
May 1 08:38:57 lg-pub-1 local7 6 : 17: May 01 12:38:57.264 UTC : %CCM_CLUSTERMANAGER-CLUSTERMANAGER-6-CLM_PeerState: Current ClusterMgr session state. Node's Name or IP:lg-sub-6 Node's State:POLICY_INJECTED App ID:Cisco Cluster Manager Cluster ID: Node ID:lg-pub-1

• To test connectivity between Subscriber and Publisher use
admin:utils network connectivity
This command can take up to 3 minutes to complete.
Continue (y/n)?y
Running test, please wait ...
Network connectivity test with the publisher completed successfully.
admin:
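
The CLM_PeerState syslog messages shown on this slide carry the peer name and its Cluster Manager session state, so collected syslog text can be summarised per peer. The sketch below is a minimal example: peer_states and peers_not_injected are hypothetical names, and it assumes each message sits on a single line in the collected log (the slide wraps it only for display).

```python
import re

PEER_RE = re.compile(
    r"CLM_PeerState:.*?Node's Name or IP:(\S+)\s+Node's State:(\S+)"
)


def peer_states(log_text: str) -> dict:
    """Map each peer to the last session state seen in the log text."""
    states = {}
    for line in log_text.splitlines():
        m = PEER_RE.search(line)
        if m:
            states[m.group(1)] = m.group(2)  # later lines overwrite earlier ones
    return states


def peers_not_injected(states: dict) -> list:
    """Peers whose last seen state is not POLICY_INJECTED."""
    return sorted(n for n, s in states.items() if s != "POLICY_INJECTED")


SAMPLE = (
    "May 1 08:38:57 lg-pub-1 local7 6 : 17: May 01 12:38:57.264 UTC : "
    "%CCM_CLUSTERMANAGER-CLUSTERMANAGER-6-CLM_PeerState: Current ClusterMgr "
    "session state. Node's Name or IP:lg-sub-6 Node's State:POLICY_INJECTED "
    "App ID:Cisco Cluster Manager Cluster ID: Node ID:lg-pub-1"
)

if __name__ == "__main__":
    states = peer_states(SAMPLE)
    print(states)
    print("Peers to investigate:", peers_not_injected(states))
```

Because "file search ... INJECTED" returns only matching lines, this kind of summary is more useful on the full CiscoSyslog text, where other states would also appear.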
Database Replication Service Dependencies
• Starting with Unified CM 8.6, the real-time Cluster Manager authentication state can be queried

admin:show network cluster


1.2.3.4 cm1b.cisco.com cm1b Subscriber callmanager DBSub authenticated using TCP since Mon Feb 16 12:13:40 2016
1.2.3.5 cm1c.cisco.com cm1c Subscriber callmanager DBSub authenticated using TCP since Wed Jun 3 19:17:56 2016
1.2.3.6 cm1a.cisco.com cm1a Publisher callmanager DBPub authenticated
1.2.3.7 cups1b.cisco.com cups1b Subscriber cups DBSub authenticated using TCP since Wed Jun 3 19:18:17 2016
1.2.3.8 cups1a.cisco.com cups1a Subscriber cups DBPub authenticated using TCP since Thu Mar 5 22:47:49 2016

Server Table (processnode) Entries


----------------------------------
cm1a.cisco.com
cm1b.cisco.com
cm1c.cisco.com
1.2.3.7
1.2.3.8
admin:

Database Replication Recovery with Unified CM 9.0+
• DB replication setup is Automated
• Always check “utils dbreplication runtimestate” output first
• Ensure DBLRPC Connectivity is Good
• Observe the Database Replicator Logs
• file list activelog cm/trace/dbl/* date detail
• Take action only if you observe an irrecoverable fault, such as:
• Repeated cdr_define logs
• Repeated errors in the output_Broadcast log

On the Publisher:
admin:utils dbreplication rebuild [nodename | nodename1,nodename2,..,nodenameN | all]

• Related sessions
• BRKUCC-2011 - Best Practices for Migrating Previous Versions of Unified CM to Version 11
• BRKUCC-2932 - Troubleshooting SIP with Cisco Unified Communications
