A Quick Guide For PANOS Troubleshooting - v2

This document provides a quick guide to troubleshooting issues with PANOS firewalls. It describes the key daemons that run on the management plane (MP) and data plane (DP) of PANOS version 5.0. It outlines common symptoms that may occur like configuration commit failures, daemon crashes, lockups, and resource leaks. It provides examples of CLI commands that can help diagnose specific problems on the MP and DP.

A quick guide for PANOS troubleshooting

PANOS version 5.0

Sept 2013
Yonghui Cheng

1 MP quick walk-through

In a nutshell, the PANOS MP (management plane) software runs on a Linux PC.

Daemons:

• masterd
Manages all other daemons. Use the CLI command “show system software status” to show the status of all daemons.
• sysd
Manages inter-daemon communication.
• mgmtsrvr
Management backend; takes care of configuration management, commit, reporting, etc.
• devsrvr
Takes care of pushing configuration to the DP, plus miscellaneous communication with the DP, such as URL
filtering requests and responses.
• useridd
User-ID features such as communicating with User-ID agents; can also act as an agent to other
firewalls.
• authd
All user authentication, account lockout, etc.
• ha-agent
Manages HA status, configuration sync, etc.
• logrcvr
Records traffic and threat logs sent by the DP, compresses log blocks and generates an index
on the blocks on the fly.
• varrcvr
Receives pcaps sent by the DP, receives files sent from the DP and forwards files to the WildFire cloud.
• l3svc
Serves web pages for captive portal, NTLM auth, the URL admin override page and the URL block page.
• websrvr
Serves web pages for the admin UI.
• sslvpn
Serves web pages for the GlobalProtect feature.
• rasmgr
Backend logic for the GlobalProtect feature.
• sslmgr
Fulfills OCSP and CRL query requests from daemons and the DP; manages the OCSP and CRL repository.
• routed
Routing daemon and dynamic routing state machine.
• cryptod
Encrypts/decrypts passwords, private keys, etc. so that they can be included in the config file.
• ikemgr/keymgr
ISAKMP daemon and IPSec key repository management.

Cron jobs:

• Log indexing, summary-gen
Runs every 15 minutes
• Report-gen
Once per day, configurable
• Content/AV/URL update
Once per day, configurable

2 MP Typical Symptoms
When the system needs troubleshooting, start by identifying the main symptom.

2.1 Configuration commit failure

A configuration commit is a collective process that involves almost all MP and DP daemons, so it can fail for
various reasons. It is, however, managed by “mgmtsrvr” as “jobs”, and the job status can be shown
through both the CLI and the WebUI.

There are special types of commit, such as “auto-commit”, “HA-Sync” and “commit-all”, which are
triggered by data plane boot-up, an HA peer commit and a Panorama commit respectively.
Different actions are involved depending on the type.

A configuration commit involves config preprocessing, validation, phase-1 and phase-2. Other daemons
participate in phase-1 and phase-2 as “management clients”. So in order to pinpoint the root cause of a
commit failure, the first step is to identify which step or party rejected or failed the commit.
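
The idea of finding the step or party that failed first can be sketched in Python. The stage order comes from the paragraph above; the StageResult record and the way it gets filled in are hypothetical and are not a PANOS data format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Commit stages in the order described in the text.
COMMIT_STAGES = ["preprocessing", "validation", "phase-1", "phase-2"]


@dataclass
class StageResult:
    """One outcome record (hypothetical structure, filled in by hand
    from what the job details and daemon logs report)."""
    stage: str        # one of COMMIT_STAGES
    party: str        # daemon or management client that reported
    ok: bool
    detail: str = ""


def first_failure(results: List[StageResult]) -> Optional[StageResult]:
    """Return the earliest failing record in stage order, or None."""
    failures = [r for r in results if not r.ok]
    if not failures:
        return None
    return min(failures, key=lambda r: COMMIT_STAGES.index(r.stage))
```

Later failures are often just consequences of the first one, which is why the sketch sorts by stage order rather than by the order in which errors were noticed.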

CLI examples:
show job all
show job id <#>
show management-clients

2.2 MP daemon crash

To find out which daemon crashed, check for backtrace files. Usually in this case gathering the tech support
tarball plus the core file is good enough. Knowing how to recreate the crash is usually the key
to further troubleshooting.

If the crash is associated with a config change or commit, getting the “candidate” config is also important.

CLI examples:
show system files
scp export core-file control-plane from …

2.3 MP daemon not responsive

When this happens, it might be noticed through timeouts when loading WebUI pages or executing CLI
commands. Further validation is required before declaring that a daemon has stopped responding. Sometimes the
daemon might keep printing the same debug log over and over.

If the daemon enters an infinite loop, you should see constant high CPU usage, possibly
accompanied by repeated messages in the debug log. If the daemon is deadlocked or waiting
indefinitely, its CPU usage should not be high, and debug logging might appear to have stopped or
to be looping.
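
The distinction above can be written down as a rough triage rule. This is a minimal sketch: the CPU samples are assumed to be read by hand from “show system resource follow”, and both thresholds are illustrative values, not PANOS settings.

```python
from typing import Sequence


def classify_unresponsive(cpu_samples: Sequence[float],
                          high_pct: float = 90.0,
                          low_pct: float = 10.0) -> str:
    """Rough triage of an unresponsive daemon from its CPU usage (%).

    high_pct and low_pct are assumed thresholds for "constantly high"
    and "not high" CPU usage.
    """
    if not cpu_samples:
        raise ValueError("need at least one CPU sample")
    if min(cpu_samples) >= high_pct:
        # Every sample is high: consistent with an infinite loop.
        return "possible infinite loop"
    if max(cpu_samples) <= low_pct:
        # Every sample is low: consistent with a deadlock or a wait.
        return "possible deadlock or indefinite wait"
    return "inconclusive"
```

A result of "inconclusive" means more samples or a backtrace are needed before drawing a conclusion.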

Since most daemons are multi-threaded, it is possible that only part of the functionality is lost.

By taking multiple backtraces of the daemon at short intervals and analyzing them, it is possible to
tell where the code got stuck. The next step is to generate a coredump file for that daemon for further
analysis.
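
Comparing several backtraces of one thread can be sketched as finding the frames that never change between samples. The input format here (a list of function names per sample, innermost first) is an assumption for illustration; real backtrace files need to be reduced to this form first.

```python
from typing import List, Sequence


def common_outer_frames(samples: Sequence[Sequence[str]]) -> List[str]:
    """Return the call-stack frames shared by every sample.

    Each sample lists function names innermost-first, so the shared
    part is the longest common suffix (the outermost callers). The
    first element of the result is the deepest frame that stayed the
    same in all samples, a hint for where the thread is stuck.
    """
    if not samples:
        return []
    reversed_stacks = [list(reversed(s)) for s in samples]
    shared = []
    for frames in zip(*reversed_stacks):
        if len(set(frames)) != 1:
            break
        shared.append(frames[0])
    return list(reversed(shared))
```

If the whole stack is identical in every sample, the thread is most likely blocked at the innermost frame; if only the outer frames match, the thread is probably looping inside the deepest shared function.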

To restore the functionality, try to restart the daemon manually.

CLI examples:
show system software status
show system resource follow
debug software trace <daemon>
debug software core <daemon>
debug software restart <daemon>

2.4 MP daemon memory leak

A memory leak might be noticed through a general system slowdown or an inability to perform certain
operations. To check current memory usage, run the “show system resource follow” command and press
“M” to sort by memory usage. Some fluctuation of memory usage is expected under normal use, so it
is necessary to establish a baseline memory usage by checking the log file “mp-monitor.log”.
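
One way to use such a baseline is sketched below. It assumes the memory readings for one daemon have already been extracted by hand (for example from “mp-monitor.log”) into plain lists of KB values; the median baseline and the 25% tolerance are illustrative choices, not PANOS behavior.

```python
from statistics import median
from typing import Sequence


def exceeds_baseline(history_kb: Sequence[int],
                     recent_kb: Sequence[int],
                     tolerance: float = 0.25) -> bool:
    """True if every recent sample is above baseline * (1 + tolerance).

    history_kb: past memory readings for one daemon, in KB.
    recent_kb:  the latest readings to judge.
    tolerance:  assumed allowance for normal fluctuation.
    """
    if not history_kb or not recent_kb:
        raise ValueError("history and recent samples are required")
    baseline = median(history_kb)
    limit = baseline * (1 + tolerance)
    return all(sample > limit for sample in recent_kb)
```

Requiring all recent samples to be above the limit keeps a single spike from being reported as a leak.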

There is no good way to troubleshoot a memory leak in the field, so if no other issues need
attention, the only thing that might help is to get as much information as possible about the box’s past
activity. In addition, acquiring a coredump file for the daemon in question is also helpful.

By specifying a limit on a daemon’s virtual memory size, the admin can make the system restart a daemon that
is leaking memory.

CLI examples:
debug software virt-limit service <daemon name> limit <size_in_KB>
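
A small helper can derive the limit from a known baseline and print the command line in the syntax shown above. The headroom factor is an assumption; it should be large enough that normal peaks in virtual memory size stay below the limit.

```python
def virt_limit_command(daemon: str, baseline_kb: int,
                       headroom: float = 2.0) -> str:
    """Build the virt-limit CLI line with limit = baseline * headroom.

    baseline_kb: normal virtual memory size of the daemon, in KB.
    headroom:    assumed safety factor (illustrative default).
    """
    if baseline_kb <= 0 or headroom < 1.0:
        raise ValueError("baseline must be positive and headroom >= 1")
    limit_kb = int(baseline_kb * headroom)
    return f"debug software virt-limit service {daemon} limit {limit_kb}"
```

For example, virt_limit_command("useridd", 400000) produces a line that limits useridd to 800000 KB.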

2.5 MP daemon resource leak (other than memory)

Other resources a daemon might leak include sockets, file descriptors etc. In this case, the most
important thing is to find out how to reproduce the leak.

2.6 MP lockup

If the MP stops responding, for example it does not respond to ping or cannot be logged in to through the
serial console, it is possibly due to kernel issues. In this case, ask the customer to monitor the messages
printed on the serial console.

2.7 Maintenance mode

3 DP quick walk-through

In a nutshell, the PANOS DP (data plane) software runs on a Linux PC with multiple CPUs, plus various hardware
engines to offload and accelerate networking, security and content processing.

Daemons:

• supervisor
Initializes DP engines and memory pools.
• sysdagent
Communicates with sysd on the MP.
• brdagent
Configures, manages and monitors peripheral chips.
• comm/pan_comm
Communicates with devsrvr; participates in commit and other config changes.
• dha/pan_dha
Implements link/path monitoring, status changes on interfaces, etc.
• mprelay
Communicates with routed, keymgr, etc.; implements VPN and PBF monitoring.
• pan_tasks
The packet forwarding daemons; they run on dedicated CPU cores.

Typical CPU core assignment

• Core 0:
Generic daemons other than pan_tasks
• Core 1:
flow_mgmt, the pan_task dedicated to session management
• Core 2+:
Regular pan_tasks that can process network traffic
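
The typical assignment above can be expressed as a small lookup. The function names and role strings are hypothetical helpers for illustration; actual assignments can differ by platform.

```python
from typing import List


def core_role(core: int) -> str:
    """Return the typical role of a DP CPU core, per the list above."""
    if core < 0:
        raise ValueError("core index must be non-negative")
    if core == 0:
        return "generic daemons"
    if core == 1:
        return "flow_mgmt"
    return "pan_task"


def traffic_cores(total_cores: int) -> List[int]:
    """Cores left for regular pan_tasks on a DP with total_cores cores."""
    return [c for c in range(total_cores) if core_role(c) == "pan_task"]
```

For a DP with 6 cores, traffic_cores(6) leaves cores 2 through 5 for regular traffic processing.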

4 DP Typical Symptoms

4.1 DP daemon crash

The dataplane (DP) has two types of daemons: packet processing (pan_task) and other daemons. Each packet
processing task exclusively occupies one CPU core, from core 1 upward. CPU core 0 is reserved for the
“other” type of processes and runs no packet processing tasks.

A DP daemon can crash for various reasons, and a crash can be confirmed by checking for backtrace
files. However, due to compiler optimization, the backtrace file usually does not contain sufficient
information to determine the root cause, and due to the use of shared memory, the coredump files might not
give complete information either. It is still good practice to collect the backtrace file and
coredump file (and pcap files if available).

Usually the crash is the result of a particular traffic pattern combined with a particular configuration, so if
possible, try to tweak the configuration to prevent the crash from happening too frequently.

4.2 DP restart (data-plane down or lost heartbeat)

The dataplane can restart altogether for various reasons. For example, repeated daemon crashes can trigger the
monitoring software to take escalated action and restart the DP. In such cases the issue should be treated as a
DP daemon crash issue.
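
The escalation idea can be illustrated with a sliding-window crash counter. This is only a sketch of the general technique: the class, the thresholds and the action names are hypothetical and do not describe the actual PANOS monitoring logic.

```python
from collections import deque


class CrashEscalator:
    """Escalate when too many crashes occur within a time window.

    max_crashes and window_s are illustrative values, not PANOS
    defaults.
    """

    def __init__(self, max_crashes: int = 3, window_s: float = 300.0):
        self.max_crashes = max_crashes
        self.window_s = window_s
        self._times = deque()

    def record_crash(self, now_s: float) -> str:
        """Record a crash at time now_s; return the action to take."""
        self._times.append(now_s)
        # Drop crashes that fell out of the sliding window.
        while self._times and now_s - self._times[0] > self.window_s:
            self._times.popleft()
        if len(self._times) >= self.max_crashes:
            self._times.clear()
            return "restart dataplane"
        return "restart daemon"
```

Seen from the outside, the final DP restart is only the last step of such an escalation, so the daemon crashes that preceded it are what needs to be investigated.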

It is possible for the DP to restart by itself due to severe memory corruption. In this case, the only clue
left might be some error messages in “dataplane-consoleoutput.log”.

Sometimes the DP might appear unresponsive from the perspective of the MP monitoring software. In this case the
DP is rebooted automatically in order to recover. This can be the result of a real
hardware failure, but more often it is caused by either the MP or the DP being overloaded, or by some other bug.

It is important to determine which side (DP or MP) contains the root cause of the issue, which might not be
easy to tell. Gathering the tech support tarball is the best way to start the investigation.

4.3 DP resource leak or resource shortage

Dataplane packet, memory and buffer usage can be checked with the following CLI commands. A resource
shortage can cause the corresponding operation to fail or even malfunction. A resource leak will
eventually cause a permanent resource shortage.
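
One way to tell a leak from a temporary shortage is to watch the low-water mark of a resource over time: under a leak, usage no longer falls back to its earlier minimum. The sketch below assumes usage readings for one pool have been collected by hand at regular intervals; the window size and the three-window minimum are illustrative.

```python
from typing import List, Sequence


def low_water_marks(usage: Sequence[int], window: int) -> List[int]:
    """Minimum usage in each consecutive full window of samples."""
    return [min(usage[i:i + window])
            for i in range(0, len(usage) - window + 1, window)]


def looks_like_leak(usage: Sequence[int], window: int = 4) -> bool:
    """True if the low-water mark rises in every successive window.

    Needs at least three full windows to say anything.
    """
    marks = low_water_marks(usage, window)
    return len(marks) >= 3 and all(b > a for a, b in zip(marks, marks[1:]))
```

Bursty traffic can push usage very high without being a leak, which is why the sketch looks at the minimum of each window and not at the peaks.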

CLI examples:
debug dataplane pool statistics
show system setting ctd state
show system setting ssl-decrypt memory

4.4 Network performance issue

In this case, no obvious bottleneck is observed on the box, yet the network throughput is far below,
or the latency far above, what is expected. This is usually due to packet loss or excessive latency.

There can be many reasons for packet loss, so finding out what is causing it is the key. Many tools
on the box can help with this; the most relevant are packet-diag pcap and
global counters. It is also important to validate that the packet loss is introduced by the box and not by other
devices on the network. Keep in mind that packets can be dropped by any network
device along the path, or even by a cable. Packets can also be forwarded incorrectly.
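
When working with global counters, the useful information is what changed during the test window. A minimal sketch, assuming two counter snapshots have already been turned into name-to-value dictionaries (parsing the CLI output is left out, and the counter names in any sample data are made up):

```python
from typing import Dict


def counter_deltas(before: Dict[str, int],
                   after: Dict[str, int]) -> Dict[str, int]:
    """Return the counters that changed between two snapshots.

    Counters missing from the first snapshot are treated as 0.
    """
    deltas = {}
    for name, value in after.items():
        change = value - before.get(name, 0)
        if change != 0:
            deltas[name] = change
    return deltas
```

If no drop-related counter moves while the loss is being observed, the loss is more likely introduced somewhere else along the path.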

When the QoS feature is enabled, it might introduce latency and packet drops as well.

4.5 DP performance issue

In this case there are clear signs of a bottleneck on the box, such as DP CPU usage that is too high, a
saturated hardware engine, or excessive queue length.

Some traffic patterns are known to cause high CPU usage, such as zip decompression, SSL decryption,
VPN, software-based content scanning and DNS traffic scanning.

Generic steps to troubleshoot this type of issue usually start with applying “app-override” to some traffic. This is
an important step as it isolates session setup problems from layer-7 scanning problems. By reducing CPU load,
it can also provide insight into whether other problems are involved at the same time.

4.6 Recoverable/intermittent network issue

If the customer is using a script to monitor their network and reports an intermittent packet drop problem, the
ideal solution is to ask them to integrate some CLI commands into their script so that the failure state can be
captured. Relevant CLI commands are:

“show counter global” and “show counter global filter delta yes”

“show session info” and “show session all”

More sophisticated scripts can be developed to start/stop packet-diag debugging.
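
A monitoring script of this kind could be structured as below. This is a sketch under stated assumptions: probe and run_cli are hypothetical callables supplied by the customer (for example a ping test and an SSH wrapper), since the guide does not say how the script reaches the firewall. Only the CLI command strings come from the list above.

```python
import time
from typing import Callable, Dict, List

# Commands to capture at the moment of failure (from the list above).
CAPTURE_COMMANDS = [
    "show counter global filter delta yes",
    "show session info",
]


def monitor(probe: Callable[[], bool],
            run_cli: Callable[[str], str],
            checks: int,
            interval_s: float = 1.0,
            sleep: Callable[[float], None] = time.sleep
            ) -> List[Dict[str, str]]:
    """Run probe() `checks` times; capture CLI output on each failure.

    probe:   returns True while the network check passes.
    run_cli: sends one CLI command to the firewall, returns its output.
    """
    captures = []
    for i in range(checks):
        if not probe():
            snapshot = {cmd: run_cli(cmd) for cmd in CAPTURE_COMMANDS}
            snapshot["captured_at"] = time.strftime("%Y-%m-%d %H:%M:%S")
            captures.append(snapshot)
        if i < checks - 1:
            sleep(interval_s)
    return captures
```

Capturing right at the moment the probe fails matters because the counters and session state of an intermittent problem may look normal again a few seconds later.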

4.7 datapath HW component issues

Some platforms have specialized hardware chips to help accelerate packet parsing, flow cut-through
and/or special operations such as DFA/AHO pattern matching.

Phy/Mac chips:

• PA-4060 uses the “Puma FPGA” as MAC chip; other PA-4000 models use Vitesse
• PA-500, PA-2000, PA-3000 and PA-5000 use Marvell as Phy/MAC chip
• PA-5000 also uses Petra as 10G Phy/Mac
• PA-200 uses the GMX interface on Octeon

NP (network processor) chip:

The main features of the NP are to support basic packet ingress (parsing, logical interface matching and
classification), flow cut-through (flow match, packet header rewrite such as NAT and TTL) and packet
forwarding (route, ARP and MAC lookup).

• PA-4000 uses “EZ chip”, which supports traffic management
• PA-2000 uses the “Lion” FPGA
• PA-5000 uses the “Tiger” FPGA, and Petra supports traffic management
• PA-3000 uses the “Liger” FPGA
• Other platforms use software for these tasks

Troubleshooting specific chips usually requires specific knowledge of the platform and components
involved. However, in most cases it is still possible to narrow the issue down to a specific chip, or to a specific
component/engine of that chip.
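
As a first narrowing step, the platform-to-chip lists in this section can be kept as a lookup table. The data below is copied from those lists; the function name and the treatment of PA-4060 as sharing the PA-4000 NP are assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

# Phy/MAC chips per platform, as listed in this section.
PHY_MAC: Dict[str, List[str]] = {
    "PA-200": ["GMX interface on Octeon"],
    "PA-500": ["Marvell"],
    "PA-2000": ["Marvell"],
    "PA-3000": ["Marvell"],
    "PA-4000": ["Vitesse"],
    "PA-4060": ["Puma FPGA"],
    "PA-5000": ["Marvell", "Petra (10G)"],
}

# NP chips per platform; platforms not listed use software.
NP: Dict[str, str] = {
    "PA-2000": "Lion FPGA",
    "PA-3000": "Liger FPGA",
    "PA-4000": "EZ chip",
    "PA-5000": "Tiger FPGA",
}


def datapath_chips(platform: str) -> Tuple[List[str], str]:
    """Return (Phy/MAC chips, NP chip or "software") for a platform."""
    key = platform.upper()
    if key not in PHY_MAC:
        raise KeyError(f"unknown platform: {platform}")
    # Assumption: PA-4060 belongs to the PA-4000 series and shares its NP.
    np_key = "PA-4000" if key == "PA-4060" else key
    return PHY_MAC[key], NP.get(np_key, "software")
```

For example, datapath_chips("PA-5000") returns both Marvell and Petra as Phy/MAC candidates and the Tiger FPGA as the NP.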
