Percona Monitoring and Management 2.17.0 (PDF)
Table of contents
1. Welcome
1.1 Setting up
2. Setting up
2.1 Setting up
2.2 Server
2.3 Client
3. Using
4. How to
5. Details
6. FAQ
6.2 What are the minimum system requirements for PMM?
6.5 How often are NGINX logs in PMM Server rotated?
6.9 Can I add an AWS RDS MySQL or Aurora MySQL instance from a non-default AWS partition?
6.10 How do I troubleshoot communication issues between PMM Client and PMM Server?
6.13 How do I use a custom Prometheus configuration file inside PMM Server?
6.15 What are my login credentials when I try to connect to a Prometheus Exporter?
6.16 How to provision PMM Server with non-default admin password?
1. Welcome
Percona Monitoring and Management (PMM) is a free, open-source monitoring tool for MySQL,
PostgreSQL, MongoDB, and ProxySQL, and the servers they run on. PMM helps you improve the
performance of databases, simplify their management, and strengthen their security.
• Drill down to discover the cause of inefficiencies, anticipate performance issues, or troubleshoot
existing ones
PMM is efficient, quick to set up and easy to use. It runs in cloud, on-prem, or across hybrid platforms. It is
supported by our legendary expertise in open source databases, and by a vibrant developer and user
community.
1.1 Setting up
PMM Server can run as:
• a Docker container
• a virtual machine
PMM Client runs on all hosts you want to monitor. The setup varies according to the type of system:
• MongoDB
• PostgreSQL
• Amazon RDS
• Microsoft Azure
• ProxySQL
• Linux
• External services
• HAProxy
Quickstart installation
PMM Server
PMM Server is the heart of PMM. It receives data from clients, collates it and stores it. Metrics are drawn as
tables, charts and graphs within dashboards, each a part of the web-based user interface.
PMM Client
PMM Client runs on every database host or node you want to monitor. The client collects server metrics,
general system metrics, and query analytics data, and sends it to the server.
(Architecture diagram: PMM Server within Percona Platform, connecting PMM Clients for MySQL, MongoDB, PostgreSQL, ProxySQL, Amazon RDS, Microsoft Azure, Google Cloud Platform, Linux, External Services, and HAProxy.)
2. Setting up
2.1 Setting up
The PMM setting-up process can be broken into three key stages:
You must set up at least one PMM Server. A server can run:
• with Docker
• as a virtual appliance
You must set up PMM Client on each node where there is a service to be monitored. You can do this:
You must configure your services and add them to PMM Server’s inventory of monitored systems. This is
different for each type of service:
• MySQL and variants (Percona Server for MySQL, Percona XtraDB Cluster, MariaDB)
• MongoDB
• PostgreSQL
• ProxySQL
• Amazon RDS
• Microsoft Azure
• Linux
• External services
• HAProxy
If you have configured everything correctly, you’ll see data in the PMM user interface, in one of the
dashboards specific to the type of service.
2.2 Server
System requirements
Disk
Approximately 1 GB of storage per monitored database node with data retention set to one week. By default,
retention is 30 days.
Memory
A minimum of 2 GB per monitored database node. The increase in memory usage is not proportional to the
number of nodes. For example, data from 20 nodes should be easily handled with 16 GB.
Architecture
Your CPU must support the SSE4.2 instruction set, a requirement of ClickHouse, a third-party column-oriented
database used by Query Analytics. If your CPU lacks this instruction set, you won't be able to
use Query Analytics.
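A quick way to check this on Linux is to look for the sse4_2 flag in /proc/cpuinfo (a sketch; lscpu also lists CPU flags):

```shell
# Check whether the CPU advertises SSE4.2, required by ClickHouse
# (and therefore by Query Analytics). Linux-only sketch.
if grep -qm1 sse4_2 /proc/cpuinfo; then
  echo "SSE4.2: supported"
else
  echo "SSE4.2: not supported"
fi
```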
• with Docker
• as a virtual appliance
When PMM Server is running, set up PMM Client for each node or service.
Warning: We highly recommend you review get-pmm2.sh before running it on your system, to ensure its
content is as expected.
2.2.2 Docker
We maintain a Docker image for PMM Server. This section shows how to run PMM Server as a Docker
container, directly and with Docker compose. (The tags used here are for the latest version of PMM 2
(2.17.0). Other tags are available.)
System requirements
Software
PMM Server expects the data volume (specified with --volume ) to be /srv . Using any other value will
result in data loss when upgrading.
Optionally, you can enable HTTP (insecure) by including --publish 80:80 in the above docker run
command. However, PMM Client requires TLS to communicate with the server, so it will only work
on the secure port.
You can disable manual updates via the Home Dashboard PMM Upgrade panel by adding -e DISABLE_UPDATES=true to the docker run command.
4. In a web browser, visit https://<server hostname>:443 (or http://<server hostname>:80 if optionally enabled) to
see the PMM user interface.
It is possible to change some server settings by using environment variables when starting the Docker
container. Use -e var=value in your docker run command.
Ignored variables
These variables will be ignored by pmm-managed when starting the server. If any other variable is found, it
will be considered invalid and the server won’t start.
You can test a new release of the PMM Server Docker image by making backups of your current pmm-server
and pmm-data containers which you can restore if you need to.
With jq :
5. (Optional) Repeat step 1 to confirm the version, or check the PMM Upgrade panel on the Home
Dashboard.
Restore
2. Restore backups.
3. Restore permissions.
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R root:root /srv && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R pmm:pmm /srv/alertmanager && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R root:pmm /srv/clickhouse && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R grafana:grafana /srv/grafana && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R pmm:pmm /srv/logs && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R postgres:postgres /srv/postgres && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R pmm:pmm /srv/prometheus && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R pmm:pmm /srv/victoriametrics && \
sudo docker run --rm --volumes-from pmm-data -it percona/pmm-server:2 chown -R postgres:postgres /srv/logs/postgresql.log
version: '2'
services:
  pmm-server:
    image: percona/pmm-server:2
    hostname: pmm-server
    container_name: pmm-server
    restart: always
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"
    ports:
      - "443:443"
    volumes:
      - data:/srv
volumes:
  data:
2. Run:
sudo docker-compose up
3. Access PMM Server on https://X.X.X.X:443 where X.X.X.X is the IP address of the PMM Server host.
2. Remove containers.
If the host where you will run PMM Server has no internet connection, you can download the Docker image
on a separate (internet-connected) host and securely copy it.
1. On an internet-connected host, download the Docker image and its checksum file.
wget https://downloads.percona.com/downloads/pmm2/2.17.0/docker/pmm-server-2.17.0.docker
wget https://downloads.percona.com/downloads/pmm2/2.17.0/docker/pmm-server-2.17.0.sha256sum
Run PMM Server as a virtual machine by downloading and importing the PMM 2.17.0 Open Virtual Appliance
(OVA) file into any virtualization software supporting the OVF standard.
This page shows how to set up PMM Server as a virtual machine in VMware Workstation Player and Oracle
VM VirtualBox.
Most steps can be done with either a user interface or on the command line, but some steps can only be
done in one or the other. Sections are labeled UI for user interface or CLI for command line instructions.
Terminology
• Hypervisor is software (e.g. VirtualBox, VMware) that runs the guest OS as a virtual machine.
VM specifications
CPU: 1
Users (username / password)
root / percona
admin / admin
Download
UI
4. Click the link for pmm-server-2.17.0.ova to download it. Note where your browser saves it.
5. Right click the link for pmm-server-2.17.0.sha256sum and save it in the same place as the .ova file.
6. (Optional) Verify.
CLI
wget https://www.percona.com/downloads/pmm2/2.17.0/ova/pmm-server-2.17.0.ova
wget https://www.percona.com/downloads/pmm2/2.17.0/ova/pmm-server-2.17.0.sha256sum
Verify
CLI
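The verification can be done with sha256sum -c, which reads the checksum file and checks the named file against it. A sketch using a locally generated example file (substitute the real pmm-server-2.17.0.ova and .sha256sum names downloaded above):

```shell
# Create an example file and a matching checksum file, then verify.
printf 'example payload\n' > pmm-server-example.ova
sha256sum pmm-server-example.ova > pmm-server-example.ova.sha256sum

# For the real download: sha256sum -c pmm-server-2.17.0.sha256sum
sha256sum -c pmm-server-example.ova.sha256sum   # prints "pmm-server-example.ova: OK"
```

If the file is corrupted or incomplete, sha256sum -c reports FAILED and exits non-zero.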
IMPORT
UI
4. Click Open.
5. Click Continue.
b. Click Save.
• (Recommended) Click Customize Settings. This opens the VM’s settings page without starting the
machine.
CLI
2. Import and convert the OVA file. ( ovftool can’t change CPU or memory settings during import but it can
set the default interface.)
RECONFIGURE INTERFACE
When using the command line, the interface is remapped during import.
UI
UI
3. When the instance has booted, note the IP address in the guest console.
CLI/UI
1. Start the virtual machine in GUI mode. (There’s no way to redirect a VMware VM’s console to the host.)
2. When the instance has booted, note the IP address in the guest console.
IMPORT
UI
2. In the File field, type the path to the downloaded .ova file, or click the folder icon to navigate and open it.
3. Click Continue.
4. On the Appliance settings page, review the settings and click Import.
5. Click Start.
6. When the guest has booted, note the IP address in the guest console.
CLI
1. Open a terminal and change directory to where the downloaded .ova file is.
• With custom settings (in this example, Name: “PMM Server”, CPUs: 2, RAM: 8192 MB).
RECONFIGURE INTERFACE
UI
1. Click Settings.
2. Click Network.
4. In the Name field, select your host’s active network interface (e.g. en0: Wi-Fi (Wireless) ).
5. Click OK.
CLI
2. Find the name of the active interface you want to bridge to (one with Status: Up and a valid IP address).
Example: en0: Wi-Fi (Wireless)
3. Bridge the virtual machine’s first interface ( nic1 ) to the host’s en0 ethernet adapter.
UI
2. Click Start.
3. When the guest has booted, note the IP address in the guest console.
CLI
tail -f /tmp/pmm-server-console.log
UI
3. Enter the default username and password in the relevant fields and click Log in.
• username: admin
• password: admin
UI
• Username: root
• Password: percona
UI/CLI
ssh-keygen -f admin
4. Copy and paste the contents of the admin.pub file into the SSH Key field.
5. Click Apply SSH Key. (This copies the public key to /home/admin/.ssh/authorized_keys in the guest).
When the guest OS starts, it will get an IP address from the hypervisor’s DHCP server. This IP can change
each time the guest OS is restarted. Setting a static IP for the guest OS avoids having to check the IP
address whenever the guest is restarted.
CLI
2. Log in as root .
3. Edit /etc/sysconfig/network-scripts/ifcfg-eth0
BOOTPROTO=none
IPADDR=192.168.1.123 # Example
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
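As a sketch, the file can also be generated from variables. The ONBOOT=yes and DEVICE=eth0 lines are assumptions (ONBOOT is usually needed for the interface to come up at boot), and the addresses are examples to adapt to your network:

```shell
# Generate an example static-IP ifcfg-eth0 in the current directory.
# On the PMM Server guest the real file is
# /etc/sysconfig/network-scripts/ifcfg-eth0.
IPADDR=192.168.1.123
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
cat > ifcfg-eth0-example <<EOF
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=${IPADDR}
NETMASK=${NETMASK}
GATEWAY=${GATEWAY}
EOF
cat ifcfg-eth0-example
```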
UI
Assuming that you have an AWS (Amazon Web Services) account, locate Percona Monitoring and
Management Server in AWS Marketplace (or use this link).
Selecting a region and instance type in the Pricing Information section will give you an estimate of the costs
involved. This is only an indication of costs. You will choose regions and instance types in later steps.
Percona Monitoring and Management Server is provided at no cost, but you may need to pay for
infrastructure costs.
Disk space consumed by PMM Server depends on the number of hosts being monitored. Although each
environment will be unique, you can consider the data consumption figures for the PMM Demo web site
which consumes approximately 230MB/host/day, or ~6.9GB/host at the default 30 day retention period.
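Those figures allow a simple back-of-envelope estimate. For example, for five monitored hosts, using the ~230 MB/host/day and 30-day retention values above:

```shell
# Rough storage estimate: per-day usage x retention x host count.
per_day_mb=230
retention_days=30
hosts=5
total_gb=$(( per_day_mb * retention_days * hosts / 1000 ))
echo "Estimated storage for ${hosts} hosts: ~${total_gb} GB"   # ~34 GB
```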
For more information, see our blog post How much disk space should I allocate for Percona Monitoring
and Management?.
2. Subscribe to this software: Check the terms and conditions and click Continue to Configuration.
a. Choose Action: Select a launch option. Launch from Website is a quick way to make your instance
ready. For more control, choose Launch through EC2.
e. Security Group Settings: Choose a security group or click Create New Based On Seller Settings.
g. Click Launch.
In the Security Group section, which acts like a firewall, you may use the preselected option Create new
based on seller settings to create a security group with recommended settings. In the Key Pair select an
already set up EC2 key pair to limit access to your instance.
It is important that the security group allows communication via the following ports: 22, 80, and 443.
PMM should also be able to access port 3306 on any RDS instances it monitors.
Applying settings
Scroll to the top of the page to review your settings, then click the Launch with 1 click button to deploy
your instance. (The button may instead be titled Accept Software Terms & Launch with 1-Click.)
To continue setting up your instance, open the EC2 console via the link at the top of the page that opens
after you click the button.
Your instance appears in the EC2 console in a table that lists all instances available to you. A newly
created instance has no name; give it one to distinguish it from other instances managed via the EC2
console.
After you add your new instance, it takes some time to initialize. When the AWS console reports that the
instance is in a running state, you may continue with the configuration of PMM Server.
Your instance may acquire a different IP address when it is restarted after a reboot. You can set up an
Elastic IP to avoid this problem.
With your instance selected, open its IP address in a web browser. The IP address appears in the IPv4
Public IP column, or as the value of the Public IP field at the top of the Properties panel.
Copy and paste the public IP address into your browser's location bar. In the Percona Monitoring and
Management welcome page that opens, enter the instance ID.
You can copy the instance ID from the Description tab of the instance's Properties panel in the EC2
console: hover the cursor over the Instance ID field and click the Copy button that appears.
Paste the instance ID into the Instance ID field of the Percona Monitoring and Management welcome page
and click Submit.
PMM Server provides user access control, and therefore you will need user credentials to access it:
You will be prompted to change the default password every time you log in.
The PMM Server is now ready and the home page opens.
You are creating a username and password that will be used for two purposes:
2. authentication between PMM Server and PMM Clients - you will re-use these credentials when configuring
PMM Client for the first time on a server, for example:
For instructions about how to access your instances by using an SSH client, see Connecting to Your
Linux Instance Using SSH
Make sure to replace the user name ec2-user used in this document with admin .
Your AWS instance comes with a predefined size, which can become a limitation. To make more disk space
available to your instance, increase the size of the EBS volume as needed; your instance will then
reconfigure itself to use the new size.
The procedure of resizing EBS volumes is described in the Amazon documentation: Modifying the Size,
IOPS, or Type of an EBS Volume on Linux.
After the EBS volume is updated, the PMM Server instance auto-detects the change within approximately
five minutes and reconfigures itself for the updated conditions.
Upgrading to a larger EC2 instance class is supported by PMM provided you follow the instructions from the
AWS manual. The PMM AMI image uses a distinct EBS volume for the PMM data volume which permits
independent resizing of the EC2 instance without impacting the EBS volume.
The PMM data volume is mounted as an XFS formatted volume on top of an LVM volume. There are two
ways to increase this volume size:
1. Add a new disk via EC2 console or API, and expand the LVM volume to include the new disk volume.
To expand the existing EBS volume for increased capacity, follow these steps.
3. You can check information about volume groups and logical volumes with the vgs and lvs commands:
vgs
lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
DataLV DataVG Vwi-aotz-- <12.80g ThinPool 1.74
ThinPool DataVG twi-aotz-- 15.96g 1.39 1.29
4. Now we can use the lsblk command to see that our disk size has been identified by the kernel correctly,
but LVM2 is not yet aware of the new size. We can use pvresize to make sure the PV device reflects the
new size. Once pvresize is executed, we can see that the VG has the new free space available.
pvscan
pvresize /dev/xvdb
pvs
5. We then extend our logical volume. Since the PMM image uses thin provisioning, we need to extend both
the pool and the volume:
lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
DataLV DataVG Vwi-aotz-- <12.80g ThinPool 1.77
ThinPool DataVG twi-aotz-- 15.96g 1.42 1.32
Size of logical volume DataVG/ThinPool_tdata changed from 16.00 GiB (4096 extents) to 31.96
GiB (8183 extents).
Logical volume DataVG/ThinPool_tdata successfully resized.
lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
DataLV DataVG Vwi-aotz-- <12.80g ThinPool 1.77
ThinPool DataVG twi-aotz-- 31.96g 0.71 1.71
6. Once the pool and volumes have been extended, we need to now extend the thin volume to consume the
newly available space. In this example we’ve grown available space to almost 32GB, and already
consumed 12GB, so we’re extending an additional 19GB:
lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
DataLV DataVG Vwi-aotz-- <12.80g ThinPool 1.77
ThinPool DataVG twi-aotz-- 31.96g 0.71 1.71
Size of logical volume DataVG/DataLV changed from <12.80 GiB (3276 extents) to <31.80 GiB
(8140 extents).
Logical volume DataVG/DataLV successfully resized.
lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
DataLV DataVG Vwi-aotz-- <31.80g ThinPool 0.71
ThinPool DataVG twi-aotz-- 31.96g 0.71 1.71
7. We then expand the XFS file system to reflect the new size using xfs_growfs , and confirm the file system
is accurate using the df command.
df -h /srv
xfs_growfs /srv
df -h /srv
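The whole sequence (steps 4 through 7) can be sketched as a script. This is a dry run by default and only prints the commands; the device name (/dev/xvdb), volume group (DataVG), and the lvextend invocations are assumptions based on the output shown above, so verify them against your own vgs/lvs output before setting RUN=1 on a real system.

```shell
# Dry-run sketch of the EBS/LVM/XFS expansion sequence.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi; }

run pvresize /dev/xvdb                       # make LVM see the larger disk
run lvextend -l +100%FREE DataVG/ThinPool    # grow the thin pool (assumed syntax)
run lvextend -L +19G DataVG/DataLV           # grow the thin volume (example size)
run xfs_growfs /srv                          # grow the XFS filesystem
run df -h /srv                               # confirm the new size
```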
Software prerequisites
DOCKER
Debian, Ubuntu
Install Docker following Docker's official instructions for your distribution (the docker-ce.repo file is for Red Hat-based systems and does not apply here), then enable and start it:
systemctl enable docker
systemctl start docker
MINIKUBE
• To start a fully-working 3-node XtraDB cluster, consisting of sets of 3x HAProxy, 3x PXC and 6x PMM
Client containers, you will need at least 9 vCPUs available for minikube (1 vCPU each for the HAProxy
and PXC containers, and 0.5 vCPU for each pmm-client container).
• You can pass the environment variable --env ENABLE_DBAAS=1 to force-enable the DBaaS feature when
starting the pmm-server container. You can also omit the variable and enable the feature later in the
PMM UI; follow the link in step 3 below.
• Add the option --network minikube if you run PMM Server and minikube in the same Docker instance.
(This will share a single network and the kubeconfig will work.)
• Add the options --env PMM_DEBUG=1 and/or --env PMM_TRACE=1 if you need extended debug details
docker run --detach --publish 80:80 --publish 443:443 --name pmm-server percona/pmm-server:2
(This step is optional, because the same can be done from the web interface of PMM on first login.)
3. IMPORTANT: Please follow instructions on How to activate the DBaaS feature in Advanced Settings
of PMM.
You need to enable the feature using PMM UI if you omitted --env ENABLE_DBAAS=1 when starting up the
container.
2. Deploy the Percona operators configuration for PXC and PSMDB in minikube:
# Prepare base64-encoded (and plain) values for a user and password with administrator privileges on pmm-server (DBaaS)
PMM_USER='admin';
PMM_PASS='<RANDOM_PASS_GOES_IN_HERE>';
v1.7.0/deploy/bundle.yaml \
| kubectl apply -f -
4. Get your kubeconfig details from minikube (to register your Kubernetes cluster with PMM Server):
You will need to copy this output to your clipboard and continue with add a Kubernetes cluster to PMM.
1. Create your cluster via eksctl or the Amazon AWS interface. For example:
2. When your EKS cluster is running, install the PXC and PSMDB operators:
# Prepare base64-encoded (and plain) values for a user and password with administrator privileges on pmm-server (DBaaS)
PMM_USER='admin';
PMM_PASS='<RANDOM_PASS_GOES_IN_HERE>';
3. Modify your kubeconfig file, if it’s not utilizing the aws-iam-authenticator or client-certificate method
for authentication with Kubernetes. Here are two examples that you can use as templates to modify a
copy of your existing kubeconfig:
---
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: << CERT_AUTH_DATA >>
    server: << K8S_CLUSTER_URL >>
  name: << K8S_CLUSTER_NAME >>
contexts:
- context:
    cluster: << K8S_CLUSTER_NAME >>
    user: << K8S_CLUSTER_USER >>
  name: << K8S_CLUSTER_NAME >>
current-context: << K8S_CLUSTER_NAME >>
kind: Config
preferences: {}
users:
- name: << K8S_CLUSTER_USER >>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
      - "token"
      - "-i"
      - "<< K8S_CLUSTER_NAME >>"
      - --region
      - << AWS_REGION >>
      env:
      - name: AWS_ACCESS_KEY_ID
        value: "<< AWS_ACCESS_KEY_ID >>"
      - name: AWS_SECRET_ACCESS_KEY
        value: "<< AWS_SECRET_ACCESS_KEY >>"
---
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: << CERT_AUTH_DATA >>
    server: << K8S_CLUSTER_URL >>
  name: << K8S_CLUSTER_NAME >>
contexts:
- context:
    cluster: << K8S_CLUSTER_NAME >>
    user: << K8S_CLUSTER_USER >>
  name: << K8S_CLUSTER_NAME >>
current-context: << K8S_CLUSTER_NAME >>
kind: Config
preferences: {}
users:
- name: << K8S_CLUSTER_USER >>
  user:
    client-certificate-data: << CLIENT_CERT_DATA >>
    client-key-data: << CLIENT_KEY_DATA >>
Prerequisites
4. You can specify cluster options in the form, or simply click "My first cluster" and then the Create button.
8. Click Authorize
12. Create a Service Account, then copy and store the kubeconfig (the output of the following command):
echo "
apiVersion: v1
kind: Config
users:
- name: percona-dbaas-cluster-operator
  user:
    token: $token
clusters:
- cluster:
    certificate-authority-data: $certificate
    server: $server
  name: self-hosted-cluster
contexts:
- context:
    cluster: self-hosted-cluster
    user: percona-dbaas-cluster-operator
  name: svcs-acct-context
current-context: svcs-acct-context
"
Important: Ensure there are no stray newlines in the kubeconfig, especially in long lines such as the
certificate or token.
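One way to catch this is to check that the certificate field is a single line of valid base64. A sketch using a tiny generated example file; point the awk line at your real kubeconfig instead:

```shell
# Build a minimal example kubeconfig with a one-line certificate field.
printf 'apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    certificate-authority-data: %s\n' \
  "$(printf 'example-cert' | base64)" > kubeconfig-example.yaml

# Extract the field and check it decodes as base64 (a stray newline
# leaves a truncated, usually undecodable value).
cert=$(awk '/certificate-authority-data:/ {print $2; exit}' kubeconfig-example.yaml)
if [ -n "$cert" ] && printf '%s' "$cert" | base64 -d >/dev/null 2>&1; then
  echo "certificate-authority-data: OK"
else
  echo "certificate-authority-data: missing or broken"
fi
```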
Deleting clusters
You should also delete all installed operators, since the operators own resources that would otherwise be orphaned.
If a Public Address is set in PMM Settings, an API Key is created for each DB cluster; these can be found
on the /graph/org/apikeys page. You should not delete them manually (for now, until issue PMM-8045 is
fixed): once a DB cluster is removed from DBaaS, its related API Key is removed with it.
If you only run eksctl delete cluster without cleaning up the cluster first, many orphaned resources will
remain: CloudFormation stacks, Load Balancers, EC2 instances, network interfaces, and so on.
In the deploy directory of the pmm-managed repository there are two example bash scripts to install and
delete the operators from the EKS cluster.
#!/bin/bash
#!/bin/bash
(Both scripts are similar except the install script command is apply while in the delete script it is delete .)
After deleting everything in the EKS cluster, run this command (using your own configuration path) and
wait until the output shows only service/kubernetes before deleting the cluster with the eksctl delete
command.
Example output:
If you don’t need the cluster anymore, you can uninstall everything in it and destroy it:
docker run --detach --name pmm-server --publish 80:80 --publish 443:443 --env ENABLE_DBAAS=1 percona/pmm-server:2
Important
• Use --network minikube if running PMM Server and minikube in the same Docker instance. This way
they share a single network and the kubeconfig will work.
• Use Docker variables --env PMM_DEBUG=1 --env PMM_TRACE=1 to see extended debug details.
This step is optional, because the same can be done from the web interface of PMM on the first login.
To make services visible externally, you create a LoadBalancer service or manually run commands to
expose ports:
See also
• DBaaS Dashboard
• Install minikube
• Setting up a Standalone MySQL Instance on Kubernetes & exposing it using Nginx Ingress Controller
2.3 Client
PMM Client is a collection of agents and exporters that run on the host being monitored.
These sections cover the different ways to install PMM Client on a Linux node and register it with PMM
Server. The options are:
• Option 1 – For Debian- or Red Hat-based distributions, install percona-release and use a Linux
package manager ( apt / dnf ) to install PMM Client.
• Option 2 – For Debian- or Red Hat-based distributions, download .deb / .rpm PMM Client packages and
install them.
• Option 3 – For other Linux distributions, download and unpack generic PMM Client Linux binaries.
• PMM Server is installed and running with a known IP address accessible from the client node.
• You have superuser access to any database servers that you want to monitor.
• System requirements
• Operating system – PMM Client runs on any modern 64-bit Linux distribution. It is tested on
supported versions of Debian, Ubuntu, CentOS, and Red Hat Enterprise Linux.
• Disk – A minimum of 100 MB of storage is required for installing the PMM Client package. With a
good connection to PMM Server, additional storage is not required. However, the client needs to
store any collected data that it cannot dispatch immediately, so additional storage may be required
if the connection is unstable or the throughput is low. (Caching only applies to Query Analytics
data; VictoriaMetrics data is never cached on the client side.)
Tip If you have used percona-release before, disable and re-enable the repository:
1. Configure repositories.
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
sudo dpkg -i percona-release_latest.generic_all.deb
3. Check.
pmm-admin --version
1. Configure repositories.
3. Check.
pmm-admin --version
2. Under Version:, select the one you want (usually the latest).
Here are the download page links for each supported platform.
• Debian 9 (“Stretch”)
• Debian 10 (“Buster”)
• Red Hat/CentOS/Oracle 7
• Red Hat/CentOS/Oracle 8
sha256sum -c pmm2-client-2.17.0.tar.gz.sha256sum
sudo ./install_tarball
PATH=$PATH:/usr/local/percona/pmm2/bin
pmm-admin status
The PMM Client Docker image is a convenient way to run PMM Client as a preconfigured Docker container.
docker pull \
percona/pmm-client:2
2. Use the image as a template to create a persistent data store that preserves local data when the image is
updated.
docker create \
--volume /srv \
--name pmm-client-data \
percona/pmm-client:2 /bin/true
3. Run the container to start PMM Agent in setup mode. Set X.X.X.X to the IP address of your PMM Server.
(Do not use the docker --detach option as PMM agent only logs to the console.)
PMM_SERVER=X.X.X.X:443
docker run \
--rm \
--name pmm-client \
-e PMM_AGENT_SERVER_ADDRESS=${PMM_SERVER} \
-e PMM_AGENT_SERVER_USERNAME=admin \
-e PMM_AGENT_SERVER_PASSWORD=admin \
-e PMM_AGENT_SERVER_INSECURE_TLS=1 \
-e PMM_AGENT_SETUP=1 \
-e PMM_AGENT_CONFIG_FILE=pmm-agent.yml \
--volumes-from pmm-client-data \
percona/pmm-client:2
4. Check status.
In the PMM user interface you will also see an increase in the number of monitored nodes.
You can now add services with pmm-admin by prefixing commands with docker exec pmm-client .
Tip
• Adjust host firewall and routing rules to allow Docker communications. (Read more in the FAQ.)
• For help:
2. Remove containers.
docker rm pmm-client
version: '2'
services:
pmm-client:
image: percona/pmm-client:2
hostname: pmm-client-myhost
container_name: pmm-client
restart: always
ports:
- "42000:42000"
- "42001:42001"
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
volumes:
- ./pmm-agent.yaml:/etc/pmm-agent.yaml
environment:
- PMM_AGENT_CONFIG_FILE=/etc/pmm-agent.yaml
- PMM_AGENT_SERVER_USERNAME=admin
- PMM_AGENT_SERVER_PASSWORD=admin
- PMM_AGENT_SERVER_ADDRESS=X.X.X.X:443
- PMM_AGENT_SERVER_INSECURE_TLS=true
entrypoint: pmm-agent setup
• Check the values in the environment section match those for your PMM Server. ( X.X.X.X is the IP
address of your PMM Server.)
• Use unique hostnames across all PMM Clients (value for services.pmm-client.hostname ).
3. Run the PMM Agent setup. (The container runs the setup and then stops.)
sudo docker-compose up
4. Edit docker-compose.yml , comment out the entrypoint line (insert a # ) and save.
...
# entrypoint: pmm-agent setup
sudo docker-compose up -d
6. Verify.
In the GUI.
• admin / admin is the default PMM username and password. This is the same account you use to log into
the PMM user interface; you had the option to change the password when first logging in.
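Step 4 above can also be done non-interactively with sed. A sketch that operates on a generated example file so it is safe to try; run the same sed expression against your real docker-compose.yml:

```shell
# Create a minimal example compose file containing an entrypoint line.
printf 'services:\n  pmm-client:\n    entrypoint: pmm-agent setup\n' > docker-compose-example.yml

# Comment out the entrypoint line, preserving indentation (GNU sed).
sed -i 's/^\([[:space:]]*\)entrypoint:/\1# entrypoint:/' docker-compose-example.yml
grep 'entrypoint' docker-compose-example.yml   # prints "    # entrypoint: pmm-agent setup"
```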
Examples
Register on PMM Server with IP address 192.168.33.14 using the default admin/admin username and
password, a node with IP address 192.168.33.23 , type generic , and name mynode .
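The command itself was lost in this extraction; for the example above it would be along the following lines (flag names per pmm-admin config --help; treat this as a sketch and verify against your PMM Client version):

```
pmm-admin config --server-insecure-tls \
  --server-url=https://admin:admin@192.168.33.14:443 \
  192.168.33.23 generic mynode
```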
• MySQL and variants (Percona Server for MySQL, Percona XtraDB Cluster, MariaDB)
• MongoDB
• PostgreSQL
• ProxySQL
• Amazon RDS
• Microsoft Azure
• Linux
• External services
• HAProxy
Note To change the parameters of a previously-added service, remove the service and re-add it with new
parameters.
To remove a service from monitoring, specify the service type and service name. The type must be one of:
mysql, mongodb, postgresql, proxysql, haproxy, external.
See also
• Percona release
PMM Client collects metrics from MySQL, Percona Server for MySQL, Percona XtraDB Cluster, and MariaDB.
(Amazon RDS is also supported and explained in a separate section.)
This page shows you how to set up PMM to monitor a MySQL or MySQL-based database instance. (You
should read it completely before starting work.)
Check that:
• PMM Server is installed and running with a known IP address accessible from the client node.
• PMM Client is installed and the node is registered with PMM Server.
• You have superuser access to any database servers that you want to monitor.
It is good practice to use a non-superuser account to connect PMM Client to the monitored database
instance. This example creates a database user with name pmm , password pass , and the necessary
permissions.
Decide which source of metrics to use, and configure your database server for it. The choices are Slow
query log and Performance Schema.
While you can use both at the same time, we recommend using only one: there is some overlap in the data
reported, and each incurs a small performance penalty. The choice depends on the version and variant of
your MySQL instance, and how much detail you want to see.
Here are the benefits and drawbacks of the Slow query log metrics source.
Slow query log
• Benefits: More detail. Lower resource impact (with the query sampling feature in Percona Server for MySQL).
• Drawbacks: PMM Client must be on the same host as the database server or have access to the slow query log. Log files grow and must be actively managed.
This section covers how to configure a MySQL-based database server to use the slow query log as a source
of metrics.
Applicable versions
Server Versions
MySQL 5.1-5.5
MariaDB 10.1.2+
The slow query log records the details of queries that take more than a certain amount of time to complete.
With the database server configured to write this information to a file rather than a table, PMM Client parses
the file and sends aggregated data to PMM Server via the Query Analytics part of PMM Agent.
Settings
Examples
Configuration file
slow_query_log=ON
log_output=FILE
long_query_time=0
log_slow_admin_statements=ON
log_slow_slave_statements=ON
Session
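For reference, the configuration-file settings above can also be applied to a running server with SET GLOBAL (a sketch; these are standard MySQL variables, and changes made this way do not persist across restarts):

```shell
# Apply the slow query log settings for the running server only.
mysql -u root -p <<'SQL'
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL log_output = 'FILE';
SET GLOBAL long_query_time = 0;
SET GLOBAL log_slow_admin_statements = 'ON';
SET GLOBAL log_slow_slave_statements = 'ON';
SQL
```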
Some MySQL-based database servers support extended slow query log variables.
Applicable versions
Server Versions
MariaDB 10.0
Settings
Examples
log_slow_rate_limit=100
log_slow_rate_type='query'
slow_query_log_always_write_time=1
log_slow_verbosity='full'
slow_query_log_use_global_control='all'
log_slow_rate_limit=100
Slow query log files can grow quickly and must be managed.
When adding a service with the command line, use the pmm-admin option --size-slow-logs to set the size at which the slow query log file is rotated. (The size is specified as a number with a suffix. See pmm-admin add
mysql .)
You can manage log rotation yourself, for example, with logrotate . If you do, you can disable PMM
Client’s log rotation with the --slow-log-rotation=false option when adding a service with pmm-admin add .
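If you manage rotation yourself, a logrotate rule along these lines could be used (a sketch; the log path and rotation policy are assumptions to adapt to your setup):

```shell
# Hypothetical logrotate policy for the MySQL slow query log.
sudo tee /etc/logrotate.d/mysql-slow <<'EOF' > /dev/null
/var/log/mysql/mysql-slow.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    create 640 mysql mysql
}
EOF
```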
PERFORMANCE SCHEMA
This section covers how to configure a MySQL-based database server to use Performance Schema as a
source of metrics.
Applicable versions
Server Versions
MariaDB 10.3+
PMM’s MySQL Performance Schema Details dashboard charts the various performance_schema metrics.
Examples
Configuration file
performance_schema=ON
performance-schema-instrument='statement/%=ON'
performance-schema-consumer-statements-digest=ON
innodb_monitor_enable=all
Session
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES' WHERE NAME LIKE '%statements%';
SET GLOBAL innodb_monitor_enable = all;
There is no Explain or Example data shown by default in Query Analytics when monitoring MariaDB instances version 10.5.7 or lower. As a workaround, enable the relevant instruments and consumers:
Session
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME LIKE
'statement/%';
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%statements%';
Query time distribution is a chart in the Details tab of Query Analytics showing the proportion of query time
spent on various activities. It is enabled with the query_response_time_stats variable and associated
plugins.
Applicable versions
Server Versions
Percona Server for MySQL 5.7 (not Percona Server for MySQL 8.0)
MariaDB 10.0.4
Configuration file
query_response_time_stats=ON
Session
Tablestats
Some table metrics are automatically disabled when the number of tables exceeds a default limit of 1000
tables. This prevents PMM Client from affecting the performance of your database server.
The limit can be changed when adding a service on the command line with the two pmm-admin options --disable-tablestats and --disable-tablestats-limit .
User statistics
Applicable versions
User activity, individual table and index access details are shown on the MySQL User Details dashboard
when the userstat variable is set.
Server Versions
MariaDB 5.2.0+
Examples
Configuration file
userstat=ON
Session
Add a service
When you have configured your database server, you can add a MySQL service with the user interface or on
the command line.
When adding a service with the command line, you must use the pmm-admin --query-source=SOURCE option
to match the source you’ve chosen and configured the database server for.
With the PMM user interface, you select Use performance schema, or deselect it to use slow query log.
Main details

Labels
• Environment: --environment
• Region
• Availability zone
• Replication set: --replication-set
• Cluster: --cluster
• Custom labels: --custom-labels

Additional options
• Skip connection check: --skip-connection-check
• Table statistics limit:
→ Disabled: --disable-tablestats
→ Default: --disable-tablestats-limit
→ Custom: --disable-tablestats-limit
• Use performance schema: --perfschema if selected, --slowlog if not.
Add the database server as a service using one of these example commands. If successful, PMM Client prints MySQL Service added with the service’s ID and name. Use the --environment and --custom-labels options to set tags for the service to help identify it.
Default query source ( slowlog ), service name ( {node name}-mysql ), and service address/port
( 127.0.0.1:3306 ), with database server account pmm and password pass .
Slow query log source and log size limit (1 gigabyte), service name ( MYSQL_NODE ) and service address/port
( 191.168.1.123:3306 ).
Slow query log source, disabled log management (use logrotate or some other log management tool),
service name ( MYSQL_NODE ) and service address/port ( 191.168.1.123:3306 ).
Default query source ( slowlog ), service name ( {node}-mysql ), connect via socket.
Performance schema query source, service name ( MYSQL_NODE ) and default service address/port
( 127.0.0.1:3306 )
Performance schema query source, service name ( MYSQL_NODE ) and default service address/port
( 127.0.0.1:3306 ) specified with flags.
Default query source ( slowlog ), environment labeled test , custom labels setting source to slowlog . (This
example uses positional parameters for service name and service address.)
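The example commands themselves are not reproduced here; the following sketches, built only from the flags described on this page, illustrate the cases above (addresses, service names, and credentials are placeholders):

```shell
# Default query source, default service name and address:
pmm-admin add mysql --username=pmm --password=pass

# Slow query log source with a 1 GiB rotation limit, named service:
pmm-admin add mysql --query-source=slowlog --size-slow-logs=1GiB \
  --username=pmm --password=pass MYSQL_NODE 191.168.1.123:3306

# Slow query log source with PMM's log management disabled (use logrotate instead):
pmm-admin add mysql --query-source=slowlog --slow-log-rotation=false \
  --username=pmm --password=pass MYSQL_NODE 191.168.1.123:3306

# Default source, connecting via socket (path is an assumption):
pmm-admin add mysql --username=pmm --password=pass \
  --socket=/var/run/mysqld/mysqld.sock

# Performance Schema source, service name and address given as flags:
pmm-admin add mysql --query-source=perfschema --username=pmm --password=pass \
  --service-name=MYSQL_NODE --host=127.0.0.1 --port=3306

# Default source with environment and custom labels (positional name/address):
pmm-admin add mysql --environment=test --custom-labels='source=slowlog' \
  --username=pmm --password=pass mysql-test 127.0.0.1:3306
```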
2. Look in the Services tab for a matching Service Type (MySQL), Service name, Addresses, and any other
details entered in the form.
3. Look in the Agents tab to check the desired data source is being used.
Check data
If query response time plugin was installed, check for data in the MySQL Query Response Time Details
dashboard or select a query in PMM Query Analytics to see the Query time distribution bar.
See also
2.3.3 MongoDB
In Query Analytics, you can monitor MongoDB metrics and queries. Run the pmm-admin add command to
use these monitoring services.
For MongoDB monitoring services to work in Query Analytics, you need to set up the mongodb_exporter
user.
Here is an example for the MongoDB shell that creates and assigns the appropriate roles to the user.
db.createRole({
role: "explainRole",
privileges: [{
resource: {
db: "",
collection: ""
},
actions: [
"listIndexes",
"listCollections",
"dbStats",
"dbHash",
"collStats",
"find"
]
}],
roles:[]
})
db.getSiblingDB("admin").createUser({
user: "mongodb_exporter",
pwd: "s3cR#tpa$$worD",
roles: [
{ role: "explainRole", db: "admin" },
{ role: "clusterMonitor", db: "admin" },
{ role: "read", db: "local" }
]
})
Enabling Profiling
For MongoDB to work correctly with Query Analytics, you need to enable profiling in your mongod
configuration. (Profiling is not enabled by default because it may reduce the performance of your MongoDB
server.)
You can enable profiling from the command line when you start the mongod server. This is useful if you start mongod manually.
Note that you need to specify a path to an existing directory that stores database files with the --dbpath option.
When the --profile option is set to 2, mongod collects the profiling data for all operations. To decrease the
load, you may consider setting this option to 1 so that the profiling data are only collected for slow
operations.
The --slowms option sets the minimum time for a slow operation. In the given example, any operation
which takes longer than 200 milliseconds is a slow operation.
The --rateLimit option, which is available if you use PSMDB instead of MongoDB, refers to the number of
queries that the MongoDB profiler collects. The lower the rate limit, the less impact on the performance.
However, the accuracy of the collected information decreases as well.
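The command referred to above is not shown; a sketch of starting mongod with the options just described (the --dbpath value is an assumption, and --rateLimit applies only to Percona Server for MongoDB):

```shell
# Start mongod manually with profiling enabled for all operations,
# a 200 ms slow-operation threshold, and (PSMDB only) a rate limit.
mongod --dbpath=/var/lib/mongodb --profile=2 --slowms=200 --rateLimit=100
```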
If you run mongod as a service, you need to use the configuration file which by default is /etc/mongod.conf .
In this file, you need to locate the operationProfiling: section and add the following settings:
operationProfiling:
slowOpThresholdMs: 200
mode: slowOp
These settings affect mongod in the same way as the command line options. Note that the configuration file
is in the YAML format. In this format the indentation of your lines is important as it defines levels of nesting.
where username and password are credentials for the monitored MongoDB access, which will be used
locally on the database host. Additionally, two positional arguments can be appended to the command line
flags: a service name to be used by PMM, and a service address. If not specified, they are substituted
automatically as <node>-mongodb and 127.0.0.1:27017 .
The command line and the output of this command may look as follows:
Besides the positional arguments shown above, you can specify the service name and service address with the following flags: --service-name , --host (the hostname or IP address of the service), and --port (the port number of the service). If both a flag and a positional argument are present, the flag takes priority. Here is the previous example modified to use these flags:
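A sketch of the two equivalent forms described above, using the mongodb_exporter user created earlier (the service name is illustrative):

```shell
# Positional service name and address:
pmm-admin add mongodb --username=mongodb_exporter --password='s3cR#tpa$$worD' \
  mongodb_node_1 127.0.0.1:27017

# The same service specified with flags:
pmm-admin add mongodb --username=mongodb_exporter --password='s3cR#tpa$$worD' \
  --service-name=mongodb_node_1 --host=127.0.0.1 --port=27017
```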
You can add a MongoDB instance using a UNIX socket with the --socket option:
If the password contains special symbols like the ‘at’ ( @ ) symbol, the host might not be detected
correctly. Make sure that you insert the password with special characters replaced with their escape
sequences. The simplest way is to use the encodeURIComponent JavaScript function in your browser’s web
console (usually found under Development Tools). Evaluate the function with your password as the
parameter. For example:
encodeURIComponent('$ecRet_pas$w@rd')
"%24ecRet_pas%24w%40rd"
SSL/TLS related parameters are passed to an SSL enabled MongoDB server as monitoring service
parameters along with the pmm-admin add command when adding the MongoDB monitoring service.
--tls
--tls-skip-verify
--tls-certificate-key-file=PATHTOCERT
--tls-certificate-key-file-password=IFPASSWORDTOCERTISSET
--tls-ca-file=PATHTOCACERT
2.3.4 PostgreSQL
PMM Client collects metrics from PostgreSQL and Percona Distribution for PostgreSQL databases.
This page shows how to set up PMM to monitor a PostgreSQL database instance. (Read it completely before
starting work.)
Check that:
• PMM Server is installed and running with a known IP address accessible from the client node.
• PMM Client is installed and the node is registered with PMM Server.
• You have superuser access to any database servers that you want to monitor.
(PMM follows PostgreSQL’s end-of-life policy. For specific details on supported platforms and versions, see
Percona’s Software Platform Lifecycle page.)
We recommend creating a PMM database account that can connect to the postgres database with the
SUPERUSER role.
1. Create a user. This example uses pmm . (Replace ****** with a strong password of your choice.)
2. PMM must be able to log in locally as this user to the PostgreSQL instance. To enable this, edit the
pg_hba.conf file. If not already enabled by an existing rule, add:
sudo su - postgres
psql -c "select pg_reload_conf()"
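The steps above can be sketched as follows (the pg_hba.conf rule is a typical example; adapt it to your authentication setup):

```shell
# 1. Create the PMM user (replace ****** with a strong password):
sudo -u postgres psql -c "CREATE USER pmm WITH SUPERUSER ENCRYPTED PASSWORD '******'"

# 2. In pg_hba.conf, a rule like this permits local password login:
#      local   all   pmm   md5

# 3. Reload the configuration:
sudo -u postgres psql -c "SELECT pg_reload_conf()"
```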
Decide which database extension to use, and configure your database server for it. The choices are pg_stat_statements and pg_stat_monitor .
We recommend choosing only one of these. If you use both, you will get duplicate metrics.
Important While we recommend the newer pg_stat_monitor extension, be aware that it is currently in beta and unsupported.
Bucket-based aggregation
pg_stat_monitor collects statistics and aggregates data in a data collection unit called a bucket. These
are linked together to form a bucket chain.
• a time limit for each bucket’s data collection (the bucket expiry)
When a bucket’s expiration time is reached, accumulated statistics are reset and data is stored in the next available bucket in the chain.
When all buckets in the chain have been used, the first bucket is reused and its contents are overwritten.
PG_STAT_STATEMENTS
Install
• Debian/Ubuntu
• Red Hat/CentOS
Configure
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 2048 # Increase tracked query string size
pg_stat_statements.track = all # Track all statements including nested
track_io_timing = on # Capture read/write stats
PG_STAT_MONITOR
Install
• If you use Percona Distribution for PostgreSQL, you can install the extension with your Linux package
manager. See Installing Percona Distribution for PostgreSQL.
• If you use PostgreSQL you can install by downloading and compiling the source code. See Installing
pg_stat_monitor .
Configure
shared_preload_libraries = 'pg_stat_monitor'
You can get a list of available settings with SELECT * FROM pg_stat_monitor_settings; .
5. In a psql session:
SELECT pg_stat_monitor_version();
Add a service
When you have configured your database server, you can add a PostgreSQL service with the user interface
or on the command line.
Add the database server as a service using one of these example commands. If successful, PMM Client prints PostgreSQL Service added with the service’s ID and name. Use the --environment and --custom-labels options to set tags for the service to help identify it.
Examples
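The original examples are not reproduced here; these sketches show plausible invocations for each extension (the --query-source values pgstatements and pgstatmonitor are assumptions based on pmm-admin 2.x; names and credentials are placeholders):

```shell
# pg_stat_statements as the query source, default address 127.0.0.1:5432:
pmm-admin add postgresql --username=pmm --password='******' \
  --query-source=pgstatements

# pg_stat_monitor as the query source, named service with explicit address:
pmm-admin add postgresql --username=pmm --password='******' \
  --query-source=pgstatmonitor --service-name=PG_NODE --host=127.0.0.1 --port=5432
```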
2. Look in the Services tab for a matching Service Type (PostgreSQL), Service name, Addresses, and any
other details entered in the form.
3. Look in the Agents tab to check the desired data source is being used.
If using Docker, use sudo docker exec pmm-client pmm-admin inventory list services
Check data
2.3.5 ProxySQL
USAGE
where username and password are credentials for the administration interface of the monitored ProxySQL
instance. Additionally, two positional arguments can be appended to the command line flags: a service
name to be used by PMM, and a service address. If not specified, they are substituted automatically as
<node>-proxysql and 127.0.0.1:6032 .
Besides the positional arguments shown above, you can specify the service name and service address with the following flags: --service-name , and --host (the hostname or IP address of the service) and --port (the port number of the service), or --socket (the UNIX socket path). If both a flag and a positional argument are present, the flag takes priority. Here is the previous example modified to use these flags for both host/port and socket connections:
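A sketch of the forms described above (the administration interface credentials, service name, and socket path are placeholders):

```shell
# Positional service name and address:
pmm-admin add proxysql --username=admin --password=admin \
  proxysql_node_1 127.0.0.1:6032

# The same service specified with flags:
pmm-admin add proxysql --username=admin --password=admin \
  --service-name=proxysql_node_1 --host=127.0.0.1 --port=6032

# Socket variant:
pmm-admin add proxysql --username=admin --password=admin \
  --service-name=proxysql_node_1 --socket=/tmp/proxysql_admin.sock
```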
Required settings
It is possible to use PMM for monitoring Amazon RDS (just like any remote MySQL instance). In this case,
the PMM Client is not installed on the host where the database server is deployed. By using the PMM web
interface, you connect to the Amazon RDS DB instance. You only need to provide the IAM user access key
(or assign an IAM role) and PMM discovers the Amazon RDS DB instances available for monitoring.
First of all, ensure there is minimal latency between PMM Server and the Amazon RDS instance. Network connectivity can become an issue for VictoriaMetrics when scraping metrics at a 1-second resolution. We strongly suggest that you run PMM Server on AWS (Amazon Web Services) in the same availability zone as your Amazon RDS instances.
It is crucial that enhanced monitoring be enabled for the Amazon RDS DB instances you intend to monitor.
Set the Enable Enhanced Monitoring option in the settings of your Amazon RDS DB instance.
It is recommended that you use an IAM user account to access Amazon RDS DB instances instead of your AWS account. This improves security because the permissions of an IAM user account can be limited to grant access only to your Amazon RDS DB instances, whereas your AWS account has access to all AWS services.
The procedure for creating IAM user accounts is well described in the Amazon RDS documentation. This
section only goes through the essential steps and points out the steps required for using Amazon RDS with
Percona Monitoring and Management.
The first step is to define a policy which will hold all the necessary permissions. Then, you need to associate
this policy with the IAM user or group. In this section, we will create a new user for this purpose.
Creating a policy
A policy defines how AWS services can be accessed. Once defined it can be associated with an existing user
or group.
1. Select the Policies option on the navigation panel and click the Create policy button.
2. On the Create policy page, select the JSON tab and replace the existing contents with the following JSON
document.
{ "Version": "2012-10-17",
"Statement": [{ "Sid": "Stmt1508404837000",
"Effect": "Allow",
"Action": [ "rds:DescribeDBInstances",
"cloudwatch:GetMetricStatistics",
"cloudwatch:ListMetrics"],
"Resource": ["*"] },
{ "Sid": "Stmt1508410723001",
"Effect": "Allow",
"Action": [ "logs:DescribeLogStreams",
"logs:GetLogEvents",
"logs:FilterLogEvents" ],
"Resource": [ "arn:aws:logs:*:*:log-group:RDSOSMetrics:*" ]}
]
}
3. Click Review policy and set a name to your policy, such as AmazonRDSforPMMPolicy . Then, click the Create
policy button.
Policies are attached to existing IAM users or groups. To create a new IAM user, select Users on the Identity
and Access Management page at AWS. Then click Add user and complete the following steps:
1. On the Add user page, set the user name and select the Programmatic access option under Select AWS
access type. Set a custom password and then proceed to permissions by clicking the Permissions button.
2. On the Set permissions page, add the new user to one or more groups if necessary. Then, click Review.
To discover an Amazon RDS DB instance in PMM, you either need to use the access key and secret access
key of an existing IAM user or an IAM role. To create an access key for use with PMM, open the IAM console
and click Users on the navigation pane. Then, select your IAM user.
To create the access key, open the Security credentials tab and click the Create access key button. The
system automatically generates a new access key ID and a secret access key that you can provide on the
PMM Add Instance dashboard to have your Amazon RDS DB instances discovered.
You may use an IAM role instead of IAM user provided your Amazon RDS DB instances are associated with
the same AWS account as PMM.
If the PMM Server and Amazon RDS DB instance were created using the same AWS account, you do not need to create the access key ID and secret access key manually. PMM retrieves this information automatically and attempts to discover your Amazon RDS DB instances.
The last step before you are ready to create an Amazon RDS DB instance is to attach the policy with the
required permissions to the IAM user.
First, make sure that the Identity and Access Management page is open and open Users. Then, locate and
open the IAM user that you plan to use with Amazon RDS DB instances. Complete the following steps, to
apply the policy:
3. Using the Filter, locate the policy with the required permissions (such as AmazonRDSforPMMPolicy ).
4. Select a check-box next to the name of the policy and click Review.
5. The selected policy appears on the Permissions summary page. Click Add permissions.
Query Analytics requires Performance Schema to be configured as the query source, because the slow query log is stored on the AWS (Amazon Web Services) side and the QAN agent is not able to read it. Enable the performance_schema option under Parameter Groups in Amazon RDS.
Caution Enabling Performance Schema on T2 instances is not recommended because it can easily run the
T2 instance out of memory.
When adding a monitoring instance for Amazon RDS, specify a unique name to distinguish it from the local
MySQL instance. If you do not specify a name, it will use the client’s host name.
Create the pmm user with the following privileges on the Amazon RDS instance that you want to monitor:
GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO 'pmm'@'%' IDENTIFIED BY 'pass' WITH
MAX_USER_CONNECTIONS 10;
GRANT SELECT, UPDATE, DELETE, DROP ON performance_schema.* TO 'pmm'@'%';
If you have Amazon RDS with a MySQL version prior to 5.5, REPLICATION CLIENT privilege is not available
there and has to be excluded from the above statement.
General system metrics are monitored by using the rds_exporter exporter which replaces node_exporter .
rds_exporter gives access to Amazon CloudWatch metrics.
node_exporter , used in versions of PMM prior to 1.8.0, was not able to monitor general system metrics
remotely.
The preferred method of adding an Amazon RDS database instance to PMM is via the
Configuration→PMM Inventory→Add Instance menu option.
This method supports Amazon RDS database instances that use Amazon Aurora, MySQL, or MariaDB
engines, as well as any remote PostgreSQL, ProxySQL, MySQL and MongoDB instances.
The following steps are needed to add an Amazon RDS database instance to PMM:
3. Enter the access key ID and the secret access key of your IAM user.
4. Click the Discover button for PMM to retrieve the available Amazon RDS instances.
5. For the instance that you would like to monitor, select the Start monitoring button.
6. You will see a new page with a number of fields. The list is divided into the following groups: Main details, RDS database, Labels, and Additional options. Some already known data, such as the AWS access key you already entered, is filled in automatically, and some fields are optional.
The Main details section allows you to specify the DNS hostname of your instance, the service name to use
within PMM, the port your service is listening on, and the database user name and password.
The Labels section allows you to specify labels for the environment, the AWS region and availability zone
to be used, the Replication set and Cluster names and also it allows you to set the list of custom labels in
a key:value format.
The Additional options section contains flags for tuning RDS monitoring. They allow you to skip the connection check, use TLS for the database connection, skip validation of the TLS certificate and hostname, and disable basic and/or enhanced metrics collection for the RDS instance to reduce costs.
This section also contains a database-specific flag that enables Query Analytics for the selected remote database:
• when adding a remote MySQL, AWS RDS MySQL or Aurora MySQL instance, you can choose Performance Schema for database monitoring;
• when adding a PostgreSQL instance, you can activate the pg_stat_statements extension;
• when adding a MongoDB instance, you can choose the Query Analytics MongoDB profiler.
7. Finally press the Add service button to start monitoring your instance.
3. Follow steps 4 to 6 as in the previous section. Fill the form and remember to select PG Stat Statement to
enable Query Analytics. To get queries for Query Analytics, you need to enable pg_stat_statements in your
instance by running:
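The command itself is not included above; enabling the extension is typically a single statement run against the instance (the endpoint placeholder and database name are assumptions):

```shell
# Enable pg_stat_statements on the RDS PostgreSQL instance.
psql -h your-rds-endpoint.rds.amazonaws.com -U pmm -d postgres \
  -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements"
```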
Required settings
It is possible to use PMM for monitoring Azure database instances like other remote instances. In this case,
the PMM Client is not installed on the host where the database server is deployed. By using the PMM web
interface, you connect to the Azure DB instance. Discovery is not yet implemented in PMM but it is possible
to add known instances by providing the connection parameters.
First of all, ensure that there is minimal latency between PMM Server and the Azure instance.
Second, add a firewall rule to enable access from PMM Client like this:
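The example rule is not shown above; with the Azure CLI, such a rule might look like this (the resource group, server name, and the PMM Client's IP address are all placeholders):

```shell
# Allow the PMM Client host through the Azure Database for MySQL firewall.
az mysql server firewall-rule create --resource-group my-resource-group \
  --server-name my-azure-mysql --name AllowPMMClient \
  --start-ip-address 203.0.113.10 --end-ip-address 203.0.113.10
```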
Query Analytics requires you to configure Performance Schema as the query source, because the slow query log is stored on the Azure side, and the QAN agent is not able to read it. Enable the performance_schema option in the server parameters of your Azure database instance.
When adding a monitoring instance for Azure, specify a unique name to distinguish it from the local MySQL
instance. If you do not specify a name, it will use the client’s host name.
Create the pmm user with the following privileges on the Azure database instance that you want to monitor:
GRANT SELECT, PROCESS, REPLICATION CLIENT ON *.* TO 'pmm'@'%' IDENTIFIED BY 'pass' WITH
MAX_USER_CONNECTIONS 10;
GRANT SELECT, UPDATE, DELETE, DROP ON performance_schema.* TO 'pmm'@'%';
Example:
and be sure to set Performance Schema as the query collection method for Query Analytics.
2.3.9 MariaDB
MariaDB up to version 10.2 works out of the box but starting with MariaDB 10.3 instrumentation is disabled
by default and cannot be enabled since there is no SUPER role in Azure-MariaDB. So, it is not possible to
run the required queries to enable instrumentation. Monitoring will work but Query Analytics won’t receive
any query data.
2.3.10 PostgreSQL
For PostgreSQL follow the same methods used for MySQL and MariaDB and enable track_io_timing in the
instance configuration to enable Query Analytics.
pg_stat_statements.track = all
You need to get the Client ID, Client Secret, Tenant ID and Subscription ID.
Navigate to:
When you fill in all fields press the Discover button and you will see a list of available databases for
monitoring.
• Microsoft.DBforMySQL/servers
• Microsoft.DBforMySQL/flexibleServers
• Microsoft.DBforMariaDB/servers
• Microsoft.DBforPostgreSQL/servers
• Microsoft.DBforPostgreSQL/flexibleServers
• Microsoft.DBforPostgreSQL/serversv2
• https://docs.microsoft.com/en-us/azure/postgresql/
• https://docs.microsoft.com/en-us/azure/mysql/
You will need to set pg_stat_statements.track = all in your PostgreSQL Server settings to use PMM Query Analytics.
In the list of databases on the Discovery page click Start Monitoring to add the selected Azure Database to
PMM.
• node_cpu_average
• azure_resource_info
• node_filesystem_size_bytes
• azure_memory_percent_average
• azure_storage_percent_average
• azure_storage_used_bytes_average
• node_network_receive_bytes_total
• node_network_transmit_bytes_total
• PMM Agent to collect query-related metrics using pg_stat_statements for PostgreSQL or Performance Schema for MySQL (MariaDB)
PMM can monitor MySQL or PostgreSQL instances hosted on the Google Cloud Platform.
MySQL
2. The database server must be accessible by PMM Client. If PMM Client is not also hosted on GCP, you will need to add a network interface with a public address.
3. Configure Performance Schema on the MySQL server. Using the GCP console’s Cloud Shell or your own
gcloud installation, run:
9. Check for values in the MySQL Instance Overview dashboard and in Query Analytics
PostgreSQL
2. The database server must be accessible by PMM Client. If PMM Client is not also hosted on GCP, you will need to add a network interface with a public address.
3. Configure pg_stat_statements . Open an interactive SQL session with your GCP PostgreSQL server and
run:
9. Check for values in the PostgreSQL Instance Overview dashboard and Query Analytics
MYSQL
• As a Docker container:
docker run -d \
-v ~/path/to/admin-api-file.json:/config \
-p 127.0.0.1:3306:3306 \
gcr.io/cloudsql-docker/gce-proxy:1.19.1 \
/cloud_sql_proxy \
-instances=example-project-NNNN:us-central1:mysql-for-pmm=tcp:0.0.0.0:3306 \
-credential_file=/config
• On Linux:
6. Add instance
POSTGRESQL
./cloud_sql_proxy -instances=example-project-NNNN:us-central1:pg-for-pmm=tcp:5432 \
-credential_file=/path/to/credential-file.json
6. Load extension:
7. Add service:
2.3.12 Linux
PMM collects Linux metrics automatically from the moment you configure your node for monitoring with pmm-admin config .
You can collect metrics from an external (custom) exporter on a node when:
• this node has been configured using the pmm-admin config command.
USAGE
• external collects metrics from an exporter running on the same host as PMM Client, which connects to it over a port. (See more details with pmm-admin add external --help .)
• external-serverless is useful for collecting metrics from cloud services. You need a host and port number to add it to PMM Server. (See more details with pmm-admin add external-serverless --help .)
PMM can collect any metrics in OpenMetrics or Prometheus exposition format. You must specify the host and port of these metrics using the pmm-admin add external or pmm-admin add external-serverless commands.
From this point, PMM will collect and store the available metrics.
As a first step to browse and visualize collected metrics, open the Advanced Data Exploration dashboard and select the services and metrics of interest.
THIRD-PARTY EXPORTERS
CUSTOM EXPORTER
You can write a custom external exporter or extend your application to expose metrics in Prometheus
format.
EXAMPLES
• Add an exporter running on local port 9256 to the group called processes .
• Use the group and host names to automatically generate a service name.
or by parsing required data from a URL string, in which case you only need to pass a valid URL.
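Sketches of the two cases above (the port, group name, and URL are illustrative):

```shell
# Exporter on local port 9256, added to the "processes" group; the service
# name is generated automatically from the group and host names:
pmm-admin add external --listen-port=9256 --group=processes

# Serverless variant, with host and port parsed from a URL:
pmm-admin add external-serverless --url=http://1.2.3.4:9093/metrics
```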
2.3.14 HAProxy
• After HAProxy is running (default address http://localhost:8404/metrics), you can add it to PMM.
• this node has been configured using the pmm-admin config command.
USAGE
where listen-port is the port number on which HAProxy is running. (This is the only required flag.)
Additionally, one positional argument can be appended to the command line flags: a service name to be used by PMM. If not specified, it is substituted automatically as <node>-haproxy .
A connection check is performed when the service is added (it can be skipped with the --skip-connection-check flag). If HAProxy is not running properly on the given port, you will see an error message:
Besides the positional argument shown above, you can specify additional options with the following flags: --username , --password , --metrics-path (the path for scraping metrics, default: /metrics) and --scheme (http or https).
Here are some examples:
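For instance (a sketch using only the flags listed above; the port, credentials, and service name are illustrative):

```shell
# Minimal form; only --listen-port is required:
pmm-admin add haproxy --listen-port=8404

# With credentials, explicit metrics path and scheme, and a positional service name:
pmm-admin add haproxy --listen-port=8404 --username=pmm --password=pass \
  --metrics-path=/metrics --scheme=http MY_HAPROXY_NODE
```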
3. Using
3.1 Using
• User Interface
• Finding dashboards
• Annotating events
• Integrated alerting
• Percona Platform
• Security Threat Tool: Enabling and seeing the results of database security checks
3.2.1 Logging in
1. Start a web browser and in the address bar enter the server name or IP address of the PMM server host.
3. Enter the username and password given to you by your system administrator. The defaults are:
• Username: admin
• Password: admin
4. Click Log in
5. If this is your first time logging in, you’ll be asked to set a new password. (We recommend you do.) Enter
a new password in both fields and click Submit. You can click Skip to carry on with the default password.
3.2.2 Dashboards
Dashboards are grouped into folders. You can customize these by renaming them or creating new ones.
The area inside dashboards is populated by panels. Some are in collapsible panel groups. A panel can show
a value, a graph, a chart, or a visual representation of a set.
3.2.3 Controls
2. Navigation bar
3. View controls
3.2.4 Navigation
There are two ways to open the dashboard search page. (Each takes you to the same search screen.)
• Click the dashboard name in the navigation bar (top row, to the right of the icon). (To search within
the current folder, click the folder name instead of the dashboard name.)
1. Click Search dashboards by name and begin typing any part of the dashboard name (in this example,
“Instances”).
2. Click one of the search results to go to that dashboard. Change the search text to refine the list.
3. To abandon the search, click the icon at the end of the search bar.
In the main menu, the PMM Dashboards icon reveals a submenu containing links to all PMM dashboards grouped by service type. (This menu will eventually replace the shortcut menu, which has links to commonly-used dashboards.)
3.2.5 Panels
Charts, graphs and set-based panels reveal extra information when the mouse is moved over them.
Some panels have an information icon in the top left corner. Mouse over this to reveal panel information.
Panel menu
At the top of each panel and to the right of the panel name is the panel menu.
Tip The menu is hidden until you mouse over it. Look for the symbol in the title bar of a panel.
Item Description
VIEW
The View menu item opens panels in full-window mode. This is useful for graphs with several metrics. Exit a panel’s full-window mode by pressing Escape or clicking the left arrow next to the dashboard name.
See also
3.3.1 Definitions
• Alerts are generated when their criteria (alert rules) are met; an alert is the result of an alert rule
expression evaluating to true.
• Alert rules are based on alert rule templates. We provide a default set of templates. You can also create
your own.
PMM’s Integrated Alerting is a customized and separate instance of the Prometheus Alertmanager, and
distinct from Grafana’s alerting functionality.
3.3.2 Prerequisites
Set up a communication channel: When the Communication tab appears, select it. Enter details for Email or
Slack. (Read more)
The Alerting menu also lists Alert Rules and Notification Channels. These are for Grafana’s alerting
functionality.
2. Click Add.
• Name
• Type
• Email:
• Addresses
• PagerDuty
• Routing key
• Service key
• Slack
• Channel
4. Click Add to add the notification channel, or Cancel to abort the operation.
2. Click Add.
• Template
• Name
• Threshold
• Duration(s)
• Severity
• Filters
• Channels
• Activate
4. Click Add to add the alert rule, or Cancel to abort the operation.
2. Click Add.
---
templates:
  - name: mysql_too_many_connections
    version: 1
    summary: MySQL connections in use
    tiers: [anonymous, registered]
    expr: |-
      max_over_time(mysql_global_status_threads_connected[5m]) / ignoring (job)
      mysql_global_variables_max_connections
      * 100
      > [[ .threshold ]]
    params:
      - name: threshold
        summary: A percentage from configured maximum
        unit: '%'
        type: float
        range: [0, 100]
        value: 80
    for: 5m
    severity: warning
    labels:
      foo: bar
    annotations:
      description: |-
        More than [[ .threshold ]]% of MySQL connections are in use on
        {{ $labels.instance }}
        VALUE = {{ $value }}
        LABELS: {{ $labels }}
      summary: MySQL too many connections (instance {{ $labels.instance }})
The parameters used in the template follow a format and might include different fields depending on
their type:
• name (required): the name of the parameter. Spaces and special characters are not allowed.
• type (required): PMM currently supports the float type. (More, such as string or bool, will be
available in the future.)
• range (optional): only for float parameters; defines the boundaries for the value.
Restrictions
• Value strings must not include any of these special characters: < > ! @ # $ % ^ & * ( )
_ / \ ' + - = (space)
4. Click Add to add the alert rule template, or Cancel to abort the operation.
3.3.7 Video
This short (3:36) video shows how to activate and configure Integrated Alerting.
Query Analytics supports MySQL, MongoDB and PostgreSQL. The minimum requirements for MySQL are:
Query Analytics displays metrics in both visual and numeric form. Performance-related characteristics
appear as plotted graphics with summaries.
• Filters Panel
• Overview Panel
• Details Panel
Query Analytics data retrieval is not instantaneous and can be delayed due to network conditions. In such
situations no data is reported and a gap appears in the sparkline.
• The Filter panel occupies the left side of the dashboard. It lists filters, grouped by category. Selecting
one reduces the Overview list to those items matching the filter.
• The first five of each category are shown. If there are more, the list is expanded by clicking Show all
beside the category name, and collapsed again with Show top 5.
• Applying a filter may make other filters inapplicable. These become grayed out and inactive.
• Separately, the global Time range setting filters results by time, either your choice of Absolute time
range, or one of the predefined Relative time ranges.
To the right of the Filters panel and occupying the upper part of the dashboard is the Overview panel.
Each row of the table represents the metrics for a chosen object type, one of:
• Query
• Service Name
• Database
• Schema
• User Name
• Client Host
At the top of the second column is the dimension menu. Use this to choose the object type.
On the right side of the dimension column is the Dimension Search bar.
Enter a string and press Enter to limit the view to only those queries containing the specified keywords.
Delete the search text and press Enter to see the full list again.
Columns
• The first column is the object’s identifier. For Query, it is the query’s Fingerprint.
• The second column is the Main metric, containing a reduced graphical representation of the metric over
time, called a sparkline, and a horizontal meter, filled to reflect a percentage of the total value.
Tool-tips
• For the Query dimension, hovering over the information icon reveals the query ID and its example.
• Hovering on the main metric sparkline highlights the data point and a tooltip shows the data value
under the cursor.
• Hovering on the main metric meter reveals the percentage of the total, and other details specific to the
main metric.
• Hovering on column values reveals more details on the value. The contents depend on the type of
value.
• When clicked, a text field and list of available metrics are revealed. Select a metric or enter a search
string to reduce the list. Selecting a metric adds it to the panel.
• A metric column is removed by clicking on the column heading and selecting Remove column.
• The value plotted in the main metric column can be changed by clicking a metric column heading and
selecting Swap with main metric.
Sorting
• Click either the up or down caret to sort the list by that column’s ascending or descending values.
Pagination
• The pagination device lets you move forwards or backwards through pages, jump to a specific page,
and choose how many items are listed per page.
• Selecting an item in the Overview panel opens the Details panel with a Details Tab.
• If the dimension is Query, the panel also contains the Examples Tab, Explain Tab, and Tables Tab.
Details Tab
The Details tab contains a Query time distribution bar (only for MySQL databases) and a set of Metrics in
collapsible subpanels.
• The Query time distribution bar shows a query’s total time made up of colored segments, each
segment representing the proportion of time spent on one of the following named activities:
• blk_read_time : Total time the statement spent reading blocks (if track_io_timing is enabled,
otherwise zero).
• blk_write_time : Total time the statement spent writing blocks (if track_io_timing is enabled,
otherwise zero).
• innodb_queue_wait : Time the query spent either waiting to enter the InnoDB queue, or in it pending
execution.
• Metric: The Metric name, with a question-mark tool-tip that reveals a description of the metric on
mouse-over.
• Sum: A summation of the metric for the selected query, and the percentage of the total.
• Each row in the table is a metric. The contents depend on the chosen dimension.
Examples Tab
The Examples tab shows an example of the selected query’s fingerprint or table element.
Explain Tab
The Explain tab shows the explain output for the selected query, in Classic or JSON formats:
Tables Tab
The Tables tab shows information on the tables and indexes involved in the selected query.
MongoDB is conceptually different from relational database management systems, such as MySQL and
MariaDB.
Relational database management systems store data in tables that represent single entities. Complex
objects are represented by linking tables.
In contrast, MongoDB uses the concept of a document where all essential information for a complex object
is stored in one place.
Query Analytics can monitor MongoDB queries. Although MongoDB is not a relational database management
system, you analyze its databases and collections in the same interface using the same tools.
• DBaaS (Alpha)
The Security Threat Tool runs regular checks against connected databases, alerting you if any servers pose
a potential security threat.
All checks run on the PMM Client side. Results are sent to PMM Server where a summary count is shown on
the Home Dashboard, with details in the PMM Database Checks dashboard.
Checks are automatically downloaded from Percona Platform and run every 24 hours. (This period is not
configurable.)
Check results data always remains on the PMM Server. It is not related to anonymous data sent for
Telemetry purposes.
The Failed security checks panel on the Home Dashboard shows the number of failed checks classed as
critical (red), major (amber), and trivial (blue).
Key
Details are in the PMM Database Checks dashboard (select PMM→PMM Database Checks).
How to enable
The Security Threat Tool (STT) is disabled by default. Enable it in PMM Settings→Advanced Settings.
Enabling STT in the settings also causes PMM Server to download STT checks from Percona Platform
and run them once. This operation runs in the background: although the settings update finishes
instantly, it may take some time for the checks to download and run, and for any results to appear in
the PMM Database Checks dashboard.
3. In the Actions column for a chosen check, click the Interval icon
5. Click Save
Check ID Description
postgresql_super_role PostgreSQL has users (besides postgres , rdsadmin , and pmm_user ) with the role
‘SUPER’
The DBaaS dashboard is where you add, remove, and operate on Kubernetes and database clusters.
To activate the DBaaS feature, select PMM → PMM Settings → Advanced settings. Then turn on the
DBaaS feature by clicking on toggle in the Technical preview features section of the page.
Kubernetes clusters
2. Enter values for the Kubernetes Cluster Name and Kubeconfig file in the corresponding fields.
3. Click Register.
4. A message will display briefly, telling you whether the registration succeeded.
You can’t unregister a Kubernetes cluster if there are DB clusters associated with it.
1. Click Unregister.
1. Find the row with the Kubernetes cluster you want to see.
2. In the Actions column, open the menu and click Show configuration.
Administrators can select allowed and default component versions for each cluster.
1. Find the row with the Kubernetes cluster you want to manage.
2. In the Actions column, open the menu and click Manage versions.
4. Activate or deactivate allowed versions, and select a default in the Default menu.
5. Click Save.
DB clusters
ADD A DB CLUSTER
a. Enter a value for Cluster name that complies with domain naming rules.
Small, Medium and Large are fixed preset values for Memory, CPU, and Disk.
Beside each resource type is an estimate of the required and available resources represented
numerically in absolute and percentage values, and graphically as a colored, segmented bar showing
the projected ratio of used to available resources. A red warning triangle is shown if the requested
resources exceed those available.
5. When both Basic Options and Advanced Options section icons are green, the Create Cluster button
becomes active. (If inactive, check the values for fields in sections whose icon is red.)
• Connection:
• DB Cluster Parameters:
• Cluster Status:
DELETE A DB CLUSTER
1. Find the row with the database cluster you want to delete.
Important Deleting a cluster in this way also deletes any attached volumes.
EDIT A DB CLUSTER
2. Find the row with the database cluster you want to change.
RESTART A DB CLUSTER
3. In the Actions column, open the menu and click the required action:
• For active clusters, click Suspend.
4. How to
4.1 How to
• Configure via the PMM Settings page
• Troubleshoot:
• Integrated Alerting
4.2 Configure
The Settings page is where you configure PMM.
Open the Settings page from the main menu with Configuration→Settings. The page opens with the
Metrics Resolution settings tab selected.
• Configure
• Metrics resolution
• Advanced Settings
• Data Retention
• Telemetry
• Public address
• DBaaS
• Integrated Alerting
• Public Address
• SSH Key
• Alertmanager integration
• Percona Platform
• Login
• Sign up
• Password Reset
• Password Forgotten
• Communication
• Slack
Diagnostics
On all tabs is a Diagnostics section (top-right). Click Download server diagnostics to retrieve PMM
diagnostics data which can be examined and/or shared with our support team should you need help.
Metrics are collected at three intervals representing low, medium and high resolutions.
The Metrics Resolution settings tab contains a radio button with three fixed presets (Rare, Standard and
Frequent) and one editable custom preset (Custom).
Each preset is a group of low, medium and high resolutions. The values are in seconds.
Short time intervals are high resolution metrics. Longer time intervals are low resolution. So:
• A low resolution interval increases the time between collection, resulting in low-resolution metrics and
lower disk usage.
• A high resolution interval decreases the time between collection, resulting in high-resolution metrics
and higher disk usage.
The default values (in seconds) for the fixed presets and their resolution names are:

Editable  Preset    Low  Medium  High
No        Standard  60   10      5
No        Frequent  30   5       1
Yes       Custom    60   10      5 (defaults)
Values for the Custom preset can be entered as values, or changed with the arrows.
If there is poor network connectivity between PMM Server and PMM Client, or between PMM Client and the
database server it is monitoring, scraping every second may not be possible when the network latency is
greater than 1 second.
Data Retention
Data retention specifies how long data is stored by PMM Server. By default, time-series data is stored for 30
days. You can adjust the data retention time to balance your system’s available disk space with your
metrics history requirements.
Telemetry
The Telemetry switch enables gathering and sending basic, anonymous data to Percona, which helps us
determine where to focus development and gauge the uptake of the various versions of PMM.
Specifically, this information helps us decide whether we need to release patches to legacy versions
beyond support, when supporting a particular version is no longer necessary, and how release frequency
encourages or deters adoption.
• PMM Version,
We do not gather anything that would make the system identifiable, but two things are worth
mentioning:
1. The Country Code is evaluated from the submitting IP address before it is discarded.
2. We do create an “instance ID”, a random string generated using UUID v4. This instance ID
distinguishes new instances from existing ones and helps us identify instance upgrades.
The first telemetry reporting of a new PMM Server instance is delayed by 24 hours to allow sufficient time to
disable the service for those that do not wish to share any information.
There is a landing page for this service, available at check.percona.com, which clearly explains what this
service is, what it’s collecting, and how you can turn it off.
Grafana’s anonymous usage statistics are not managed by PMM. To activate them, you must change the PMM
Server container configuration after each update.
In addition to the PMM Settings page, you can disable telemetry with the -e DISABLE_TELEMETRY=1
option in your docker run statement for the PMM Server.
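For illustration, a hedged sketch of such a docker run invocation (the image tag, container name, and port mapping are typical values, not prescriptions; the command is echoed so the sketch can be checked without a Docker host):

```shell
# Sketch: create PMM Server with telemetry disabled from the start.
# The image name and port mapping are conventional, not mandatory.
# 'echo' prints the command for review; remove it to actually run.
echo docker run -d -p 443:443 --name pmm-server \
  -e DISABLE_TELEMETRY=1 percona/pmm-server:2
```

Because the flag takes effect at container creation, it must be supplied when the container is first run, not added to a running container.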
• If the Security Threat Tool is enabled in PMM Settings, Telemetry is automatically enabled.
When active, PMM will automatically check for updates and put a notification in the home page Updates
dashboard if any are available.
The Security Threat Tool performs a range of security-related checks on a registered instance and reports
the findings. It is off by default.
To see the results of checks, select PMM Database Checks to open the Security Checks/Failed Checks
dashboard, and select the Failed Checks tab.
Checks are re-fetched and re-run at intervals. There are three named intervals:
Rare interval: 78 hours
Standard interval: 24 hours
Frequent interval: 4 hours
The address or hostname PMM Server will be accessible at. Click Get from browser to have your browser
detect and populate this field automatically.
DBaaS
Caution DBaaS functionality is a technical preview that must be turned on with a server feature flag. See
DBaaS.
Integrated Alerting
Public Address
This section lets you upload your public SSH key to access the PMM Server via SSH (for example, when
accessing PMM Server as a virtual appliance).
Enter your public key in the SSH Key field and click Apply SSH Key.
Alertmanager manages alerts, de-duplicating, grouping, and routing them to the appropriate receiver or
display component.
This section lets you configure integration of VictoriaMetrics with an external Alertmanager.
• The Alertmanager URL field should contain the URL of the Alertmanager which would serve your PMM
alerts.
• The Prometheus Alerting rules field is used to specify alerting rules in the YAML configuration format.
Fill both fields and click the Apply Alertmanager settings button to proceed.
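As a hedged illustration only, here is a minimal rule in the Prometheus YAML format that the Prometheus Alerting rules field accepts (the alert name, duration, and labels are invented placeholders, not recommendations; up == 0 is a standard Prometheus expression that fires when a scrape target stops reporting):

```yaml
groups:
  - name: example            # placeholder group name
    rules:
      - alert: InstanceDown  # placeholder alert name
        expr: up == 0        # fires when a target stops reporting
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} is down"
```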
This panel is where you create your Percona Platform account, and where you log into and out of it.
Login
If you have a Percona Platform account, enter your credentials and click Login.
Sign up
1. Click Sign up
4. Select the check box acknowledging our terms of service and privacy policy
5. Click Sign up
A brief message will confirm the creation of your new account and you may now log in with these
credentials.
Your Percona Platform account is separate from your PMM User account.
Password Reset
PASSWORD FORGOTTEN
If you forget your password, click the Forgot password link on the login page.
You will be redirected to a password reset page. Enter the email address you registered with and click
Reset via Email.
If you did not forget your password but still want to change it, go to https://okta.percona.com/
enduser/settings (make sure you are logged in).
Enter your current password and the new password in the form at the bottom right of the page. If you
cannot see the form, click the green Edit Profile button (you will be prompted for your password).
Click Change Password. If everything goes well, you will see a confirmation message.
4.2.7 Communication
If there is no Communication tab, go to the Advanced Settings tab and activate Integrated Alerting.
• Server Address: The default SMTP smarthost used for sending emails, including port number.
• None
• Plain
• Login
• CRAM-MD5
Slack
See also
4.3 Upgrade
Upgrade PMM Server before upgrading PMM Clients.
PMM Server can run natively, as a Docker image, a virtual appliance, or an AWS cloud instance. Each has
its own installation and update steps.
The preferred and simplest way to update PMM Server is with the PMM Upgrade panel on the Home page.
If one is available, click the update button to update to the version indicated.
Because of the significant architectural changes between PMM1 and PMM2, there is no direct upgrade path.
The approach to making the switch from PMM version 1 to 2 is a gradual transition, outlined in this blog
post.
In short, it involves first standing up a new PMM2 server on a new host and connecting clients to it. As new
data is reported to the PMM2 server, old metrics will age out during the course of the retention period (30
days, by default), at which point you’ll be able to shut down your existing PMM1 server.
Any alerts configured through the Grafana UI will have to be recreated because the target dashboard IDs
do not match between PMM1 and PMM2. For alerting, we recommend moving to Alertmanager recipes in
PMM2, which, for the time being, requires a separate Alertmanager instance. However, we are
working on integrating this natively into PMM2 Server and expect to support your existing Alertmanager
rules.
4.4 Secure
You can improve the security of your PMM installation with:
pmm-admin status
Tip You can gain an extra level of security by keeping PMM Server isolated from the internet, if possible.
You need valid SSL certificates to encrypt traffic between client and server.
With our Docker, OVF and AMI images, self-signed certificates are in /srv/nginx .
Mounting certificates
• The certificates must be owned by root. You can do this with: sudo chown 0:0 /etc/pmm-certs/*
• The mounted certificate directory ( /etc/pmm-certs in this example) must contain the files
certificate.crt , certificate.key , ca-certs.pem and dhparam.pem .
• For SSL encryption, the container must publish on port 443 instead of 80.
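Putting the requirements above together, a hedged sketch of a docker run invocation with the certificate directory mounted over /srv/nginx and port 443 published (the host path and container name are illustrative; the command is echoed so the sketch can be checked without a Docker host):

```shell
# Sketch: run PMM Server with host certificates mounted where the
# image keeps them (/srv/nginx) and SSL published on port 443.
# 'echo' prints the command for review; remove it to actually run.
echo docker run -d -p 443:443 \
  --volume /etc/pmm-certs:/srv/nginx \
  --name pmm-server percona/pmm-server:2
```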
Copying certificates
If PMM Server is running as a Docker image, use docker cp to copy certificates. This example copies
certificate files from the current working directory to a running PMM Server docker container.
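A hedged sketch of the copy step described, using the file names from the mounting section above (the container name pmm-server is assumed; commands are echoed so the sketch can be checked without a running container):

```shell
# Sketch: copy each required certificate file from the current
# directory into the running container's /srv/nginx directory.
for f in certificate.crt certificate.key ca-certs.pem dhparam.pem; do
  # 'echo' prints each command for review; remove it to actually run.
  echo docker cp "$f" pmm-server:/srv/nginx/
done
```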
To enable:
1. Start a shell within the Docker container: docker exec -it pmm-server bash
2. Edit /etc/grafana/grafana.ini
4.5 Optimize
4.5.1 Improving PMM Performance with Table Statistics Options
If a MySQL instance has a lot of schemas or tables, there are two options to help improve the performance
of PMM when adding instances with pmm-admin add : --disable-tablestats and --disable-tablestats-limit .
• These settings can only be used when adding an instance. To change them, you must remove and re-
add the instances.
• You can only use one of these options when adding an instance.
When adding an instance with pmm-admin add , the --disable-tablestats option disables table statistics
collection when there are more than the default number (1000) of tables in the instance.
USAGE
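The original usage line is not reproduced here; as a hedged sketch, an invocation of the kind this section describes (the service name, address, and credentials are placeholders; the command is echoed so the sketch can be checked without a PMM installation):

```shell
# Sketch: add a MySQL service with table statistics collection disabled.
# mysql-prod, 127.0.0.1:3306, and the credentials are placeholders.
# 'echo' prints the command for review; remove it to actually run.
echo pmm-admin add mysql --disable-tablestats \
  --username=pmm --password=pass mysql-prod 127.0.0.1:3306
```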
4.5.3 Change the number of tables beyond which per-table statistics is disabled
When adding an instance with pmm-admin add , the --disable-tablestats-limit option changes the number
of tables (from the default of 1000) beyond which per-table statistics collection is disabled.
USAGE
EXAMPLE
Add a MySQL instance, disabling per-table statistics collection when the number of tables in the instance
reaches 2000.
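As a hedged sketch of that example (service name, address, and credentials are placeholders; the command is echoed so the sketch can be checked without a PMM installation):

```shell
# Sketch: disable per-table statistics once the instance has more
# than 2000 tables. Name, address, and credentials are placeholders.
# 'echo' prints the command for review; remove it to actually run.
echo pmm-admin add mysql --disable-tablestats-limit=2000 \
  --username=pmm --password=pass mysql-prod 127.0.0.1:3306
```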
4.6 Annotate
Annotations mark a moment in time. They are useful for marking system changes or other significant
application events. They can be set globally or for specific nodes or services.
You create them on the command line with the pmm-admin annotate command.
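For instance, a hedged sketch of a global annotation (the annotation text is a placeholder; the command is echoed so the sketch can be checked without a PMM installation):

```shell
# Sketch: mark a deployment moment with a global annotation.
# The annotation text is a placeholder.
# 'echo' prints the command for review; remove it to actually run.
echo pmm-admin annotate "Deployed app v1.2.0"
```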
Annotations show as a vertical dashed line on a dashboard graph. Reveal the annotation text by mousing
over the caret indicator below the line.
You turn annotations on or off with the PMM Annotations switch in the second row menu bar.
4.7 Backup
1. Navigate to PMM Settings→Advanced Settings
5. Click Add
• Name:
• Description:
• Type:
• S3:
• Local Client:
• Local Server:
• Endpoint:
• Bucket Name:
• Access Key:
• Secret Key:
3. Restart Grafana.
4. Install libraries.
5. A new browser tab opens. Wait for the image to be rendered then use your browser’s image save function
to download the image.
If the necessary plugins are not installed, a message in the Share Panel will say so.
4.9 Troubleshoot
4.9.1 Update
If PMM Server wasn’t updated properly, or if you have concerns about the release, you can force the update
process in two ways:
1. From the UI, Home panel: Alt-click the reload icon in the Update panel to make the Update button
visible even if you are on the same version as is available for update. Pressing this button forces the
system to rerun the update so that any broken or missing components can be reinstalled. You will go
through the usual update process, with update logs and success messages at the end.
2. By API call (if the UI is not available): you can call the server’s Update API directly.
Replace admin:admin with your username and password, and replace PMM_SERVER with your server address.
Refresh the Home page after 2 to 5 minutes and you should see that PMM was updated.
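A hedged sketch of such an API call (the /v1/Updates/Start endpoint is an assumption based on the PMM 2 server API; -k allows self-signed certificates; the command is echoed so the sketch can be checked without a server):

```shell
# Sketch: trigger the server update via the API. PMM_SERVER and
# admin:admin are placeholders; the endpoint path is assumed.
# 'echo' prints the command for review; remove it to actually run.
echo curl -X POST -k -u admin:admin https://PMM_SERVER/v1/Updates/Start
```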
Broken network connectivity can have many causes. In particular, when using Docker, the container is
constrained by host-level routing and firewall rules. For example, your hosting provider might have
default iptables rules on their hosts that block communication between PMM Server and PMM Client,
resulting in DOWN targets in VictoriaMetrics. If this happens, check the firewall and routing settings on
the Docker host.
PMM is also able to generate diagnostics data which can be examined and/or shared with our support team
to help quickly solve an issue. You can get collected logs from PMM Client using the pmm-admin summary
command.
Logs obtained in this way include PMM Client logs and logs received from the PMM Server,
stored separately in the client and server folders. The server folder also contains its own client
subfolder with the self-monitoring client information collected on the PMM Server.
Beginning with PMM version 2.4.0, there is an additional flag that enables the fetching of pprof debug
profiles and adds them to the diagnostics data. To enable, run pmm-admin summary --pprof .
• Go to PMM > PMM Settings and click Download server diagnostics. (See Diagnostics in PMM Settings.)
You are not logged in as a privileged user. You need either Admin or Editor roles to work with Integrated
Alerting.
When I get an email or page from my system, the IP is not reachable from outside my organization. How
do I fix this?
You can configure your PMM Server’s Public Address by navigating to PMM → PMM Settings → Advanced
Settings, and supply an address to use in your alert notifications.
There’s already an Alertmanager integration tab without me turning it on, I know because I was using
your existing Alertmanager integration.
Before you can use a notification channel you must provide your connection details.
For PagerDuty, you can configure this in the notification channel tab of Integrated Alerting by supplying
your server/routing key.
In configuring my email server I’m being asked for a Username and Password as well as Identity and
Secret. What is the difference between these and which do I use or do I need both?
However, you can copy them and edit the copies (PMM 2.14.0 and later).
If you create a custom alert rule template, you will be able to edit it.
Creating rules
I’m ready to create my first rule! I’ve chosen a template and given it a name…what is the format of the
fields?
• Threshold - a float value; its meaning differs depending on which template is used
• Key must be an exact match. You can find a complete list of keys by using the Explore main
menu item in PMM
• Value is an exact match or, when used with a ‘fuzzy’ evaluator (=~), can be a regular expression,
e.g. service_name=~ps.*
Variables in Templates
The concept of “template” implies things like variable substitutions…where can I use these? Where can I
find a complete list of them?
5. Details
5.1 Details
• Architecture: high-level architecture and main components.
• Commands:
• pmm-agent: The manual page for the PMM Client agent program
• VictoriaMetrics: the third-party monitoring solution and time-series database that replaced Prometheus
in PMM 2.12.0
5.2 Architecture
PMM works on the client/server principle, where a single server instance communicates with one or more
clients.
Except when monitoring AWS RDS instances, a PMM Client must be running on the host to be monitored.
• Exporters for each database and service type. When an exporter runs, it connects to the database or
service instance, runs the metrics collection routines, and sends the results to PMM Server.
• pmm-agent : Run as a daemon process, it starts and stops exporters when instructed.
• vmagent : A VictoriaMetrics daemon process that sends metrics data (pushes) to PMM Server.
• pmm-managed
• Query Analytics
• Grafana
• VictoriaMetrics
• Query Analytics (QAN) enables you to analyze MySQL query performance over periods of time. In
addition to the client-side QAN agent, it includes the following:
• QAN API is the back-end for storing and accessing query data collected by the QAN agent running
on a PMM Client.
• QAN Web App is a web application for visualizing collected Query Analytics data.
• Metrics Monitor provides a historical view of metrics that are critical to a MySQL or MongoDB server
instance. It includes the following:
• Grafana is a third-party dashboard and graph builder for visualizing data aggregated (by
VictoriaMetrics or Prometheus) in an intuitive web interface.
PMM Client
• pmm-admin is a command-line tool for managing PMM Client, for example, adding and removing
database instances that you want to monitor. (Read more.).
• pmm-agent is a client-side component with a minimal command-line interface. It is the central entry
point for client functionality: it handles the client’s authentication, retrieves the client
configuration stored on the PMM Server, and manages exporters and other agents.
To make data transfer from PMM Client to PMM Server secure, all exporters can use SSL/TLS-encrypted
connections, and their communication with the PMM Server is protected by HTTP basic
authentication.
5.3 UI components
1. Main menu (also Grafana menu, side menu)
2. Navigation bar
3. View controls
The main menu is part of the Grafana framework and is visible on every page.
Configuration
Server Admin
PMM Database
Checks
DBaaS
The DBaaS icon appears only if a server feature flag has been set.
The Backup Management icon appears when Backup Management is activated in PMM Settings→Advanced
Settings.
Help
Navigation bar
(Display only)
Mark as favorite
Share dashboard
View controls
Dashboard settings
Refresh dashboard
View selectors
This menu bar is context sensitive; it changes according to the page you are on. (With wide menus on small
screens, items may wrap to the next row.)
Item Description
Shortcut menu
This menu contains shortcuts to other dashboards. The list changes according to the page you’re on.
Important This menu will be removed in future releases. Its function will be replaced by the PMM
Dashboards main menu entry.
Item Description
HA HA dashboards
The Compare menu links to the Instances Overview dashboard for the current service type.
Services menu
Services
PMM menu
Menu Item
PMM
PMM Inventory
PMM Settings
5.4 Dashboards
5.4.1 Dashboards
Insight
• Home Dashboard
• VictoriaMetrics
PMM
• PMM Inventory
OS Dashboards
• Disk Details
• Network Details
• Memory Details
• Nodes Compare
• Nodes Overview
• Node Summary
• NUMA Details
• Processes Details
Prometheus Dashboards
MySQL Dashboards
MongoDB Dashboards
PostgreSQL Dashboards
ProxySQL Dashboards
HA Dashboards
5.4.2 Insight
Home Dashboard
The Home Dashboard is a high-level overview of your environment, the starting page of the PMM portal
from which you can open the tools of PMM, and browse to online resources.
On the PMM home page, you can also find the version number and a button to update your PMM Server.
GENERAL INFORMATION
This section contains links to online resources, such as PMM documentation, release notes, and blogs.
This section is automatically updated to show the most recent dashboards that you worked with. It also
contains the dashboards that you have bookmarked.
STATISTICS
This section shows the total number of hosts added to PMM and the total number of database instances
being monitored. It also shows the current version number. Use the Check for Updates Manually button
to see if you are using the most recent version of PMM.
ENVIRONMENT OVERVIEW
This section lists all added hosts along with essential information about their performance. For each host,
you can find the current values of the following metrics:
• CPU Busy
• Memory Available
• Disk Reads
• Disk Writes
• Network IO
• DB Connections
• DB QPS
• Virtual CPUs
• RAM
• Host Uptime
• DB Uptime
The Advanced Data Exploration dashboard provides detailed information about the progress of a single
Prometheus metric across one or more hosts.
A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.
Gauges are typically used for measured values like temperatures or current memory usage, but also
“counts” that can go up and down, like the number of running goroutines.
A counter is a cumulative metric that represents a single numerical value that only ever goes up. A counter
is typically used to count requests served, tasks completed, errors occurred, etc. Counters should not be
used to expose current counts of items whose number can also go down, e.g. the number of currently
running goroutines. Use gauges for this use case.
METRIC RATES
Shows the number of samples per second stored for a given interval in the time series.
This dashboard supports metrics related to NUMA. The names of all these metrics start with
node_memory_numa .
VictoriaMetrics
No description
No description
5.4.3 PMM
PMM Inventory
The Inventory dashboard is a high-level overview of all objects PMM “knows” about.
It contains three tabs (Services, Agents, and Nodes) listing the corresponding objects and details
about them, so that users can better understand which objects are registered against PMM Server.
These objects form a hierarchy with Node at the top, then Services and Agents assigned to a Node.
• Nodes – Where the service and agents will run. Assigned a node_id , associated with a machine_id
(from /etc/machine-id ). Few examples are bare metal, virtualized, container.
• Services – Individual service names and where they run, against which agents will be assigned. Each
instance of a service gets a service_id value that is related to a node_id . Examples are MySQL,
Amazon Aurora MySQL. This feature also allows to support multiple mysqld instances on a single node,
with different service names, e.g. mysql1-3306, and mysql1-3307.
• Agents – Each binary (exporter, agent) running on a client will get an agent_id value.
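The Node → Service → Agent hierarchy above can be sketched as a small data model (the classes are hypothetical illustrations, not PMM's internal types; field names mirror the IDs mentioned above):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Agent:
    agent_id: str          # one ID per exporter/agent binary on a client

@dataclass
class Service:
    service_id: str
    name: str              # e.g. "mysql1-3306"
    node_id: str           # the node this service runs on
    agents: List[Agent] = field(default_factory=list)

@dataclass
class Node:
    node_id: str
    machine_id: str        # taken from /etc/machine-id
    services: List[Service] = field(default_factory=list)

# Two mysqld instances on one node, each registered as its own Service:
node = Node(node_id="n1", machine_id="abc123")
node.services.append(Service("s1", "mysql1-3306", "n1", [Agent("a1")]))
node.services.append(Service("s2", "mysql1-3307", "n1", [Agent("a2")]))
```

The point of the per-instance service_id is visible here: both services share a node_id but remain distinct objects.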
5.4.4 OS Dashboards
The Overall CPU Utilization metric shows how much of the overall CPU time is used by the server. It has
these components:
System
This component is the proportion of time the CPUs spent inside the Linux kernel for operations like context
switching, memory allocation and queue handling.
User
This component is the time spent in the user space. Normally, most of the MySQL CPU time is in user
space. A high value of user time indicates a CPU bound workload.
Softirq
This component is the portion of time the CPU spent servicing software interrupts generated by the device
drivers. A high value of softirq may indicate a poorly configured device. Network devices are
generally the main source of high softirq values.
Steal
When multiple virtual machines share the same physical host, some virtual machines may be allowed to
use more of their share of CPU and that CPU time is accounted as Steal by the virtual machine from which
the time is taken.
Iowait
This component is the time the CPU spent waiting for disk IO requests to complete. A high value of iowait
indicates a disk bound load.
Nice
This component is the time spent running low-priority user processes, that is, processes with a positive
nice value.
This metric presents global values: while there may be a lot of unused CPU, a single core may be saturated.
Look at the Max Core Utilization to see if any core is reaching close to 100%.
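On Linux these components come from the per-CPU tick counters in /proc/stat . A hedged sketch of turning two samples into the percentages shown on the dashboard (field order follows proc(5); the sample values are made up):

```python
# Compute CPU time percentages from two /proc/stat "cpu" samples.
# Field order per proc(5): user nice system idle iowait irq softirq steal.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counts."""
    parts = line.split()
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))

def cpu_percentages(sample1, sample2):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    delta = {k: sample2[k] - sample1[k] for k in FIELDS}
    total = sum(delta.values()) or 1
    return {k: 100.0 * v / total for k, v in delta.items()}

before = parse_cpu_line("cpu 100 0 50 800 30 0 10 10")
after  = parse_cpu_line("cpu 400 0 150 1200 80 0 60 110")
pct = cpu_percentages(before, after)
# pct["user"] is the User component, pct["iowait"] is Iowait,
# pct["steal"] is Steal, and so on.
```

The counters are cumulative, so only deltas between two samples are meaningful, which is exactly why the dashboard shows rates over an interval.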
This shows the total utilization of each CPU core along with the average utilization of all CPU cores. Watch
for any core close to 100% utilization and investigate the root cause.
Disk Details
Shows the percentage of disk space utilization for every mount point defined on the system. Having some of
the mount points close to 100% space utilization is not good because of the risk of a “disk full” error that
can block one of the services or even cause a crash of the entire system.
In cases where the mount point is close to 100% consider removing unused files or expanding the space
allocated to the mount point.
MOUNT POINT
Shows information about the disk space usage of the specified mount point.
Having Free close to 0 B is not good because of the risk of a “disk full” error that can block one of the
services or even cause a crash of the entire system.
In cases where Free is close to 0 B consider removing unused files or expanding the space allocated to the
mount point.
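Outside PMM, the same check can be sketched with Python's standard shutil.disk_usage ; the 90% warning threshold below is an arbitrary example, not a PMM default:

```python
import shutil

def mount_point_report(path, warn_percent=90.0):
    """Return (used_percent, warning) for the filesystem containing path."""
    usage = shutil.disk_usage(path)   # total, used, free in bytes
    used_percent = 100.0 * usage.used / usage.total
    return used_percent, used_percent >= warn_percent

used, warn = mount_point_report("/")
print(f"/ is {used:.1f}% full; warning={warn}")
```

A cron job running a check like this is a crude stand-in for the dashboard's alerting on near-full mount points.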
DISK LATENCY
Shows the average latency for reads and writes per I/O device. Higher than typical latency for highly loaded
storage indicates saturation (overload) and is a frequent cause of performance problems. Higher than normal
latency can also indicate internal storage problems.
DISK OPERATIONS
Shows the number of physical I/Os (reads and writes) that different devices are serving. Spikes in the number
of I/Os served often correspond to performance problems due to I/O subsystem overload.
DISK BANDWIDTH
Shows the volume of reads and writes the storage is handling. This can be a better measure of I/O capacity
usage for network-attached and SSD storage, as it is often bandwidth limited. The amount of data being
written to the disk can also be used to estimate Flash storage lifetime.
DISK LOAD
Shows disk load for reads and writes as the average number of outstanding requests over a given period of
time. High disk load is a good measure of actual storage utilization. Different storage types handle load
differently: some show latency increases under low load, while others can handle higher load with no
problems.
DISK IO UTILIZATION
Shows disk utilization as the percentage of time when there was at least one I/O request in flight. It is
designed to match the utilization reported by the iostat tool. It is not a very good measure of true I/O
capacity utilization; consider looking at the I/O latency and Disk Load graphs instead.
Shows how effectively the operating system is able to merge logical I/O requests into physical requests. This
is a good measure of I/O locality, which can be used for workload characterization.
DISK IO SIZE
Network Details
This section reports the inbound speed, outbound speed, traffic errors and drops, and retransmit rate.
NETWORK TRAFFIC
This section contains the Network traffic and network utilization hourly metrics.
• Netstat: TCP
• TCP segments
• Netstat: UDP
• UDP Lite
InDatagrams
Packets received
OutDatagrams
Packets sent
InCsumErrors
InErrors
RcvbufErrors
SndbufErrors
NoPorts
ICMP
• ICMP Errors
• Messages/Redirects
• Echos
• Timestamps/Mask Requests
ICMP Errors
InErrors
Messages which the entity received but determined as having ICMP-specific errors (bad ICMP checksums,
bad length, etc.)
OutErrors
Messages which this entity did not send due to problems discovered within ICMP, such as a lack of buffers
InDestUnreachs
OutDestUnreachs
InType3
Destination unreachable
OutType3
Destination unreachable
InCsumErrors
InTimeExcds
Messages/Redirects
InMsgs
Messages which the entity received. Note that this counter includes all those counted by icmpInErrors
InRedirects
OutMsgs
Messages which this entity attempted to send. Note that this counter includes all those counted by
icmpOutErrors
OutRedirects
Redirect messages sent. For a host, this object will always be zero, since hosts do not send redirects
Echos
InEchoReps
InEchos
OutEchoReps
OutEchos
Timestamps/Mask Requests
InAddrMaskReps
InAddrMasks
OutAddrMaskReps
OutAddrMasks
InTimestampReps
InTimestamps
OutTimestampReps
OutTimestamps
Memory Details
MEMORY USAGE
The Node Temperature Details dashboard exposes hardware monitoring and sensor data obtained through
the sysfs virtual file system of the node.
Hardware monitoring devices attached to the CPU and/or other chips on the motherboard let you monitor
the hardware health of a system. Most modern systems include several such devices. The actual list can
include temperature sensors, voltage sensors, fan speed sensors, and various additional features, such as
the ability to control the rotation speed of the fans.
CHIPS TEMPERATURES
Presents data taken from the temperature sensors connected to other system controllers
Describes the pulse width modulation of the PWM-equipped fans. PWM operates like a switch that constantly
cycles on and off, thereby regulating the amount of power the fan receives: 100% makes it rotate at full
speed, while lower percentages slow rotation down proportionally.
Nodes Compare
This dashboard lets you compare a wide range of parameters. Parameters of the same type are shown side
by side for all servers, grouped into the following sections:
• System Information
• CPU
• Memory
• Disk Partitions
• Disk Performance
• Network
The System Information section shows the System Info summary of each server, as well as System Uptime,
CPU Cores, RAM, Saturation Metrics, and Load Average gauges.
The CPU section offers the CPU Usage, Interrupts, and Context Switches metrics.
In the Memory section, you can find the Memory Usage, Swap Usage, and Swap Activity metrics.
The Disk Partitions section encapsulates two metrics, Mountpoint Usage and Free Space.
The Disk Performance section contains the I/O Activity, Disk Operations, Disk Bandwidth, Disk IO Utilization,
Disk Latency, and Disk Load metrics.
Finally, the Network section shows the Network Traffic and Network Utilization Hourly metrics.
Nodes Overview
The Nodes Overview dashboard provides details about the efficiency of work of the following components.
Each component is represented as a section in the dashboard.
• CPU
• Memory
• Disk
• Network
The CPU section offers the CPU Usage, CPU Saturation and Max Core Usage, Interrupts and Context
Switches, and Processes metrics.
In the Memory section, you can find the Memory Utilization, Virtual Memory Utilization, Swap Space, and
Swap Activity metrics.
The Disk section contains the I/O Activity, Global File Descriptors Usage, Disk IO Latency, and Disk IO Load
metrics.
In the Network section, you can find the Network Traffic, Network Utilization Hourly, Local Network Errors,
and TCP Retransmission metrics.
Node Summary
SYSTEM SUMMARY
CPU USAGE
The CPU time is measured in clock ticks or seconds. It is useful to measure CPU time as a percentage of the
CPU’s capacity, which is called the CPU usage.
When a system is running with maximum CPU utilization, the transmitting and receiving threads must all
share the available CPU. This will cause data to be queued more frequently to cope with the lack of CPU.
CPU Saturation may be measured as the length of a wait queue, or the time spent waiting on the queue.
Interrupt is an input signal to the processor indicating an event that needs immediate attention. An interrupt
signal alerts the processor and serves as a request for the processor to interrupt the currently executing
code, so that the event can be processed in a timely manner.
Context switch is the process of storing the state of a process or thread, so that it can be restored and
resume execution at a later point. This allows multiple processes to share a single CPU, and is an essential
feature of a multitasking operating system.
PROCESSES
MEMORY UTILIZATION
SWAP SPACE
SWAP ACTIVITY
Swap Activity is memory management that involves swapping sections of memory to and from physical
storage.
I/O ACTIVITY
Disk I/O includes read or write or input/output operations involving a physical disk. It is the speed with
which the data transfer takes place between the hard disk drive and RAM.
DISK IO LATENCY
Shows the average latency for reads and writes per I/O device. Higher than typical latency for highly loaded
storage indicates saturation (overload) and is a frequent cause of performance problems. Higher than normal
latency can also indicate internal storage problems.
DISK IO LOAD
Shows disk load for reads and writes as the average number of outstanding requests over a given period of
time. High disk load is a good measure of actual storage utilization. Different storage types handle load
differently: some show latency increases under low load, while others can handle higher load with no
problems.
NETWORK TRAFFIC
Network traffic refers to the amount of data moving across a network at a given point in time.
Total number of local network interface transmit errors, receive errors and drops. This should be zero.
TCP RETRANSMISSION
Retransmission, essentially identical with Automatic repeat request (ARQ), is the resending of packets which
have been either damaged or lost. Retransmission is one of the basic mechanisms used by protocols
operating over a packet switched computer network to provide reliable communication (such as that
provided by a reliable byte stream, for example TCP).
NUMA Details
For each node, this dashboard shows metrics related to Non-uniform memory access (NUMA).
MEMORY USAGE
Shows the free memory as the ratio to the total available memory.
Dirty
Bounce
Mapped
KernelStack
Memory missed
Memory allocated on a node despite the process preferring some different node.
Memory foreign
Memory intended for a node, but actually allocated on some different node.
ANONYMOUS MEMORY
Active
Anonymous memory that has been used more recently and usually not swapped out.
Inactive
Anonymous memory that has not been used recently and can be swapped out.
Active(file)
Pagecache memory that has been used more recently and usually not reclaimed until needed.
Inactive(file)
Pagecache memory that can be reclaimed without huge performance impact.
SHARED MEMORY
Shmem
Total used shared memory (shared between several processes, thus including RAM disks, SysV IPC and
BSD-like SHMEM).
HUGEPAGES STATISTICS
Total
Free
Surp
The number of hugepages in the pool above the value in vm.nr_hugepages . The maximum number of
surplus hugepages is controlled by vm.nr_overcommit_hugepages .
LOCAL PROCESSES
REMOTE PROCESSES
Memory allocated on a node while a process was running on some other node.
SLAB MEMORY
Slab
Slab allocation is a memory management mechanism intended for the efficient memory allocation of kernel
objects.
SReclaimable
SUnreclaim
The part of the Slab that can’t be reclaimed under memory pressure
Processes Details
The Processes Details dashboard displays Linux process information - PIDs, Threads, and Processes. The
dashboard shows how many processes/threads are either in the kernel run queue (runnable state) or in the
blocked queue (waiting for I/O). When the number of processes in the runnable state is constantly higher
than the number of CPU cores available, the load is CPU bound. When the number of processes blocked
waiting for I/O is large, the load is disk bound. The running average of the sum of these two quantities is
the basis of the loadavg metric.
The dashboard consists of two parts: the first section describes metrics for all hosts, and the second part
provides charts for each host.
Charts for all hosts, available in the first section, are the following ones:
• States of Processes
• Number of PIDs
• Number of Threads
• Runnable Processes
• Sleeping Processes
• Running Processes
• Stopped Processes
• Zombie Processes
• Dead Processes
The following charts are present in the second part, available for each host:
• Processes
• States of Processes
• Number of PIDs
• Number of Threads
NUMBER OF PIDS
NUMBER OF THREADS
RUNNABLE PROCESSES
Processes
The Processes graph shows how many processes/threads are either in the kernel run queue (runnable
state) or in the blocked queue (waiting for I/O). When the number of processes in the runnable state is
constantly higher than the number of CPU cores available, the load is CPU bound. When the number of
processes blocked waiting for I/O is large, the load is disk bound. The running average of the sum of these
two quantities is the basis of the loadavg metric.
SLEEPING PROCESSES
RUNNING PROCESSES
STOPPED PROCESSES
ZOMBIE PROCESSES
DEAD PROCESSES
The Prometheus Exporter Status dashboard reports the consumption of resources by the Prometheus
exporters used by PMM. For each exporter, this dashboard reveals the following information:
• CPU usage
• Memory usage
• Exporter uptime
This section provides a summary of how exporters are used across the selected hosts. It includes the
average usage of CPU and memory as well as the number of hosts being monitored and the total number
of running exporters.
Shows the average CPU usage in percent per host for all exporters.
Monitored Hosts
Exporters Running
Shows the total number of Exporters running with this PMM Server instance.
The CPU usage and memory usage do not include the additional CPU and memory usage required to
produce metrics by the application or operating system.
This section shows how resources, such as CPU and memory, are being used by the exporters for the
selected hosts.
CPU Usage
Plots the Exporters’ CPU usage across each monitored host (by default, All hosts).
Memory Usage
Plots the Exporters’ Memory usage across each monitored host (by default, All hosts).
This section shows how resources, such as CPU and memory, are being used by the exporters for host
types: MySQL, MongoDB, ProxySQL, and the system.
Shows the Exporters’ CPU Cores used for each type of Exporter.
Memory Usage
LIST OF HOSTS
At the bottom, this dashboard shows details for each running host.
CPU Used
Mem Used
Exporters Running
RAM
Virtual CPUs
You can click the value of the CPU Used, Memory Used, or Exporters Running columns to open the
Prometheus Exporter Status dashboard for further analysis.
Percona blog: Understand Your Prometheus Exporters with Percona Monitoring and Management (PMM)
This graph shows the number of commits which the Amazon Aurora engine performed, as well as the average
commit latency. The latency does not always correlate with the number of performed commits and can
be quite high in certain situations.
• Number of Amazon Aurora Commits: The average number of commit operations per second.
• Amazon Aurora Commit avg Latency: The average amount of latency for commit operations
This graph shows which statements contribute the most load on the system, as well as what load corresponds
to Amazon Aurora transaction commits.
• Write Transaction Commit Load: Load in Average Active Sessions per second for COMMIT operations
• UPDATE load: Load in Average Active Sessions per second for UPDATE queries
• SELECT load: Load in Average Active Sessions per second for SELECT queries
• DELETE load: Load in Average Active Sessions per second for DELETE queries
• INSERT load: Load in Average Active Sessions per second for INSERT queries
An active session is a connection that has submitted work to the database engine and is waiting for a
response from it. For example, if you submit an SQL query to the database engine, the database session is
active while the database engine is processing that query.
This graph shows how much memory is used by the Amazon Aurora lock manager, as well as the amount of
memory used by Amazon Aurora to store the Data Dictionary.
• Aurora Lock Manager Memory: the amount of memory used by the Lock Manager, the module
responsible for handling row lock requests for concurrent transactions.
• Aurora Dictionary Memory: the amount of memory used by the Dictionary, the space that contains
metadata used to keep track of database objects, such as tables and indexes.
This graph shows the average latency for the most important types of statements. Latency spikes are often
indicative of instance overload.
Amazon Aurora MySQL allows a number of commands which are not available in standard MySQL. This
graph shows the usage of such commands. Regular unit_test calls can be seen in a default Amazon Aurora
installation; the rest depend on your workload.
• show_volume_status : The number of executions per second of the command SHOW VOLUME STATUS.
The SHOW VOLUME STATUS query returns two server status variables, Disks and Nodes. These
variables represent the total number of logical blocks of data and storage nodes, respectively, for the
DB cluster volume.
• awslambda : The number of AWS Lambda calls per second. AWS Lambda is an event-driven, serverless
computing platform provided by AWS. It is a compute service that runs code in response to events.
You can run any kind of code from Aurora by invoking Lambda from a stored procedure or a trigger.
• alter_system : The number of executions per second of ALTER SYSTEM, a special query used to
simulate an instance crash, a disk failure, disk congestion or a replica failure. It is a useful query for
testing the system.
This graph shows different kinds of internal Amazon Aurora MySQL problems, which generally should be zero
in normal operation.
This dashboard shows server status variables. On this dashboard, you may select multiple servers and
compare their counters simultaneously.
Server status variables appear in two sections: Commands and Handlers. Choose one or more variables in
the Command and Handler fields in the top menu to select the variables which will appear in the
COMMANDS or HANDLERS section for each host. Your comparison may include from one up to three hosts.
By default or if no item is selected in the menu, PMM displays each command or handler respectively.
The level of zlib compression to use for InnoDB compressed tables and indexes.
The maximum percentage that can be reserved as free space within each compressed page, allowing
room to reorganize the data and modification log within the page when a compressed table or index is
updated and the data might be recompressed.
Specifies whether images of re-compressed pages are written to the redo log. Re-compression may occur
when changes are made to compressed data.
Compress Attempts
Number of compression operations attempted. Pages are compressed whenever an empty page is created
or the space for the uncompressed modification log runs out.
Uncompressed Attempts
Number of uncompression operations performed. Compressed InnoDB pages are uncompressed whenever
compression fails, or the first time a compressed page is accessed in the buffer pool and the
uncompressed page does not exist.
Shows the total amount of used compressed pages in the InnoDB Buffer Pool, split by page size.
Shows the total amount of free compressed pages in the InnoDB Buffer Pool, split by page size.
INNODB ACTIVITY
Writes (Rows)
Writes (Transactions)
Rows written per transaction, for transactions which modify rows. This is a better indicator of transaction
write size than averaging over all transactions, including those which did not do any writes.
Rollbacks
Number of Buffer Pool requests per row access. High numbers here indicate going through long undo
chains, deep trees and other inefficient data access. It can be less than one due to several rows being read
from a single page.
This graph allows you to see which operations occur and the number of rows affected per operation. A
graph like Queries Per Second will give you an idea of the number of queries, but one query could affect
millions of rows.
The InnoDB Transactions Information graph shows details about recent transactions. Transaction IDs
Assigned represents the total number of transactions initiated by InnoDB. RW Transaction Commits is the
number of transactions that are not read-only. Insert-Update Transactions Commits are transactions on the
Undo entries. Non-Locking RO Transaction Commits are transactions committed from SELECT statements in
auto-commit mode or transactions explicitly started with START TRANSACTION READ ONLY.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
InnoDB Tables
Index Size Per Row shows how much space is used for indexes on a per-row basis.
Space Allocated
Total amount of space allocated. It may not exactly match the amount of space used on the file system but
provides good guidance.
Space Used
Space used in all InnoDB tables, reported as allocated space less free space.
Data Length
Index Length
Estimated Rows
Estimated number of rows in the InnoDB storage engine. It is not an exact value and can change abruptly as
information is updated.
Indexing Overhead
Free
How much space is free. Too high a value wastes space on disk.
If enabled, by default every table will have its own tablespace, represented as its own .ibd file, rather than
all tables being stored in the single system tablespace.
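Indexing Overhead is essentially the ratio of index size to data size per table. A hedged sketch of deriving such ratios from data_length / index_length / data_free byte counts, as reported per table by information_schema (the figures are made up):

```python
def table_space_stats(data_length, index_length, data_free):
    """Derive per-table space ratios from raw byte counts."""
    allocated = data_length + index_length
    return {
        # How much index space is carried per byte of data.
        "indexing_overhead": index_length / data_length if data_length else 0.0,
        # Share of the allocated space that is currently free.
        "free_percent": 100.0 * data_free / allocated if allocated else 0.0,
    }

# Hypothetical table: 1 GiB of data, 512 MiB of indexes, 100 MiB free.
stats = table_space_stats(1 << 30, 512 << 20, 100 << 20)
print(stats["indexing_overhead"])   # 0.5: indexes are half the data size
```

A high indexing overhead suggests redundant or overly wide indexes; a high free percentage suggests space that could be reclaimed.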
INNODB DISK IO
Due to differences in the timing of row writes and data writes, the value may be misleading over short intervals.
InnoDB I/O
• Data Reads - The total number of InnoDB data reads (OS file reads).
• Log Writes - The number of physical writes to the InnoDB redo log file.
• Data Fsyncs - The number of fsync() operations. The frequency of fsync() calls is influenced by the
setting of the innodb_flush_method configuration option.
InnoDB Log IO
InnoDB FSyncs
InnoDB Pending IO
When growing the InnoDB system tablespace, extend it by this amount at a time.
Whether the InnoDB Doublewrite Buffer is enabled. Enabling it doubles the amount of writes InnoDB has to
do to storage, but it is required to avoid potential data corruption during a crash on most storage subsystems.
Fast shutdown means InnoDB will not perform complete undo space and change buffer cleanup on
shutdown, which is faster but may interfere with certain major upgrade operations.
INNODB IO OBJECTS
InnoDB maintains a storage area called the buffer pool for caching data and indexes in memory. Knowing
how the InnoDB buffer pool works, and taking advantage of it to keep frequently accessed data in memory,
is one of the most important aspects of MySQL tuning. The goal is to keep the working set in memory. In
most cases, this should be between 60%-90% of available memory on a dedicated database host, but
depends on many factors.
NUMA Interleave
Interleave Buffer Pool between NUMA zones to better support NUMA systems.
BP Data
BP Data Dirty
BP Miss Ratio
How often buffer pool read requests have to read from disk. Keep this percentage low for good
performance.
BP Write Buffering
Size of the “chunk” for buffer pool allocation; the buffer pool size is rounded to a multiple of this value. It
also affects the performance impact of online buffer pool resizing.
Number of buffer pool instances. Higher values help reduce contention but also increase overhead.
A larger portion increases dump/load time but captures more of the original buffer pool content and hence
may reduce warm-up time.
Whether to include the buffer pool in crash core dumps. Doing so may dramatically increase core dump file
size and slow down restart. It only makes a difference if core dumping on crash is enabled.
Percentage of the buffer pool to be reserved for “old blocks”, which have been touched repeatedly over a
period of time.
The time which has to pass between multiple touches of a block for it to qualify as an old block.
This variable defines the InnoDB free page target per buffer pool instance. When the number of free pages
falls below this target, the page cleaner frees the required number of pages, flushing or evicting pages from
the tail of the LRU list as needed.
When a page is read (or created), it needs to be allocated in the buffer pool.
The most efficient way to get a clean page is to grab one from the free list. However, if no pages are
available in the free list, an LRU scan needs to be performed.
If the free list was empty, the LRU Get Free Loop is performed. It may perform an LRU scan or may use
other heuristics and shortcuts to get a free page.
LRU Scans
If a page could not be found in the free list and other shortcuts did not work, a free page is obtained by
scanning the LRU chain, which is not efficient.
Pages scanned per second while doing LRU scans. If this value is large (thousands), a lot of resources are
being wasted.
Average number of pages scanned per LRU scan. A large number of scans can consume a lot of resources
and also introduce significant additional latency to queries.
How often InnoDB could not find a free page in the LRU list and had to sleep. This should be zero.
Number of pages flushed from the “Flush List”. This combines pages flushed through adaptive flush and
background flush.
The InnoDB flush cycle typically runs at 1-second intervals. If it is too far off from this interval, it can
indicate an issue.
How many pages are flushed per batch. Large batches can “choke” the I/O subsystem and starve other I/O
which needs to happen.
Neighbor flushing is optimized for rotational media; unless you are running spinning disks, you should
disable it.
The maximum checkpoint age is determined by the total length of all transaction log files
( innodb_log_file_size ).
When the checkpoint age reaches the maximum checkpoint age, blocks are flushed synchronously. The
rule of thumb is to keep one hour of traffic in those logs and let the checkpointing perform its work as
smoothly as possible. If you don’t do this, InnoDB will do synchronous flushing at the worst possible time,
i.e., when you are busiest.
Adaptive flush flushes pages from the flush list based on the need to advance the checkpoint (driven by the
redo generation rate) and to maintain the number of dirty pages within the set limit.
Neighbor Flushing
To optimize I/O for rotational media, InnoDB may flush neighbor pages. This can cause significant wasted
I/O for flash storage. Generally, for flash you should run with innodb_flush_neighbors=0 ; otherwise this
shows how much I/O you are wasting.
Flushing from the tail of the LRU list is needed to keep readily-available free pages for new data to be read
when data does not fit in the buffer pool.
Number of Neighbor pages flushed (If neighbor flushing is enabled) from Flush List and LRU List Combined.
How often InnoDB could not keep up with checkpoint flushing and had to trigger a sync flush. This should
never happen.
Pages flushed by the background flush, which is activated when the server is considered idle.
Rate at which LSN (Redo) is Created. It may not match how much data is written to log files due to block
size rounding.
This corresponds to the number of clean pages which were evicted (made free) from the tail of the LRU buffer.
Single page flushes happen in the rare case when a clean page could not be found in the LRU list. This
should be zero for most workloads.
InnoDB IO Capacity
Estimated number of IOPS the storage system can provide. It is used to scale background activities. Do not
set it to the actual storage capacity.
InnoDB I/O capacity to use when falling behind and needing to catch up with flushing.
INNODB LOGGING
The size of buffer InnoDB uses for buffering writes to log files.
At Transaction Commit
What to do with the log at transaction commit: do nothing and wait for the timeout to flush the data from
the log buffer, flush it to the OS cache but do not fsync, or flush and sync it to disk.
This variable can be seen as the minimum I/O alignment InnoDB will use for the redo log file. High values
cause waste; low values can make I/O less efficient.
How much writes to the log are amplified compared to how much redo is generated.
Amount of Redo Generated Per Write Transaction. This is a good indicator of transaction size.
Along with the buffer pool size, innodb_log_file_size is the most important setting when we are working
with InnoDB. This graph shows how much data was written to InnoDB’s redo logs over each hour. When
the InnoDB log files are full, InnoDB needs to flush the modified pages from memory to disk.
The rule of thumb is to keep one hour of traffic in those logs and let the checkpointing perform its work
as smoothly as possible. If you don’t do this, InnoDB will do synchronous flushing at the worst possible time,
i.e., when you are busiest.
This graph can help guide you in setting the correct innodb_log_file_size .
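The one-hour rule of thumb reduces to simple arithmetic: measure how many bytes of redo are written per hour and divide that across the log files. A sketch with a made-up measured rate:

```python
def suggest_log_file_size(redo_bytes_per_hour, n_log_files=2):
    """Rule of thumb: total redo log capacity ~ one hour of redo writes,
    split evenly across the configured number of log files."""
    return redo_bytes_per_hour / n_log_files

# Hypothetical measurement: 4 GiB of redo generated per hour, 2 log files.
size = suggest_log_file_size(4 << 30, n_log_files=2)
print(f"innodb_log_file_size ~ {size / (1 << 30):.0f} GiB per file")
```

This is only a starting point; very large logs lengthen crash recovery, so the hourly-traffic estimate should be weighed against recovery-time requirements.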
Log Bandwidth
Rate at which LSN (Redo) is Created. It may not match how much data is written to log files due to block
size rounding.
The InnoDB Group Commit Batch Size graph shows how many bytes were written to the InnoDB log files per
attempt to write. If many threads are committing at the same time, one of them will write the log entries of
all the waiting threads and flush the file. This process reduces the number of disk operations needed and
enlarges the batch size.
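A minimal sketch of how this batch size can be derived from two samples of the Innodb_os_log_written (bytes) and Innodb_log_writes (write calls) status counters; the sample deltas below are made up:

```python
def group_commit_batch_size(os_log_written_delta: int,
                            log_writes_delta: int) -> float:
    """Average bytes written to the redo log per write attempt.

    Deltas are the differences of the Innodb_os_log_written and
    Innodb_log_writes counters (SHOW GLOBAL STATUS) between samples.
    """
    if log_writes_delta == 0:
        return 0.0
    return os_log_written_delta / log_writes_delta

# Example: 10 MiB of log written across 2,000 write attempts
print(group_commit_batch_size(10 * 1024**2, 2000))  # 5242.88 bytes/write
```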
INNODB LOCKING
Defines how much locking will come from working with auto-increment columns.
Rollback on Timeout
Percent of active sessions which are blocked due to waiting on InnoDB row locks.
Rows written per transaction, counted only for transactions which modify rows. This is a better indicator
of transaction write size than averaging across all transactions, including those which did not do any
writes.
Rollbacks
Average number of sessions blocked from proceeding due to waiting on a row-level lock.
Current Locks
Undo Tablespaces
Purge Threads
Maximum number of unpurged transactions; if this number is exceeded, a delay will be introduced to
incoming DML statements.
The delay injected because the purge thread(s) are unable to keep up with the purge progress.
Rollback Segments
The InnoDB Purge Performance graph shows metrics about the page purging process. The purge process
removes the undo entries from the history list, cleans up the pages of the old versions of modified rows,
and effectively removes deleted rows.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
The InnoDB Undo Space Usage graph shows the amount of space used by the Undo segment. If the amount
of space grows too much, look for long running transactions holding read views opened in the InnoDB
status.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
Transaction History
How Many Undo Operations Are Handled Per Each Undo Log Page.
Purge Invoked
The InnoDB Page Splits graph shows the InnoDB page maintenance activity related to splitting and merging
pages. When an InnoDB page, other than the top most leaf page, has too much data to accept a row
update or a row insert, it has to be split in two. Similarly, if an InnoDB page, after a row update or delete
operation, ends up being less than half full, an attempt is made to merge the page with a neighbor page. If
the resulting page size is larger than the InnoDB page size, the operation fails. If your workload causes a
large number of page splits, try lowering the innodb_fill_factor variable (5.7+).
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
The InnoDB Page Reorgs graph shows information about the page reorganization operations. When a page
receives an update or an insert that affect the offset of other rows in the page, a reorganization is needed.
If the reorganization process finds out there is not enough room in the page, the page will be split. Page
reorganization can only fail for compressed pages.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
The portion of the page to fill when doing a sorted index build. Lowering this value will worsen space
utilization but will reduce the need to split pages when new data is inserted into the index.
The Adaptive Hash Index helps to optimize index look-ups but can be a severe hotspot for some workloads.
How many partitions are used for the Adaptive Hash Index (to reduce contention).
Number of Rows “Hashed” Per Each Page which needs to be added to AHI.
AHI ROI
How Many Successful Searches using AHI are performed per each row maintenance operation.
The InnoDB AHI Usage graph shows the search operations on the InnoDB adaptive hash index and its
efficiency. The adaptive hash index is a search hash designed to speed access to InnoDB pages in memory.
If the hit ratio is small, the working data set is larger than the buffer pool and the AHI should likely be
disabled.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
The Maximum Size of Change Buffer (as Percent of Buffer Pool Size).
INNODB CONTENTION
If enabled, limits the number of threads allowed inside the InnoDB kernel at the same time.
If enabled, limits the number of threads allowed inside the InnoDB kernel at the same time during the
commit stage.
The time the thread will sleep before re-entering the InnoDB kernel under high contention.
If set to a non-zero value, the InnoDB thread sleep delay will be adjusted automatically depending on the
load, up to the value specified by this variable.
Number of low-level operations InnoDB can do after it enters the InnoDB kernel before it is forced to exit
and yield to another waiting thread.
The InnoDB Contention - OS Waits graph shows the number of times an OS wait operation was required
while waiting to get the lock. This happens once the spin rounds are exhausted.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
The InnoDB Contention - Spin Rounds graph shows the number of spin rounds executed to get a lock. A
spin round is a fast retry to get the lock in a loop.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
INNODB MISC
The InnoDB Main Thread Utilization graph shows the portion of time the InnoDB main thread spent on
various tasks.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
InnoDB Activity
The InnoDB Activity graph shows a measure of the activity of the InnoDB threads.
If you do not see any metric, try running: SET GLOBAL innodb_monitor_enable=all; in the MySQL client.
InnoDB is automatically optimized for a dedicated server environment (auto-scaling the cache and some
other variables).
This buffer is used for building InnoDB indexes using a sort algorithm.
Refresh InnoDB statistics when metadata is queried by SHOW TABLE STATUS or INFORMATION_SCHEMA queries. If
enabled, this can cause severe performance issues.
Index Condition Pushdown (ICP) is an optimization for the case where MySQL retrieves rows from a table
using an index. Without ICP, the storage engine traverses the index to locate rows in the base table and
returns them to the MySQL server which evaluates the WHERE condition for the rows. With ICP enabled, and
if parts of the WHERE condition can be evaluated by using only columns from the index, the MySQL server
pushes this part of the WHERE condition down to the storage engine. The storage engine then evaluates the
pushed index condition by using the index entry and only if this is satisfied is the row read from the table.
ICP can reduce the number of times the storage engine must access the base table and the number of
times the MySQL server must access the storage engine.
InnoDB Defragmentation
The InnoDB Defragmentation graph shows the status information related to the InnoDB online
defragmentation feature of MariaDB for the optimize table command. To enable this feature, the variable
innodb-defragment must be set to 1 in the configuration file.
The InnoDB Online DDL graph shows the state of the online DDL (alter table) operations in InnoDB. The
progress metric is an estimate of the percentage of the rows processed by the online DDL.
MYSQL SUMMARY
MySQL Uptime
The amount of time since the last restart of the MySQL server process.
Current QPS
Based on the queries reported by MySQL’s SHOW STATUS command, it is the number of statements executed
by the server within the last second. This variable includes statements executed within stored programs,
unlike the Questions variable. It does not count COM_PING or COM_STATISTICS commands.
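As a sketch, the panel's value can be reproduced from two samples of the Queries counter taken some interval apart (the counter values below are invented):

```python
def qps(queries_prev: int, queries_now: int, interval_s: float) -> float:
    """Queries per second, from two samples of the Queries counter
    reported by SHOW GLOBAL STATUS."""
    return (queries_now - queries_prev) / interval_s

# Example: counter sampled 10 seconds apart
print(qps(1_000_000, 1_004_500, 10.0))  # 450.0
```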
MySQL Connections
Max Connections
Max Connections is the maximum permitted number of simultaneous client connections. By default, this is
151. Increasing this value increases the number of file descriptors that mysqld requires. If the required
number of descriptors is not available, the server reduces the value of Max Connections.
mysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by
accounts that have the SUPER privilege, such as root.
Max Used Connections is the maximum number of connections that have been in use simultaneously since
the server started.
Connections is the number of connection attempts (successful or not) to the MySQL server.
Threads Connected is the number of open connections, while Threads Running is the number of threads not
sleeping.
MySQL Handlers
Handler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows,
tables, and indexes.
This is in fact the layer between the Storage Engine and MySQL.
• read_rnd_next is incremented when the server performs a full table scan and this is a counter you
don’t really want to see with a high value.
• read_next is incremented when the storage engine is asked to ‘read the next index entry’. A high value
means a lot of index scans are being done.
The Com_xxx statement counter variables indicate the number of times each xxx statement has been executed.
There is one status variable for each type of statement. For example, Com_delete and Com_update count
DELETE and UPDATE statements, respectively. Com_delete_multi and Com_update_multi are similar but
apply to DELETE and UPDATE statements that use multiple-table syntax.
Here we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from
MySQL and Inbound is network traffic MySQL has received.
NODE SUMMARY
System Uptime
The parameter shows how long a system has been up and running without a shut down or restart.
Load Average
The system load is a measurement of the computational work the system is performing. Each running
process either using or waiting for CPU resources adds 1 to the load.
RAM
RAM (Random Access Memory) is the hardware in a computing device where the operating system,
application programs and data in current use are kept so they can be quickly reached by the device’s
processor.
Memory Available
On modern Linux kernels, the amount of memory available for applications is not the same as
Free + Cached + Buffers.
Virtual Memory
RAM + SWAP
Disk Space
CPU Usage
The CPU time is measured in clock ticks or seconds. It is useful to measure CPU time as a percentage of the
CPU’s capacity, which is called the CPU usage.
When a system is running with maximum CPU utilization, the transmitting and receiving threads must all
share the available CPU. This will cause data to be queued more frequently to cope with the lack of CPU.
CPU Saturation may be measured as the length of a wait queue, or the time spent waiting on the queue.
Disk I/O includes read or write or input/output operations involving a physical disk. It is the speed with
which the data transfer takes place between the hard disk drive and RAM.
Swap Activity is memory management that involves swapping sections of memory to and from physical
storage.
Network Traffic
Network traffic refers to the amount of data moving across a network at a given point in time.
The Key Read Ratio ( Key_reads / Key_read_requests ) ratio should normally be less than 0.01.
The Key Write Ratio ( Key_writes / Key_write_requests ) ratio is usually near 1 if you are using mostly
updates and deletes, but might be much smaller if you tend to do updates that affect many rows at the
same time or if you are using the DELAY_KEY_WRITE table option.
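A minimal sketch of computing both ratios from SHOW GLOBAL STATUS counters (the counter values below are hypothetical):

```python
def key_read_ratio(key_reads: int, key_read_requests: int) -> float:
    """Key_reads / Key_read_requests; normally should be < 0.01."""
    return key_reads / key_read_requests if key_read_requests else 0.0

def key_write_ratio(key_writes: int, key_write_requests: int) -> float:
    """Key_writes / Key_write_requests; near 1 for update/delete-heavy
    workloads, smaller with multi-row updates or DELAY_KEY_WRITE."""
    return key_writes / key_write_requests if key_write_requests else 0.0

# Example: 500 disk reads out of 100,000 index read requests
print(key_read_ratio(500, 100_000))  # 0.005 -- below the 0.01 guideline
```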
This graph is similar to InnoDB buffer pool reads/writes. aria-pagecache-buffer-size is the main cache for
the Aria storage engine. If you see high reads/writes (physical IO), i.e. reads are close to read requests
and/or writes are close to write requests you may need to increase the aria-pagecache-buffer-size (may
need to decrease other buffers: key_buffer_size , innodb_buffer_pool_size , etc.)
This is similar to InnoDB log file syncs. If you see lots of log syncs and want to relax the durability settings
you can change aria_checkpoint_interval (in seconds) from 30 (default) to a higher number. It is good to
look at the disk IO dashboard as well.
This graph shows the utilization for the Aria pagecache. This is similar to InnoDB buffer pool graph. If you
see all blocks are used you may consider increasing aria-pagecache-buffer-size (may need to decrease
other buffers: key_buffer_size , innodb_buffer_pool_size , etc.)
The MyRocks storage engine, developed by Facebook and based on the RocksDB storage engine, is applicable
to systems which primarily interact with the database by writing data to it rather than reading from it.
RocksDB also features a good level of compression, higher than that of the InnoDB storage engine, which
makes it especially valuable when optimizing the usage of hard drives.
PMM collects statistics on the MyRocks storage engine for MySQL in the Metrics Monitor. The information for
this dashboard comes from the Information Schema tables.
Metrics
• MyRocks cache
• MyRocks memtable
• MyRocks R/W
• MyRocks WAL
• RocksDB stalls
• RocksDB stops/slowdowns
MYSQL CONNECTIONS
Max Connections
Max Connections is the maximum permitted number of simultaneous client connections. By default, this is
151. Increasing this value increases the number of file descriptors that mysqld requires. If the required
number of descriptors is not available, the server reduces the value of Max Connections.
mysqld actually permits Max Connections + 1 clients to connect. The extra connection is reserved for use by
accounts that have the SUPER privilege, such as root.
Max Used Connections is the maximum number of connections that have been in use simultaneously since
the server started.
Connections is the number of connection attempts (successful or not) to the MySQL server.
Aborted Connections
When a given host connects to MySQL and the connection is interrupted in the middle (for example due to
bad credentials), MySQL keeps that info in a system table (since 5.6 this table is exposed in
performance_schema ).
If the amount of failed requests without a successful connection reaches the value of max_connect_errors ,
mysqld assumes that something is wrong and blocks the host from further connection.
To allow connections from that host again, you need to issue the FLUSH HOSTS statement.
Threads Connected is the number of open connections, while Threads Running is the number of threads not
sleeping.
The thread_cache_size variable sets how many threads the server should cache to reuse. When a client
disconnects, the client’s threads are put in the cache if the cache is not full. It is auto-sized in MySQL 5.6.8
and above (capped to 100). Requests for threads are satisfied by reusing threads taken from the cache if
possible, and only when the cache is empty is a new thread created.
Slow queries are defined as queries being slower than the long_query_time setting. For example, if you
have long_query_time set to 3, all queries that take longer than 3 seconds to complete will show on this
graph.
As with most relational databases, selecting based on indexes is more efficient than scanning an entire
table’s data. Here we see the counters for selects not done with indexes.
• Select Scan is how many queries caused full table scans, in which all the data in the table had to be
read and either discarded or returned.
• Select Range is how many queries used a range scan, which means MySQL scanned all rows in a
given range.
• Select Full Join is the number of joins that are not joined on an index; this is usually a huge
performance hit.
MYSQL SORTS
MySQL Sorts
Due to a query’s structure, order, or other requirements, MySQL sorts the rows before returning them. For
example, if a table is ordered 1 to 10 but you want the results reversed, MySQL then has to sort the rows to
return 10 to 1.
This graph also shows when sorts had to scan a whole table or a given range of a table to return the
results and which could not have been sorted via an index.
Table Locks
MySQL takes a number of different locks for varying reasons. In this graph we see how many Table level
locks MySQL has requested from the storage engine. In the case of InnoDB, many times the locks could
actually be row locks as it only takes table level locks in a few specific cases.
It is most useful to compare Locks Immediate and Locks Waited. If Locks waited is rising, it means you
have lock contention. Otherwise, Locks Immediate rising and falling is normal activity.
MYSQL QUESTIONS
MySQL Questions
The number of statements executed by the server. This includes only statements sent to the server by clients
and not statements executed within stored programs, unlike the Queries variable used in the QPS calculation.
It also does not count the following commands:
• COM_PING
• COM_STATISTICS
• COM_STMT_PREPARE
• COM_STMT_CLOSE
• COM_STMT_RESET
Here we can see how much network traffic is generated by MySQL. Outbound is network traffic sent from
MySQL and Inbound is network traffic MySQL has received.
Here we can see how much network traffic is generated by MySQL per hour. You can use the bar graph to
compare data sent by MySQL and data received by MySQL.
InnoDB Buffer Pool Data: InnoDB maintains a storage area called the buffer pool for caching data and
indexes in memory.
TokuDB Cache Size: Similar in function to the InnoDB Buffer Pool, TokuDB will allocate 50% of the
installed RAM for its own cache.
Key Buffer Size: Index blocks for MyISAM tables are buffered and are shared by all threads.
key_buffer_size is the size of the buffer used for index blocks.
Adaptive Hash Index Size: When InnoDB notices that some index values are being accessed very
frequently, it builds a hash index for them in memory on top of B-Tree indexes.
Query Cache Size: The query cache stores the text of a SELECT statement together with the corresponding
result that was sent to the client. The query cache has huge scalability problems in that only one thread can
do an operation in the query cache at the same time.
InnoDB Dictionary Size: The data dictionary is InnoDB’s internal catalog of tables. InnoDB stores the data
dictionary on disk, and loads entries into memory while the server is running.
InnoDB Log Buffer Size: The MySQL InnoDB log buffer allows transactions to run without having to write
the log to disk before the transactions commit.
The Com_xxx statement counter variables indicate the number of times each xxx statement has been
executed. There is one status variable for each type of statement. For example, Com_delete and Com_update
count DELETE and UPDATE statements, respectively. Com_delete_multi and Com_update_multi are similar
but apply to DELETE and UPDATE statements that use multiple-table syntax.
MYSQL HANDLERS
MySQL Handlers
Handler statistics are internal statistics on how MySQL is selecting, updating, inserting, and modifying rows,
tables, and indexes.
This is in fact the layer between the Storage Engine and MySQL.
• read_rnd_next is incremented when the server performs a full table scan and this is a counter you
don’t really want to see with a high value.
• read_next is incremented when the storage engine is asked to ‘read the next index entry’. A high value
means a lot of index scans are being done.
The query cache has huge scalability problems in that only one thread can do an operation in the query
cache at the same time. This serialization is true not only for SELECTs, but also for INSERT/UPDATE/
DELETE.
This also means that the larger the query_cache_size is set to, the slower those operations become. In
concurrent environments, the MySQL Query Cache quickly becomes a contention point, decreasing
performance. MariaDB and AWS Aurora have done work to try and eliminate the query cache contention in
their flavors of MySQL, while MySQL 8.0 has eliminated the query cache feature.
• query_cache_type=0
• query_cache_size=0
While you can dynamically change these values, to completely remove the contention point you have to
restart the database.
The table_definition_cache and table_open_cache can be left as default as they are auto-sized MySQL 5.6
and above (i.e., do not set them to any value).
No description
No description
This dashboard helps to analyze Performance Schema wait events. It plots the following metrics for the
chosen (one or more) wait events:
The MySQL Performance Schema dashboard helps determine the efficiency of communicating with
Performance Schema. This dashboard contains the following metrics:
The Average Query Response Time graph shows information collected using the Response Time Distribution
plugin sourced from table INFORMATION_SCHEMA.QUERY_RESPONSE_TIME . It computes this value across all
queries by taking the sum of seconds divided by the count of queries.
Query response time counts (operations) are grouped into three buckets:
• 100 ms - 1 s
• 1 s - 10 s
• > 10 s
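A sketch of that computation over the QUERY_RESPONSE_TIME buckets: each row contributes a query count and a total of seconds, and the average is the sum of seconds divided by the count of queries (the bucket rows below are hypothetical):

```python
def avg_response_time(rows) -> float:
    """rows: (count, total_seconds) pairs, one per
    INFORMATION_SCHEMA.QUERY_RESPONSE_TIME bucket."""
    total_count = sum(count for count, _ in rows)
    total_seconds = sum(seconds for _, seconds in rows)
    return total_seconds / total_count if total_count else 0.0

# Example: three buckets (query count, total seconds spent)
buckets = [(9_000, 45.0), (900, 270.0), (100, 500.0)]
print(avg_response_time(buckets))  # 0.0815
```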
Available only in Percona Server for MySQL, provides visibility of the split of READ vs WRITE query response
time.
Available only in Percona Server for MySQL, illustrates READ query response time counts (operations)
grouped into three buckets:
• 100 ms - 1 s
• 1 s - 10 s
• > 10 s
Available only in Percona Server for MySQL, illustrates WRITE query response time counts (operations)
grouped into three buckets:
• 100 ms - 1 s
• 1 s - 10 s
• > 10 s
IO THREAD RUNNING
This metric shows if the IO Thread is running or not. It only applies to a secondary host.
SQL Thread is a process that runs on a secondary host in the replication environment. It reads the events
from the local relay log file and applies them to the secondary server.
Depending on the format of the binary log it can read query statements in plain text and re-execute them or
it can read raw data and apply them to the local host.
Possible values:
• Yes
• No: the thread is not running because it is not launched yet or because an error has occurred connecting
to the primary host
• Connecting
• No value
IO Thread Running is one of the parameters that the command SHOW SLAVE STATUS returns.
This metric shows if the SQL thread is running or not. It only applies to a secondary host.
Possible values:
• Yes: the SQL thread is running and is applying events from the relay log to the local secondary host
• No: the SQL thread is not running because it is not launched yet or because an error occurred while
applying an event to the local secondary host
REPLICATION ERROR NO
This metric shows the number of the last error encountered by the SQL thread which caused replication to
stop.
One of the more common errors is Error: 1022 Duplicate Key Entry. In such a case replication is
attempting to update a row that already exists on the secondary. The SQL Thread will stop replication to
avoid data corruption.
READ ONLY
This metric indicates whether the host is configured to be in Read Only mode or not.
Possible values:
• Yes: the secondary host permits no client updates, except from users who have the SUPER privilege or the
REPLICATION SLAVE privilege. This configuration is typically used for secondary hosts in a replication
environment to prevent a user from inadvertently or deliberately modifying data, causing inconsistencies
and stopping the replication process.
• No
This metric shows the number of seconds the secondary host is delayed in replication applying events
compared to when the primary host applied them, denoted by the Seconds_Behind_Master value, and only
applies to a secondary host.
Since the replication process applies the data modifications on the secondary asynchronously, it could
happen that the secondary replicates events after some time. The main reasons are:
• Network round trip time - high latency links will lead to non-zero replication lag values.
• Single threaded nature of replication channels - primary servers have the advantage of applying
changes in parallel, whereas secondary ones are only able to apply changes in serial, thus limiting their
throughput. In some cases Group Commit can help but is not always applicable.
• High number of changed rows or computationally expensive SQL - depending on the replication
format ( ROW vs STATEMENT ), significant changes to the database through a high volume of modified rows,
or CPU-expensive statements, will all contribute to secondary servers lagging behind the primary.
Generally adding more CPU or Disk resources can alleviate replication lag issues, up to a point.
BINLOG SIZE
This metric shows the overall size of the binary log files, which can exist on both primary and secondary
servers.
The binary log (also known as the binlog) contains events that describe database changes: CREATE TABLE ,
ALTER TABLE , updates, inserts, deletes and other statements or database changes.
The binlog file is read by secondaries via their IO Thread process to replicate database changes, both to
the data and to the table structures. There can be more than one binlog file depending on
the binlog rotation policy (for example using the configuration variables max_binlog_size and
expire_logs_days ) or because of server reboots.
When planning the disk space, account for the overall size of the binlog files and adopt a good rotation
policy, or consider having a separate mount point or disk to store the binlog data.
This metric shows the amount of data written hourly to the binlog files during the last 24 hours. This metric
can give you an idea of how write-heavy your application is (creation, modification, deletion).
BINLOG COUNT
This metric shows the overall count of binary log files, on both primary and secondary servers.
This metric shows the number of binlog files created hourly during the last 24 hours.
This metric shows the overall size of the relay log files. It only applies to a secondary host.
The relay log consists of a set of numbered files containing the events to be executed on the secondary host
to replicate database changes.
There can be multiple relay log files depending on the rotation policy adopted (using the configuration
variable max_relay_log_size ).
As soon as the SQL thread completes executing all events in a relay log file, the file is deleted.
If this metric shows a high value, the max_relay_log_size variable is probably high too. Generally this is
not a serious issue. If the value of this metric is constantly increasing, the secondary is lagging too far
behind in applying the events.
Treat this metric in the same way as the MySQL Replication Delay metric.
This metric shows the amount of data written hourly into relay log files during the last 24 hours.
OVERVIEW
• PRIMARY Service
• Replication Lag
• Replication Delay
• Transport Time
TRANSACTIONS
• Transaction Details
• Applied Transactions
• Sent Transactions
• Checked Transactions
CONFLICTS
• Detected Conflicts
LARGEST TABLES
PIE
The total size of the database: data size plus index size, as well as the freeable space.
TABLE ACTIVITY
The next two graphs are available only for Percona Server and MariaDB, and require the userstat variable
to be turned on.
ROWS READ
The number of rows read from the table, shown for the top 5 tables.
ROWS CHANGED
The number of rows changed in the table, shown for the top 5 tables.
The current value of an auto_increment column from information_schema , shown for the top 10 tables.
This dashboard requires Percona Server for MySQL 5.1+ or MariaDB 10.1/10.2 with XtraDB. Also
userstat should be enabled, for example with the SET GLOBAL userstat=1 statement. See Setting up
MySQL.
The number of times the user’s connections connected to the server using SSL.
The cumulative number of seconds there was activity on connections from the user.
The cumulative CPU time elapsed, in seconds, while servicing connections of the user.
No description
TOTAL CONNECTIONS
A cursor is the MongoDB object returned by a find method execution, used to iterate over the result documents.
MONGOS CURSORS
A cursor is the MongoDB object returned by a find method execution, used to iterate over the result documents.
Ops/sec, classified by legacy wire protocol type ( query , insert , update , delete , getmore ).
Ops/sec, classified by legacy wire protocol type ( query , insert , update , delete , getmore ).
Timespan ‘window’ between oldest and newest ops in the Oplog collection.
COMMAND OPERATIONS
Ops or replicated Ops/sec, classified by legacy wire protocol type ( query , insert , update , delete ,
getmore ). Also shown are document deletes/sec performed by the internal TTL threads for TTL indexes.
LATENCY DETAIL
CONNECTIONS
CURSORS
DOCUMENT OPERATIONS
Docs per second inserted, updated, deleted or returned. (not 1-to-1 with operation counts.)
QUEUED OPERATIONS
QUERY EFFICIENCY
This panel shows the number of objects (both data ( scanned_objects ) and index ( scanned )) as well as the
number of documents that were moved to a new location due to the size of the document growing. Moved
documents only apply to the MMAPv1 storage engine.
Legacy driver operation: the number of getLastError commands executed per second, and the total time
spent executing them, to confirm write concern.
Legacy driver operation: Number of getLastError commands that timed out trying to confirm write
concern.
ASSERT EVENTS
This panel shows the number of assert events per second on average over the given time period. In most
cases assertions are trivial, but you would want to check your log files if this counter spikes or is
consistently high.
PAGE FAULTS
COMMAND OPERATIONS
Shows how many times a command is executed per second on average during the selected interval.
Look for peaks and drops and correlate them with other graphs.
CONNECTIONS
Keep in mind the hard limit on the maximum number of connections set by your distribution.
Anything over 5,000 should be a concern, because the application may not close connections correctly.
CURSORS
Helps identify why connections are increasing. Shows active cursors compared to cursors being
automatically killed after 10 minutes due to an application not closing the connection.
DOCUMENT OPERATIONS
When used in combination with Command Operations, this graph can help identify write amplification. For
example, when one insert or update command actually inserts or updates hundreds, thousands, or even
millions of documents.
QUEUED OPERATIONS
Any number of queued operations for long periods of time is an indication of possible issues. Find the cause
and fix it before requests get stuck in the queue.
This is useful for write-heavy workloads to understand how long it takes to verify writes and how many
concurrent writes are occurring.
ASSERTS
Asserts are not important by themselves, but you can correlate spikes with other graphs.
MEMORY FAULTS
Memory faults indicate that requests are processed from disk either because an index is missing or there is
not enough memory for the data set. Consider increasing memory or sharding out.
CONNECTIONS
No description
CURSORS
No description
LATENCY
SCAN RATIOS
The ratio of index entries scanned, or whole documents scanned, to the number of documents returned.
No description
REQUESTS
DOCUMENT OPERATIONS
QUEUED OPERATIONS
The number of operations that are currently queued and waiting for a lock
USED MEMORY
No description
REPLICATION LAG
MongoDB replication lag occurs when the secondary node cannot replicate data fast enough to keep up with
the rate that data is being written to the primary node. It could be caused by something as simple as
network latency, packet loss within your network, or a routing issue.
Operations are classified by legacy wire protocol type (insert, update, and delete only).
This metric can show a correlation with the replication lag value.
The time span between now and the last heartbeat from replica set members.
ELECTIONS
The count of elections. Usually zero; during an election, each healthy node records a count of 1. Elections happen when the primary role changes, due to either normal maintenance or trouble events.
The time span ('window') between the newest and the oldest operation in the oplog collection.
INMEMORY TRANSACTIONS
INMEMORY CAPACITY
INMEMORY SESSIONS
INMEMORY PAGES
A WiredTiger 'ticket' is assigned for every operation running simultaneously in the WiredTiger storage engine. 'Tickets available' equals a hard-coded upper limit minus 'tickets out'.
QUEUED OPERATIONS
DOCUMENT CHANGES
Mixed metrics: documents per second inserted, updated, deleted, or returned on any type of node (primary or secondary), plus replicated write ops/sec, plus TTL deletes per second.
This panel shows the number of pages that have been evicted from the WiredTiger cache for the given time
period. The InMemory storage engine only evicts modified pages which signals a compaction of the data
and removal of the dirty pages.
This panel shows the number of objects scanned (both data, scanned_objects, and index, scanned), as well as the number of documents that were moved to a new location because the document grew in size. Moved documents apply only to the MMAPv1 storage engine.
PAGE FAULTS
DOCUMENT ACTIVITY
Docs per second inserted, updated, deleted or returned. Also showing replicated write ops and internal TTL
index deletes.
The average time in ms, over the full uptime of the mongod process, that MMAP background flushes have taken.
QUEUED OPERATIONS
The queue size of operations waiting to be submitted to the storage engine layer. (See WiredTiger concurrency tickets for the number of operations being processed simultaneously in the storage engine layer.)
CLIENT OPERATIONS
Ops and replicated ops/sec, classified by legacy wire protocol type (query, insert, update, delete, getmore).
This panel shows the number of objects scanned (both data, scanned_objects, and index, scanned), as well as the number of documents that were moved to a new location because the document grew in size. Moved documents apply only to the MMAPv1 storage engine.
WIREDTIGER TRANSACTIONS
The volume of data transferred per second between the WiredTiger cache and the data files. Writes always go to disk; reads are often served from the OS file buffer cache already in RAM, but come from disk otherwise.
WIREDTIGER SESSIONS
A WiredTiger 'ticket' is assigned for every operation running simultaneously in the WiredTiger storage engine. 'Available' equals a hard-coded upper limit minus 'out'.
QUEUED OPERATIONS
The time spent in the WiredTiger checkpoint phase. Warning: this calculation averages the cyclical event (default: every 1 minute) to a per-second value.
DOCUMENT CHANGES
Mixed metrics: documents per second inserted, updated, deleted, or returned on any type of node (primary or secondary), plus replicated write ops/sec, plus TTL deletes per second.
This panel shows the number of objects scanned (both data, scanned_objects, and index, scanned), as well as the number of documents that were moved to a new location because the document grew in size. Moved documents apply only to the MMAPv1 storage engine.
PAGE FAULTS
CONNECTED
VERSION
SHARED BUFFERS
Defines the amount of memory the database server uses for shared memory buffers. The default is 128MB. A common tuning guideline is 25% of RAM, but it generally shouldn't exceed 40%.
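The 25% guideline above can be turned into a starting value with a quick calculation. The sketch below assumes a Linux host and derives the figure from /proc/meminfo; it is a starting point only, not a substitute for workload testing:

```shell
# Compute a 25%-of-RAM starting value for shared_buffers (Linux only; illustrative)
total_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
echo "shared_buffers = $(( total_kb / 4 / 1024 ))MB"
```
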
DISK-PAGE BUFFERS
The setting wal_buffers defines how much memory is used for caching write-ahead log entries. Generally this value is small (about 3% of the shared_buffers value), but it may need to be raised for heavily loaded servers.
The parameter work_mem defines the amount of memory assigned for internal sort operations and hash tables before writing to temporary disk files. The default is 4MB.
PostgreSQL's effective_cache_size variable tells the planner how much RAM you expect to be available for disk caching. Generally, adding Linux free+cached memory will give you a good idea. The query planner uses this value to judge whether plans will fit in memory; when it is set too low, the planner can reject certain indexes.
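The free+cached heuristic mentioned above can be sketched like this (assuming a Linux host with /proc/meminfo; this is a rough estimate, not a definitive tuning rule):

```shell
# Estimate effective_cache_size as free + cached memory, in MB (Linux only; illustrative)
awk '/^MemFree:|^Cached:/ {sum += $2} END {printf "effective_cache_size = %dMB\n", sum/1024}' /proc/meminfo
```
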
AUTOVACUUM
Whether the autovacuum process is enabled. When tuning, the solution is generally to vacuum more often, not less.
POSTGRESQL CONNECTIONS
Max Connections
The maximum number of client connections allowed. Change this value with care as there are some
memory resources that are allocated on a per-client basis, so setting max_connections higher will
generally increase overall PostgreSQL memory usage.
Connections
Active Connections
POSTGRESQL TUPLES
Tuples
The total number of rows processed by PostgreSQL server: fetched, returned, inserted, updated, and
deleted.
The number of rows read from the database, both returned and fetched.
The number of rows changed in the last 5 minutes: inserted, updated, and deleted.
POSTGRESQL TRANSACTIONS
Transactions
The total number of transactions that have been either committed or rolled back.
Duration of Transactions
TEMP FILES
All temporary files are taken into account by these two gauges, regardless of why the temporary file was
created (e.g., sorting or hashing), and regardless of the log_temp_files setting.
Conflicts/Deadlocks
The number of queries canceled due to conflicts with recovery in the database (due to dropped
tablespaces, lock timeouts, old snapshots, pinned buffers, or deadlocks).
Number of Locks
The time spent reading and writing data file blocks by back ends, in milliseconds.
Capturing read and write time statistics is possible only if the track_io_timing setting is enabled. This can be done either in the configuration file or with a statement executed on the running system.
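A sketch of enabling track_io_timing on a running server without editing the configuration file (this is a configuration change requiring superuser access; the psql connection details are illustrative):

```shell
# Enable I/O timing statistics instance-wide and reload the configuration (illustrative)
psql -U postgres -c "ALTER SYSTEM SET track_io_timing = on;"
psql -U postgres -c "SELECT pg_reload_conf();"
```
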
Buffers
CANCELED QUERIES
The number of queries that have been canceled due to dropped tablespaces, lock timeouts, old snapshots,
pinned buffers, and deadlocks.
The number of times disk blocks were found already in the buffer cache, so that a read was not necessary.
This only includes hits in the PostgreSQL buffer cache, not the operating system’s file system cache.
CHECKPOINT STATS
The total amount of time that has been spent in the portion of checkpoint processing where files are either
written or synchronized to disk, in milliseconds.
POSTGRESQL SETTINGS
SYSTEM SUMMARY
This section contains the following system parameters of the PostgreSQL server: CPU Usage, CPU
Saturation and Max Core Usage, Disk I/O Activity, and Network Traffic.
Cumulative number of temporary files created by queries in this database since service start. All temporary
files are counted, regardless of why the temporary file was created (e.g., sorting or hashing), and
regardless of the log_temp_files setting.
Cumulative amount of data written to temporary files by queries in this database since service start. All
temporary files are counted, regardless of why the temporary file was created, and regardless of the
log_temp_files setting.
Number of temporary files created by queries in this database. All temporary files are counted, regardless
of why the temporary file was created (e.g., sorting or hashing), and regardless of the log_temp_files
setting.
Total amount of data written to temporary files by queries in this database. All temporary files are counted,
regardless of why the temporary file was created, and regardless of the log_temp_files setting.
CANCELED QUERIES
No description
NETWORK TRAFFIC
Network traffic refers to the amount of data moving across a network at a given point in time.
5.4.10 HA Dashboards
Shows figures for the replication latency on group communication. It measures latency from the time point
when a message is sent out to the time point when a message is received. As replication is a group
operation, this essentially gives you the slowest ACK and longest RTT in the cluster.
Shows the number of FC_PAUSE events sent/received. They are sent by a node when its replication queue
gets too full. If a node is sending out FC messages it indicates a problem.
Shows the average distances between highest and lowest seqno that are concurrently applied, committed
and can be possibly applied in parallel (potential degree of parallelization).
Shows the number of local transactions committed on this node that failed certification (some other node had a commit that conflicted with ours), for which the client received a deadlock error on commit; and also the number of local in-flight transactions on this node that were aborted because they locked something an applier thread needed (a deadlock error anywhere in an open transaction). Spikes in the graph may indicate that two nodes are writing to the same table, and potentially the same rows.
Shows for how long the node can be taken out of the cluster before SST is required. SST is a full state
transfer method.
Shows the count of transactions received from the cluster (any other node) and replicated to the cluster
(from this node).
Shows the bytes of data received from the cluster (any other node) and replicated to the cluster (from this
node).
No description
5.5 Commands
5.5.1 Commands
• pmm-agent – Daemon process, communicating between PMM Client and PMM Server
NAME
SYNOPSIS
pmm-admin [FLAGS]
pmm-admin add external [FLAGS] [NAME] [ADDRESS] (CAUTION: Technical preview feature)
DESCRIPTION
pmm-admin is a command-line tool for administering PMM using a set of COMMAND keywords and
associated FLAGS.
PMM communicates with the PMM Server via a PMM agent process.
COMMON FLAGS
-h , --help
--help-long
--help-man
--debug
--trace
--json
--version
--server-url=server-url
--server-insecure-tls
--group=<group-name>
COMMANDS
GENERAL COMMANDS
INFORMATION COMMANDS
Show Services and Agents running on this Node, and the agent mode (push/pull).
Show the following information about a local pmm-agent, and its connected server and clients:
• PMM Client: connection status, time drift, latency, vmagent status, pmm-admin version.
FLAGS:
--wait=<period><unit>
Time to wait for a successful response from pmm-agent. period is an integer. unit is one of ms for
milliseconds, s for seconds, m for minutes, h for hours.
Creates an archive file in the current directory with the default file name summary_<hostname>_<year>_<month>_<date>_<hour>_<minute>_<second>.zip. The contents are two directories, client and server, containing diagnostic text files.
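A quick way to preview what the default archive name would look like on the current host (illustrative only; pmm-admin summary generates the name itself):

```shell
# Print a file name following the default summary archive pattern (illustrative)
echo "summary_$(uname -n)_$(date +%Y_%m_%d_%H_%M_%S).zip"
```
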
FLAGS:
--filename="filename"
--skip-server
--pprof
CONFIGURATION COMMANDS
pmm-admin config
FLAGS:
--node-id=node-id
--node-model=node-model
Node model
--region=region
Node region
--az=availability-zone
--metrics-mode=mode
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
pmm-admin register
--server-url=server-url
--machine-id="/machine_id/9812826a1c45454a98ba45c56cc4f5b0"
--distro="linux"
--container-id=container-id
Container ID.
--container-name=container-name
Container name.
--node-model=node-model
Node model.
--region=region
Node region.
--az=availability-zone
--custom-labels=labels
pmm-admin remove
--service-id=service-id
Service ID.
--force
Remove service with that name or ID and all dependent services and agents.
When you remove a service, collected data remains on PMM Server for the specified retention period.
pmm-admin annotate
<annotation>
--node
--service
Annotate all services running on the current node, or that specified by --service-name .
--tags
A quoted string that defines one or more comma-separated tags for the annotation. Example: "tag 1,tag 2".
--node-name
--service-name
Combining flags
Flags can be combined to specify the annotation target:
• --node – the current node
• --node-name
• --node --node-name=NODE_NAME
• --node --service-name
• --node --service
• --service
• --service-name
• --service --service-name
• --service --node-name
• --service-name --node-name
If a node or service name is specified, it is used instead of other parameters.
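Putting the flags together, a couple of illustrative invocations (the annotation text, tags, and service name are made up for this example, and a configured PMM Client is required):

```shell
# Annotate the current node with two tags (illustrative values)
pmm-admin annotate "Deployed release 1.2" --node --tags "deploy,release"

# Annotate a specific service by name (hypothetical service name)
pmm-admin annotate "Schema migration started" --service --service-name=mysql-prod
```
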
DATABASE COMMANDS
MongoDB
FLAGS:
--node-id=node-id
--pmm-agent-id=pmm-agent-id
--username=username
MongoDB username.
--password=password
MongoDB password.
--query-source=profiler
--environment=environment
Environment name.
--cluster=cluster
Cluster name.
--replication-set=replication-set
--custom-labels=custom-labels
--skip-connection-check
--tls
--tls-skip-verify
--tls-certificate-key-file=PATHTOCERT
--tls-certificate-key-file-password=IFPASSWORDTOCERTISSET
--tls-ca-file=PATHTOCACERT
--metrics-mode=mode
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
MySQL
FLAGS:
--address
--socket=socket
Path to MySQL socket. (Find the socket path with mysql -u root -p -e "select @@socket" .)
--node-id=node-id
--pmm-agent-id=pmm-agent-id
--username=username
MySQL username.
--password=password
MySQL password.
--query-source=slowlog
Source of SQL queries, one of: slowlog, perfschema, none (default: slowlog).
--size-slow-logs=N
Rotate the slow log file at this size. If 0, use the server-defined default. Negative values disable log rotation. A unit suffix must be appended to the number, one of:
• KiB, MiB, GiB, TiB for base-2 units (1024, 1048576, etc.)
--disable-queryexamples
--disable-tablestats
• --collect.auto_increment.columns
• --collect.info_schema.tables
• --collect.info_schema.tablestats
• --collect.perf_schema.indexiowaits
• --collect.perf_schema.tableiowaits
• --collect.perf_schema.file_instances
• --collect.perf_schema.tablelocks
--disable-tablestats-limit=disable-tablestats-limit
Table statistics collection will be disabled if there are more than the specified number of tables (default: server-defined). 0 means no limit; a negative value disables collection.
--environment=environment
Environment name.
--cluster=cluster
Cluster name.
--replication-set=replication-set
--custom-labels=custom-labels
--skip-connection-check
--tls
--tls-skip-verify
--tls-cert-file=PATHTOCERT
--tls-key=PATHTOCERTKEY
--tls-ca-file=PATHTOCACERT
--ssl-ca=PATHTOCACERT
The path name of the Certificate Authority (CA) certificate file. If used, it must specify the same certificate used by the server. (--ssl-capath is similar but specifies the path name of a directory of CA certificate files.)
--ssl-cert=PATHTOCERTKEY
--ssl-key
--ssl-skip-verify
--metrics-mode=mode
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
PostgreSQL
FLAGS:
--node-id=<node id>
--username=<username>
PostgreSQL username.
--password=<password>
PostgreSQL password.
--query-source=<query source>
Source of SQL queries, one of: pgstatements, pgstatmonitor, none (default: pgstatements).
--environment=<environment>
Environment name.
--cluster=<cluster>
Cluster name.
--replication-set=<replication set>
--custom-labels=<custom labels>
--skip-connection-check
--tls
--tls-skip-verify
--metrics-mode=mode
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
ProxySQL
FLAGS:
--node-id=node-id
--pmm-agent-id=pmm-agent-id
--username=username
ProxySQL username.
--password=password
ProxySQL password.
--environment=environment
Environment name.
--cluster=cluster
Cluster name.
--replication-set=replication-set
--custom-labels=custom-labels
--skip-connection-check
--tls
--tls-skip-verify
--metrics-mode=mode
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
--disable-collectors
HAProxy
FLAGS:
--server-url=SERVER-URL
--server-insecure-tls
--username=USERNAME
HAProxy username.
--password=PASSWORD
HAProxy password.
--scheme=SCHEME
--metrics-path=METRICS-PATH
Path under which metrics are exposed, used to generate URI (default: /metrics).
--listen-port=LISTEN-PORT
The listen port on which HAProxy exposes metrics for scraping (required).
--service-node-id=SERVICE-NODE-ID
--environment=ENVIRONMENT
--cluster=CLUSTER
Cluster name.
--replication-set=REPLICATION-SET
--custom-labels=CUSTOM-LABELS
--metrics-mode=MODE
Metrics flow mode for the node_exporter agent. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
--skip-connection-check
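For example, a minimal registration might look like this (the port and service name are assumptions for this sketch; HAProxy must already expose Prometheus-format metrics on the given port, and a configured PMM Client is required):

```shell
# Register an HAProxy instance whose metrics endpoint listens on port 8404 (illustrative)
pmm-admin add haproxy --listen-port=8404 my-haproxy
```
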
OTHER COMMANDS
Adds an external source of data (such as a custom exporter running on a port) to monitoring.
FLAGS:
--service-name="current-hostname"
--agent-node-id=AGENT-NODE-ID
--username=USERNAME
External username
--password=PASSWORD
External password
--scheme=http or https
--metrics-path=/metrics
--listen-port=LISTEN-PORT
--service-node-id=SERVICE-NODE-ID
--environment=prod
--cluster=east-cluster
Cluster name
--replication-set=rs1
--custom-labels=CUSTOM-LABELS
--metrics-mode=auto
Metrics flow mode. Allowed values:
• auto – chosen by the server (default)
• push – the agent pushes metrics
• pull – the server scrapes metrics from the agent
--group="external"
Also, individual parameters can be set instead of --url, for example:
sudo pmm-admin add external-serverless --scheme=http --host=1.2.3.4 --listen-port=9093 --metrics-path=/metrics --container-name=ddd --external-name=e125
Note that some parameters are mandatory depending on the context. For example, if you specify --url, then --scheme and other related parameters are not mandatory; but if you specify --host, you must provide all the other parameters needed to build the destination URL, or you can specify --address instead of host and port as individual parameters.
FLAGS:
--url=URL
--scheme=https
--username=USERNAME
External username
--password=PASSWORD
External password
--address=1.2.3.4:9000
--host=1.2.3.4
--listen-port=9999
--metrics-path=/metrics
--environment=testing
Environment name
--cluster=CLUSTER
Cluster name
--replication-set=rs1
--custom-labels='app=myapp,region=s1'
--group="external"
--machine-id=MACHINE-ID
Node machine-id
--distro=DISTRO
Node OS distribution
--container-id=CONTAINER-ID
Container ID
--container-name=CONTAINER-NAME
Container name
--node-model=NODE-MODEL
Node model
--region=REGION
Node region
--az=AZ
EXAMPLES
pmm-admin status
pmm-admin status --wait=30s
URL : https://x.x.x.x:443/
Version: 2.5.0
PMM Client:
Connected : true
Time drift: 2.152715ms
Latency : 465.658µs
pmm-admin version: 2.5.0
pmm-agent version: 2.5.0
Agents:
/agent_id/aeb42475-486c-4f48-a906-9546fc7859e8 mysql_slowlog_agent Running
DISABLE COLLECTORS
For other collectors that you can disable with the --disable-collectors option, please visit the official
repositories for each exporter:
• node_exporter
• mysqld_exporter
• mongodb_exporter
• postgres_exporter
• proxysql_exporter
NAME
SYNOPSIS
DESCRIPTION
pmm-agent, part of the PMM Client package, runs as a daemon process on all monitored hosts.
COMMANDS
pmm-agent run
LOGGING
By default, pmm-agent sends messages to stderr and to the system log (syslogd or journald on Linux).
• Parameter: StandardError
Example:
StandardError=file:/var/log/pmm-agent.log
• Parameter: pmm_log
Example:
pmm_log="/var/log/pmm-agent.log"
If you change the default log file name, reflect the change in the log rotation rules file /etc/logrotate.d/pmm-agent-logrotate.
5.6 API
PMM Server lets you visually interact with API resources representing all objects within PMM. You can
browse the API using the Swagger UI, accessible at the /swagger/ endpoint URL:
Clicking an object lets you examine objects and execute requests on them:
• A Node represents a bare metal server, a virtual machine, a Docker container, or a more specific type
such as an Amazon RDS Node. A node runs zero or more Services and Agents, and has zero or more
Agents providing insights for it.
• A Service represents something useful running on the Node: Amazon Aurora MySQL, MySQL,
MongoDB, etc. It runs on zero (Amazon Aurora Serverless), single (MySQL), or several (Percona XtraDB
Cluster) Nodes. It also has zero or more Agents providing insights for it.
• An Agent represents something that runs on the Node which is not useful in itself but instead provides
insights (metrics, query performance data, etc) about Nodes and/or Services. An agent always runs on
the single Node (except External Exporters), and provides insights for zero or more Services and
Nodes.
Nodes, Services, and Agents have Types, which define their specific properties and the specific logic they implement.
Nodes and Services are external by nature – we do not manage them (create, destroy), but merely maintain
a list of them (add to inventory, remove from inventory) in pmm-managed . Most Agents, however, are started
and stopped by pmm-agent . The only exception is the External Exporter Type which is started externally.
5.7 VictoriaMetrics
VictoriaMetrics is a third-party monitoring solution and time-series database that replaced Prometheus in
PMM 2.12.0.
VictoriaMetrics allows metrics data to be ‘pushed’ to the server in addition to it being ‘pulled’ by the server.
When setting up services, you can decide which mode to use.
For PMM 2.12.0 the default mode is ‘pull’. Later releases will use the ‘push’ mode by default for newly-added services.
The mode (push/pull) is controlled by the --metrics-mode flag for the pmm-admin config and
pmm-admin add commands.
If you need to change the metrics mode for an existing Service, you must remove it and re-add it with the
same name and the required flags. (There is currently no ability to “update” a service.)
Direct Prometheus paths return structured information directly from Prometheus, bypassing the PMM
application.
They are accessed by requesting a URL of the form <PMM SERVER URL>/prometheus/<PATH> .
As a result of the move to VictoriaMetrics some direct Prometheus paths are no longer available.
• /prometheus/alerts → No change.
• /prometheus/rules → No change.
• /prometheus/service-discovery → No equivalent.
• /prometheus/targets → /victoriametrics/targets .
5.7.3 Troubleshooting
• Google Groups
• Slack
• Telegram
5.8 Glossary
5.8.1 Annotation
5.8.2 Dimension
In the Query Analytics dashboard, to help focus on the possible source of performance issues, you can
group queries by dimension, one of: Query, Service Name, Database, Schema, User Name, Client Host
5.8.3 EBS
Amazon Elastic Block Store.
5.8.4 Fingerprint
A normalized statement digest—a query string with values removed that acts as a template or typical
example for a query.
5.8.5 IAM
Amazon Identity and Access Management.
5.8.6 MM
Metrics Monitor.
5.8.7 NUMA
Non-Uniform Memory Access.
5.8.8 PEM
Privacy Enhanced Mail, a container file format for TLS certificates and keys.
5.8.9 QPS
Queries per second.
5.8.10 Query Analytics (QAN)
Component of PMM Server that enables you to analyze MySQL query performance over periods of time.
5.8.11 STT
Security Threat Tool.
5.8.12 Technical Preview
Releases intended for public preview and feedback but with no support or service level agreement (SLA).
Should not be used on production or business-critical systems. May contain breaking changes to UI, API,
CLI. (Read more.)
5.8.13 VG
Volume Group.
6. FAQ
• Discord channel.
• PMM Server
• PMM Client
For example, to add complete MySQL monitoring for two local MySQL servers, the commands would be:
sudo pmm-admin add mysql --username root --password root instance-01 127.0.0.1:3001
sudo pmm-admin add mysql --username root --password root instance-02 127.0.0.1:3002
When you remove a monitoring service, previously collected data remains available in Grafana. However, the metrics are tied to the instance name, so if you add the same instance back with a different name, it will be considered a new instance with a new set of metrics. Therefore, if you are re-adding an instance and want to keep its previous data, add it with the same name.
6.9 Can I add an AWS RDS MySQL or Aurora MySQL instance from a non-default
AWS partition?
By default, RDS discovery works with the default aws partition. But you can switch to special regions, like the GovCloud one, with alternative AWS partitions (e.g. aws-us-gov) by adding them to the Settings via the PMM Server API.
To specify a value other than the default, or to use several, use the JSON array syntax: ["aws", "aws-cn"].
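A sketch of setting the partition list through the Settings API (the server address and credentials are placeholders; the request body field mirrors the JSON array syntax above):

```shell
# Set the AWS partitions PMM uses for RDS discovery (illustrative host and credentials)
curl -k -X POST "https://admin:admin@pmm-server.example.com/v1/Settings/Change" \
  --data '{"aws_partitions": ["aws", "aws-us-gov"]}'
```
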
6.10 How do I troubleshoot communication issues between PMM Client and PMM
Server?
See Troubleshoot PMM Server/PMM Client connection.
With these methods you must configure alerting rules that define conditions under which an alert should be
triggered, and the channel used to send the alert (e.g. email).
Alerting in Grafana allows attaching rules to your dashboard panels. Grafana Alerts are already integrated
into PMM Server and may be simpler to get set up.
Alertmanager allows the creation of more sophisticated alerting rules and can make it easier to manage installations with a large number of hosts. This additional flexibility comes at the expense of simplicity.
We only offer support for creating custom rules to our customers, so you should already have a working
Alertmanager instance prior to using this feature.
See also PMM Alerting with Grafana: Working with Templated Dashboards
6.13 How do I use a custom Prometheus configuration file inside PMM Server?
Normally, PMM Server fully manages the Prometheus configuration file.
However, some users may want to change the generated configuration to add additional scrape jobs,
configure remote storage, etc.
From version 2.4.0, when pmm-managed starts the Prometheus file generation process, it tries to load the /srv/prometheus/prometheus.base.yml file first, to use it as a base for the prometheus.yml file.
The prometheus.yml file can be regenerated by restarting the PMM Server container, or by using the
SetSettings API call with an empty body.
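For example, regeneration can be triggered without restarting the container by posting an empty body to the Settings change endpoint (the host and credentials are placeholders for your own PMM Server):

```shell
# Trigger regeneration of prometheus.yml via the Settings API with an empty body (illustrative)
curl -k -X POST "https://admin:admin@pmm-server.example.com/v1/Settings/Change" --data '{}'
```
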
See also
• API
PMMPASSWORD="mypassword"
echo "Waiting for PMM to initialize to set password..."
until [ "`docker inspect -f {{.State.Health.Status}} pmm2-server`" = "healthy" ]; do sleep 1; done
docker exec -t pmm2-server bash -c "ln -s /srv/grafana /usr/share/grafana/data; grafana-cli --homepath /usr/share/grafana admin reset-admin-password $PMMPASSWORD"
7. Release Notes
Percona Monitoring and Management (PMM) is a free and open-source platform for managing and
monitoring MySQL, MongoDB, and PostgreSQL performance.
• Custom certificates help define proper security levels for remotely monitored MySQL instances,
including Google Cloud SQL.
• Usability improvements to the External Monitoring UI. When filling parameters, you can enter the parts
of an endpoint (scheme, host, path) or let PMM automatically extract them from a URL.
• pg_stat_monitor 0.9.0 support. This change will give you compatibility with the latest version. Support
for new features will be in an upcoming release.
• Single-line install of PMM Server on supported Linux distributions (this feature is in Technical Preview).
• It is easier to experience DBaaS functionality; you can quickly turn it ON/OFF in Advanced settings
on the Settings page. (Read more)
• Database components management will enable PMM administrators to limit users in your
organization to specific (admin-approved) database versions in their DBaaS DB Clusters.
• For PXC clusters created using DBaaS, HAProxy will now be used by default. Please note: Monitoring
of the HAProxy in DBaaS will be enabled in an upcoming release.
• Changes to Sign in to Percona Platform. From this release, Registration of the Percona account will be
more secure and require additional confirmation.
• PMM-7863: DBaaS: Ability to specify in K8s configuration the version of HAProxy to be used for DB
creation
• PMM-7848, PMM-7847, PMM-7421: Add support for using SSL certificates between pmm-admin and
monitored MySQL databases
• PMM-7883: Single-line install of PMM Server on supported Linux distributions - [Technical Preview]
• PMM-7013, PMM-7819: DBaaS: Use HAProxy by default instead of ProxySQL for MySQL DB clusters
7.2.3 Improvements
• PMM-7064: Integrated Alerting: Presenting severity of the Alert Rule using different colors
• PMM-7946: Better error message on PMM client if server doesn’t support HAProxy
• PMM-7641, PMM-7820: Add DBaaS to Technical Preview section and allow user to Enable/Disable via UI
• PMM-7966: Telemetry: Collect enabled/disabled status for Integrated Alerting and Security Threat Tool
features
• PMM-7911: DBaaS: Invalid Number of Nodes results in an annoying error message pop-up
• PMM-8037: User can create a Percona Platform account without proper confirmation
• PMM-7920: PostgreSQL Exporter has increased memory usage with pmm-client 2.15.1 & pmm-server
2.16.0
• PMM-7700: Integrated Alerting: Rule API crashing with more than two parameters or invalid values
• PMM-7396: Integrated Alerting: Alerts tab error if user deletes Alert Rule which has Firing alerts
Percona Monitoring and Management (PMM) is a free and open-source platform for managing and
monitoring MySQL, MongoDB, and PostgreSQL performance.
AWS monitoring in PMM now covers PostgreSQL RDS and PostgreSQL Aurora types. PMM will include them in a Discovery UI where they can be added, which will result in node-related metrics as well as PostgreSQL database performance metrics. Before this release, this was available only for MySQL-related instances from Amazon RDS.
Technical Preview: PMM will have the same level of support for Microsoft Azure Database as a Service
(DBaaS) as we have for AWS’s DBaaS (RDS/Aurora on MySQL or PostgreSQL). You will be able to easily
discover and add Azure databases for monitoring by PMM complete with node-level monitoring. This
feature is available only if you explicitly activate it on the PMM Settings page. Deactivating it will not
remove added services from monitoring, but will just hide the ability to discover and add new Microsoft
Azure Services.
(This is a feature technical preview because we want to release it as soon as possible to get feedback
from users. We are expecting to do more work on this feature to make it more API and resource
efficient.)
Security Threat Tool users are now able to control the Security Check execution time intervals for
groups of checks, move checks between groups, and disable individual checks if necessary.
Added compatibility with the pg_stat_monitor plugin v0.8.0. This does not yet expose the plugin's new features in PMM, but it ensures that Query Analytics metrics are collected to the same degree as with version 0.6.0 of the plugin.
Reworked the PMM Settings page to make it clear what features are in Technical Preview vs General
Availability (GA) and to simplify activation/deactivation of technical preview features. We also provide a
better definition of what a Technical Preview is.
• Migration of Settings and other service pages in PMM from Grafana dashboards
The PMM Settings page and several others (including Add Instance and Inventory) are being converted
to Grafana pages and will no longer be presented as dashboards. Additionally, we’re moving the menu
to the sidebar navigation for consistency and more flexibility compared to the older menu structure.
We released the next stage of improvements in Integrated Alerting functionality of PMM to simplify the
usage of the feature. Together with improvements, we continue fixing known bugs in this feature.
Technical preview: while creating a DB cluster, a user can see a prediction of the resources this cluster will consume with all its components, as well as the current total and available resources in the K8s cluster. Users will be warned that an attempt to create a DB cluster may be unsuccessful because of insufficient available resources in the K8s cluster.
DBaaS in PMM will use the recently released Percona Kubernetes Operator for Percona Server for
MongoDB 1.7.0 to create MongoDB clusters.
• PMM-7313, PMM-7610: Ability to discover and monitor Amazon RDS PostgreSQL instances with
collecting PostgreSQL and RDS node metrics (Thanks to Daniel Guzman Burgos for reporting this
issue).
• PMM-7345: Expose metrics for all available databases on a PMM monitored PostgreSQL server.
• PMM-7344: Update postgres_exporter version from 0.4.6 to 0.8.0. (See the full list of improvements in
the changelog.)
• PMM-7767, PMM-7696: Implement feature flag to enable Microsoft Azure monitoring. Users can use the
UI or set an environment variable ( ENABLE_AZUREDISCOVER=1 ) during container creation.
• PMM-7684, PMM-7498: Ability to discover running and supported Microsoft Azure Databases instances
in a provided account.
• PMM-7678, PMM-7679, PMM-7676, PMM-7499, PMM-7691: Prepare, modify and use azure_exporter to
collect Node related metrics.
• PMM-7681: Use Microsoft Azure metrics on Node/OS-related dashboards to show the metrics on
panels.
• PMM-7339: Security Threat Tool: Ability to execute security checks individually and on-demand.
• PMM-7451, PMM-7337: Security Threat Tool: Ability to change intervals for security checks on the PMM
Settings page.
• PMM-7772, PMM-7338: Security Threat Tool: Ability to change default execution interval per check.
• PMM-7336: Security Threat Tool: Execute checks based on execution interval they belong to.
• PMM-7335: Security Threat Tool: Ship security check files with predefined execution interval.
• PMM-7748: Add an additional experimental menu for Dashboards on the left side panel.
• PMM-7688: Unify UX and layout of all PMM specific pages like Settings, Add Instance etc.
• PMM-7687: Modify links in menus to ensure both menus are working as expected after dashboard URL
change.
• PMM-7705: Simplify display of features in technical preview to easily identify them and their current
state.
• PMM-7522, PMM-7511: Integrated Alerting: Improve Notification Channels UX by Pagination for the
Notification list.
• PMM-7521, PMM-7510: Integrated Alerting: Improve Alert Rule Templates UX by Pagination on Rule
Templates list.
• PMM-7652, PMM-7674, PMM-7503, PMM-7486: DBaaS: While creating a DB cluster, see the total and
available resources in the K8s cluster, such as disk, CPU, and memory.
• PMM-7508, PMM-7488: DBaaS: See predicted resource usage for selected DB Cluster configuration.
• PMM-7364: DBaaS: Show a warning before cluster creation starts if the K8s cluster does not have
enough resources to create a DB cluster with the requested configuration.
• PMM-7580, PMM-7359: DBaaS: Users can select the database version to use during DB Cluster
creation.
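The Azure discovery flag mentioned above (`ENABLE_AZUREDISCOVER=1`) can be set as an environment variable when the PMM Server container is created. A minimal sketch, assuming a Docker deployment; the container name, port mapping, and image tag are illustrative:

```shell
# Hypothetical example: enable Microsoft Azure discovery at container creation.
# Container name, published port, and image tag are illustrative; adjust to
# your deployment before use.
docker run -d \
  --name pmm-server \
  -p 443:443 \
  -e ENABLE_AZUREDISCOVER=1 \
  percona/pmm-server:2
```

Alternatively, the same feature flag can be toggled in the PMM UI, as noted in PMM-7767/PMM-7696 above.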
7.3.3 Improvements
• PMM-7506: Security Threat Tool: Reduce False Positives due to Roles automatically created in PXC with
no password but cannot be used to login.
• PMM-7571: Modified Percona Platform Account registration flow from PMM server UI.
• PMM-7513: Integrated Alerting: Ability to see default values and Threshold values during the Alert Rule
creation.
• PMM-7375: Integrated Alerting: Inform users about the template that they are editing and warn them
about the limitations.
• PMM-7131, PMM-7555: QAN for PostgreSQL attempts to connect to a database with the same name as
the username. (Thanks to Daniel Guzman Burgos for reporting this issue)
• PMM-7481: Query Analytics is not showing “Query with Errors” in the Profile section.
• PMM-7434: Integrated Alerting: Unknown parameters [threshold] error during Add/Update Alert Rule.
• PMM-7379: Integrated Alerting: Can not edit Alert Rule Name through API.
• PMM-7232: Integrated Alerting: Disabling IA does not disable rules evaluation and notifications
sending.
• PMM-7119: Integrated Alerting: Update error notification for adding/update Alert rule Template – There
was inconsistent behavior if you tried to add a new Rule Template with an already-used name.
• PMM-7543: Integrated Alerting: selected section disappears from a breadcrumb after clicking the tab
for a second time.
• PMM-7351: DBaaS: Safari does not accept float numbers as a custom option in the “Create Cluster”
dialogue.
• PMM-7701: DBaaS: PSMDB clusters stuck in initializing due to special characters in secrets.
Percona Monitoring and Management (PMM) is a free and open-source platform for managing and
monitoring MySQL, MongoDB, and PostgreSQL performance.
This patch release fixes performance issues discovered in some systems, along with other small fixes.
• PMM-7635: Fix high CPU consumption by the Grafana server after upgrading to 2.15.0 via Docker
container replacement, on installations with large numbers of services in ‘push’ mode.
• PMM-7713: Fix high CPU and memory consumption by VictoriaMetrics after upgrading to 2.15.0 via
Docker container replacement, on installations with large numbers of services in ‘pull’ mode.
• PMM-7470: MongoDB exporter IndexStatsCollections is assigned values from the wrong flag (the fix
was intended for 2.15.0 but missed the merge cutoff) (Thanks to Tim for reporting this issue).
• PMM-1531: Metrics not being collected due to rename of MySQL 8 information schema tables.
With this feature users can disable any collector used by PMM to get metrics. When metrics cannot be
collected or are no longer needed, disabling the collector(s) prevents PMM from flooding logs and saves
infrastructure resources.
Our vision for PMM collectors is to let users stop collection that could harm their environment. This
"disable" feature is an initial step toward that goal; full, flexible control over which metrics to collect,
and at what resolution, is slated for future releases.
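As a sketch of how the disable feature is used, collectors can be switched off when a service is added with pmm-admin. The flag name follows the pmm-admin documentation for this release; the credentials, service name, and collector names below are illustrative assumptions, not values from this document:

```shell
# Hypothetical example: disable specific collectors when adding a MySQL service.
# Credentials, address, service name, and collector names are illustrative;
# check `pmm-admin add mysql --help` for the collectors your exporter supports.
pmm-admin add mysql \
  --username=pmm --password=secret \
  --disable-collectors=global_status,info_schema.innodb_cmp \
  my-mysql 127.0.0.1:3306
```

Disabling collectors for metrics that cannot be gathered (for example, due to permissions) stops the log flooding described above.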
Since PMM 1.4.0, users have been able to monitor external services that Percona did not natively
support (e.g., Redis). This blog article from 2018 nicely described external services monitoring at that
time. (Back then, Percona did not natively support a PostgreSQL monitoring service, so PostgreSQL was
listed as an external service. Today, PostgreSQL is natively supported by PMM.)
Until now, PMM 2.x did not support external services monitoring. With this release, any service not
natively supported by PMM can be monitored as an external service. You can see the list of possible
exporters at https://prometheus.io/docs/instrumenting/exporters/.
Natively supported services will continue to deliver an expanded set of metrics and insights.
With the addition of pt-*-summary in PMM 2, users can now view summary information about services
and nodes on PMM's dashboard. This summary information uses the industry-familiar output format of
the pt-*-summary tools to simplify portability of the data. The format is also preserved in dashboard
snapshots shared with Percona Support, simplifying issue investigation.
• pt-mysql-summary
• pt-mongodb-summary
• pt-pg-summary
• pt-summary
Users can now add HAProxy services for monitoring in PMM 2. Their support level matches that of
ProxySQL: they are presented in Inventory and on dashboards. This lets users who run HAProxy in their
HA configuration have that component monitored by PMM as well. In future releases, PMM will start
using HAProxy by default for the DBaaS feature and will also use this functionality to monitor it.
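The new `pmm-admin add haproxy` command (PMM-6797 below) is how an HAProxy service is registered. A minimal sketch; the `--listen-port` value and service name are illustrative assumptions, where the port should point at HAProxy's Prometheus metrics endpoint:

```shell
# Hypothetical example: add an HAProxy service for monitoring.
# The port (HAProxy's Prometheus stats endpoint) and the service name
# are illustrative; adjust them to your HAProxy configuration.
pmm-admin add haproxy --listen-port=8404 my-haproxy
```

PMM checks the connection to the HAProxy endpoint when the service is added (PMM-7487 below), so a wrong port is reported immediately.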
From now on, you can see the progress of the internal steps the system performs when executing
DBaaS operations. The progress bar is not time-based and presents only steps. It also surfaces
K8s/Operator-related errors, so when an error occurs you see the error text in the UI without needing
K8s tools. From the same UI you can view the latest logs from K8s for even more information about why
the error happened.
Known limitations: The progress bar does not provide useful information for the Delete operation (this
will come in a later version once the API is changed with the Operators team), and the DB Cluster
Modification operation behaves oddly, starting from a non-zero step count. (This will be addressed after
the API changes.)
• PMM-4172, PMM-4306, PMM-5784, PMM-7177: Services and Nodes Summary presentation. Presents
information about DB and Node status using pt-mysql-summary , pt-mongodb-summary , pt-pg-summary
outputs (in API and on Dashboards).
• PMM-6711: Add external-group flag for pmm-admin inventory commands for simpler work with
External services.
• PMM-7405: Check connection response format when adding External Service to monitoring.
• PMM-6797: HAProxy monitoring: Ability to add HAProxy services with pmm-admin [inventory] add
[service] haproxy command.
• PMM-7487: HAProxy monitoring: Check connection to HAProxy services when adding them for
monitoring.
• PMM-6924: Integrated Alerting: Show ‘breadcrumbs’ navigation aid on non-dashboard pages as well
as Grafana dashboard pages.
• PMM-7294: Integrated Alerting: Pagination for viewing large numbers of Alert Rules.
• PMM-7417: Security Threat Tool: Show list of all available security checks.
• PMM-7266: DBaaS: Cluster creation progress bar – You can now see the progress of DBaaS DB cluster
creation. (The progress bar is based on the number of back-end technical steps, not the time required
to perform the tasks.)
7.5.3 Improvements
• PMM-4679: Docker: The :latest tag for pmm-server and pmm-client images has been moved from the
latest v1 release to the latest v2 release. Note: using the latest tag is not recommended in production
environments; use the :2 tag instead.
• PMM-7472: Remove Prometheus data source – If you were using custom dashboards with an explicitly
specified data source (rather than leaving it empty to use the default), you may need to edit your
dashboards to use the proper data source. PMM no longer uses Prometheus; it uses VictoriaMetrics, a
compatible metrics storage. We renamed the data source to be more technology-agnostic.
• PMM-6695: Software update: Grafana 7.1.3 to 7.3.7 (See What’s new in Grafana 7.2 and What’s new
in Grafana 7.3.)
• PMM-7471: Software update: VictoriaMetrics 1.52.0 to 1.53.1 (See VictoriaMetrics 1.53.0 and
VictoriaMetrics 1.53.1.)
• PMM-6693: API keys usage – PMM users can now use API keys (generated in Grafana UI) for
interaction with PMM server instead of username/password pairs. The API key should have the same
level of access (Admin or Viewer) as is required for username/password pairs.
• PMM-7240: DBaaS: Change from Dashboard to Grafana Page – We changed the DBaaS page from a
Grafana Dashboard to a Grafana Page to be better aligned with the DBaaS enable/disable status and
avoid confusion when DBaaS is disabled.
• PMM-7328: Security Threat Tool: Download and run checks immediately when activated, repeating
every 24 hours thereafter. (Previously, downloading and running new checks happened every 24 hours,
but the cycle didn't begin when STT was activated.)
• PMM-7329: Security Threat Tool: Hide check results tab if STT is disabled.
• PMM-7331: Security Threat Tool: Failed checks have ‘Read more’ links with helpful content.
• PMM-7422: Security Threat Tool: View all active and silenced alerts.
• PMM-7257,PMM-7433: Integrated Alerting: Easier-to-read rule details in Alert Rules list (API and UI
presentation).
• PMM-7259: Integrated Alerting: Better UI error reporting for disabled Integrated Alerting. (Hint to users
how to enable it.)
• PMM-5837: pmm-agent reports “Malformed DSN” error when adding PostgreSQL instance with a PMM
user password containing = (equals sign) (Thanks to Alexandre Barth for reporting this issue).
• PMM-5969: Removing Services or Nodes with pmm-admin ... --force mode does not stop running
agents, VictoriaMetrics continues collecting data from exporters.
• PMM-6685: In low screen resolutions Services submenu wraps, becomes obscured, and can’t be
accessed.
• PMM-6681: Not all PMM admin users can download diagnostic logs, only those with Grafana admin
rights.
• PMM-7227: Table stats metrics not being collected in instances with millions of tables.
• PMM-7426: vmagent continually restarts, blocking communication between pmm-agent and pmm-managed –
Users running multiple services on the same PMM agent in ‘push’ mode could face this issue when
restarting the agent after bulk-adding services.
• PMM-6636: Dashboards: MySQL Replication Summary: ‘Binlog Size’, ‘Binlog Data Written Hourly’,
‘Node’ not being charted when the instance is RDS.
• PMM-7325: Dashboards: MySQL User Details: user labels unreadable with high number (>20) of users
(Thanks to Andrei Fedorov for reporting this issue).
• PMM-7416: Dashboards: PostgreSQL Instance Summary: Some panels (e.g. Tuple) not using selected
database.
• PMM-7235: Integrated Alerting: Filtered out alerts are shown in the UI as firing.
• PMM-7324: Integrated Alerting: Add Pager Duty Notification Channel: after user pastes copied key Add
button is not enabled.
• PMM-7346: Integrated Alerting: It is possible to create Alert Rule with negative duration time.
• PMM-7366: Integrated Alerting: Entities (e.g. templates, channels, rules) are in inconsistent states.
• PMM-7467: Integrated Alerting: < (less-than symbol) wrongly interpreted by Alert templates
(rendered as the HTML entity &lt;).
• PMM-7591: Integrated Alerting: User can not receive notifications on email after password update.
• PMM-7343: Security Threat Tool: Check results show previously failed checks after STT re-enabled.
• PMM-7250: DBaaS: Confusing error “Cannot get PSMDB/PXC cluster” appears after removing DB
cluster.
• PMM-7349: DBaaS: Host and Password occasionally disappearing from Connection column.
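The API-key interaction introduced in PMM-6693 above can be sketched with curl against an existing PMM Server endpoint such as /v1/readyz (mentioned elsewhere in these notes). The host and key value are placeholders:

```shell
# Hypothetical example: call the PMM Server API with a Grafana-generated API key
# instead of a username/password pair. Host and key value are placeholders;
# -k skips TLS verification for self-signed certificates.
API_KEY="eyJrIjoi...your-key-here"
curl -k -H "Authorization: Bearer ${API_KEY}" \
  https://pmm-server.example.com/v1/readyz
```

The key should be generated in the Grafana UI with the same access level (Admin or Viewer) that the equivalent username/password pair would need.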
In PMM 2.12.0, Percona replaced its metrics collection engine (formerly Prometheus) with
VictoriaMetrics. Historically, PMM used a pull method with Prometheus, while VictoriaMetrics can
operate in either pull or push mode. When PMM 2.12.0 was released, Percona kept pull as the default.
Now, with PMM 2.14.0, Percona is shifting the default to push for all newly added instances. This blog
post describes the two methods and why push benefits users. Also, here is a post by Peter Zaitsev with
FAQs on the move to VictoriaMetrics and the push model. Documentation on the push method is here.
Note: Installing the 2.14.0 or newer PMM server will change the default behavior on 2.12.0 and 2.13.0
clients from “pull” method to “push” for any newly added services. Existing services will remain in
whatever mode they were prior to upgrade.
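If you want a newly added service to keep using the pull model after this change, the mode can be set explicitly with the `--metrics-mode` flag noted later in these release notes. A sketch; the credentials, address, and service name are illustrative:

```shell
# Hypothetical example: explicitly request the pull metrics mode for a new
# service, overriding the new push default. Credentials, address, and service
# name are illustrative.
pmm-admin add mysql --metrics-mode=pull \
  --username=pmm --password=secret \
  my-mysql 127.0.0.1:3306
```

Omitting the flag gives the new default (push) on 2.14.0 and later servers.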
In 2.13.0 we introduced Percona’s Database as a Service (DBaaS) which enables non-DBAs (software
architects, developers, site reliability engineers, etc.) to perform typical DBA tasks to manage an
organization’s database environment via user interfaces and automation orchestration. This release
contains several enhancements and fixes, many directly from user feedback.
Note: This capability is feature-flagged and turned off by default. Users require a variable to be passed
to PMM to expose this functionality.
Improvements to the user experience for adding and viewing external services (any data that can be
monitored by a Prometheus exporter such as: non-Percona supported databases like Redis,
ElasticSearch, Cassandra, etc. or an organization’s external application) on the Node Summary
dashboard of PMM.
• PMM-5765: Ability to monitor External Services for situations where PMM Client can’t be installed – Uses
a new command pmm-admin add external-serverless . (See pmm-admin.) (This is a Technical
Preview feature)
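The `pmm-admin add external-serverless` command referenced in PMM-5765 points PMM Server directly at an exporter's scrape URL, for hosts where no PMM Client can run. A sketch; the URL, group, and service name flags and values are illustrative assumptions:

```shell
# Hypothetical example: monitor an exporter on a host without a PMM Client by
# registering its metrics URL with PMM Server. The URL, service name, and
# group are illustrative; verify flags with `pmm-admin add external-serverless --help`.
pmm-admin add external-serverless \
  --url=http://192.0.2.10:9121/metrics \
  --service-name=my-redis --group=redis
```

In this mode PMM Server scrapes the exporter itself, so the target must be reachable from the server.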
7.6.3 Improvements
• PMM-7145: ‘Push’ metrics mode is default when adding services and nodes (All agents collecting data
from Services and Nodes will now use PUSH model if not specified explicitly. You will still be able to use
--metrics-mode flag to use Pull metrics if needed. All previously set up agents will keep their existing
mode. To change these you need to remove and re-add them.)
• PMM-7282: Integrated Alerting: Ability to create rule without channels and filters
• PMM-7065: Integrated Alerting: Show rule details for items in Alert Rules list
• PMM-7048: DBaaS: Simplify Cluster creation by moving Create Cluster button to earlier steps
• PMM-6993: Protect against possible problems with EXPLAIN of stored functions in MySQL – We are
fixing possible problems caused by an attempt to analyze queries covered in
https://bugs.mysql.com/bug.php?id=67632.
• PMM-7312: Error when accessing Metrics data on Dashboards for large installations
• PMM-7144: DBaaS: Creating DB cluster with same name (Thanks to Beata Handzelova for reporting
this issue)
• PMM-7323: DBaaS: ‘Remove DB Cluster from Kubernetes Cluster’ removes wrong one
• PMM-7251: Integrated Alerting: Error Rule with ID "mysql_version" not found if both Security Threat
Tool and Integrated Alerting enabled
• PMM-7169: Old data (from Prometheus) not deleted when Retention period expires
• PMM-7213: MySQL InnoDB Details dashboard: remove color-coding on ‘Data Buffer Pool Fit’ element
• PMM-7167: Some panels not visible when using long time intervals (e.g. 30 days)
• PMM-7103: VictoriaMetrics build logs not deleted from PMM Server Docker image
• PMM-6490: rds_exporter crashes when more than 100 AWS RDS instances added (Thanks to
https://github.com/vlinevych for fixing this)
• PMM-6096: pmm-agent connection checker does not check authentication for MongoDB
• PMM-7303: Disk Details, Nodes Compare dashboards: ‘Disk Utilization’ description is confusing
Allows PMM administrators to configure SSL certificates and keys to authenticate the connection to
PMM, specifically for setting up MongoDB. This is a critical security requirement, especially in large
enterprise infrastructure environments.
2. Technical Previews
Note: We do not recommend the use of technical preview features in enterprise or production
environments until the functionality is released as general availability (GA). While in Technical Preview
status, these features are not supported by Percona Support SLA, except by Product/Engineering on a
best-efforts basis.
A new feature in PMM to set up parameters and receive alerts about the Services and Nodes monitored
by PMM.
Improves the user experience for adding and viewing external services on the Node Summary
dashboard of PMM. External services are anything that can be monitored by a Prometheus exporter,
for example, non-Percona-supported databases like Redis, ElasticSearch, Cassandra, etc., or an
organization's external application.
We are also releasing the first preview of DBaaS functionality; when combined with a compatible
Kubernetes environment and Percona Operators, you can create Percona XtraDB or MongoDB clusters
with just a few clicks. (Read more about configuration and usage.)
7.7.2 Improvements
• PMM-6713: Node Summary/Nodes Overview dashboards: External exporters can now be added to the
dashboard and shown as part of a broader service grouping
• PMM-7173: VictoriaMetrics updated to 1.50.2: Includes HTML pages vs JSON output and new functions
available for alerting rules (see all tags)
• PMM-7092: PMM Server Docker update from 2.11.1 to 2.12.0 leaves container in unhealthy state
(Thanks to Hubertus Krogmann for reporting this issue)
• PMM-7208: Confusing “Access denied” message for ‘Viewer’ users on many dashboards
• PMM-6987: No IP address shown in log file of OVF appliance running in headless mode
• PMM-7146: MongoDB Instance Summary dashboard: ReplSet element showing metric name instead
of replication set
• VictoriaMetrics replaces Prometheus and is now the default data source. VictoriaMetrics supports both
PUSH (client to server) and PULL metrics collection modes. (Read more.)
• The ‘Add Instance’ page and forms have been redesigned and look much better.
• PMM-5799: PMM Client now available as docker image in addition to RPM, DEB and .tgz
• PMM-6968: Integrated Alerting: Basic notification channels actions API Create, Read, Update, Delete
• PMM-6395: Replace Prometheus with VictoriaMetrics in PMM for better performance and additional
functionality
7.8.3 Improvements
• PMM-6744: Prevent timeout of low-resolution metrics in MySQL instances with many (thousands of) tables
• PMM-6504: MySQL Replication Summary: MySQL Replication Delay graph did not factor in an
intentionally set SQL_Delay value, thus inflating the time displayed
• PMM-6820: pmm-admin status --wait option added to allow for configurable delay in checking status
of pmm-agent
• PMM-6710: pmm-admin : Allow user-specified custom ‘group’ name when adding external services
• PMM-6825: Allow user to specify ‘listen address’ to pmm-agent otherwise default to 127.0.0.1
• PMM-6759: Enable Kubernetes startup probes to get status of pmm-agent using ‘GET HTTP’ verb
• PMM-6736: MongoDB Instance Summary dashboard: Ensure colors for ReplSet status matches those
in MongoDB ReplSet Summary dashboard for better consistency
• PMM-6730: Node Overview/Summary Cleanup: Remove duplicate service type ‘DB Service Connections’
• PMM-6542: PMM Add Instance: Redesign page for more intuitive experience when adding various
instance types to monitoring
• PMM-6518: Update default data source name from ‘Prometheus’ to ‘Metrics’ to ensure graphs are
populated correctly after upgrade to VictoriaMetrics
• PMM-6428: Query Analytics dashboard - Ensure user-selected filter selections are always visible even if
they don’t appear in top 5 results
• PMM-5020: PMM Add Remote Instance: User can specify ‘Table Statistics Limit’ for MySQL and AWS
RDS MySQL to disable table stat metrics which can have an adverse impact on performance with too
many tables
• PMM-6811: MongoDB Cluster Summary: when secondary optime is newer than primary optime, lag
incorrectly shows 136 years
• PMM-6650: Custom queries for MySQL 8 fail on 5.x (on update to pmm-agent 2.10) (Thanks to user
debug for reporting this issue)
• PMM-6751: PXC/Galera dashboards: Empty service name with MySQL version < 5.6.40
• PMM-5823: PMM Server: Timeout when simultaneously generating and accessing logs via download or
API
• PMM-4547: MongoDB dashboard replication lag count incorrect (Thanks to user vvol for reporting this
issue)
• PMM-7057: MySQL Instances Overview: Many monitored instances (~250+) gives ‘too long query’ error
• PMM-6883: Query Analytics: ‘Reset All’ and ‘Show Selected’ filters behaving incorrectly
• PMM-6007: PMM Server virtual appliance’s IP address not shown in OVF console
• PMM-6752: Query Analytics: Time interval not preserved when using filter panel dashboard shortcuts
• PMM-6537: MySQL InnoDB Details - Logging - Group Commit Batch Size: giving incorrect description
• PMM-6055: PMM Inventory - Services: ‘Service Type’ column empty when it should be ‘External’ for
external services
Workaround: A folder is not created on container upgrade and will need to be created manually for one of
the components. Before starting the new pmm-server 2.12.0, execute:
• PMM-6515: Link added directly to Node/Service page from Query Analytics filters, opens in new window
7.10.2 Improvements
• PMM-6609: MySQL Instances Compare & Summary dashboards: Changed metric in ‘MySQL Internal
Memory Overview’
• PMM-6598: Dashboard image sharing (Share Panel): Improved wording with link to configuration
instructions
• PMM-6554: MySQL InnoDB Details dashboard: Add “sync flushing” to “InnoDB Flushing by Type”
• PMM-4547: MongoDB dashboard replication lag count incorrect (Thanks to user vvol for reporting this
issue)
• PMM-6765: Tables information tab reports ‘table not found’ with new PostgreSQL extension
pg_stat_monitor
• PMM-6764: Query Analytics: cannot filter items that are hidden - must use “Show all”
• PMM-6532: Click-through URLs lose time ranges when redirecting to other dashboards
• PMM-6643: New MongoDB exporter has higher CPU usage compared with old
• PMM-5738: Enhanced exporter: replaced original mongodb-exporter with a completely rewritten one
with improved functionality
• PMM-5126: Query Analytics Dashboard: Search by query substring or dimension (Thanks to user
debug for reporting this issue)
• PMM-6568: Reusable user interface component: Pop-up dialog. Allows for more consistent interfaces
across PMM
• PMM-6375, PMM-6373, PMM-6372: Sign in, Sign up and Sign out UI for Percona Account inside PMM
Server
• PMM-3831: Node Summary Dashboard: Add pt-summary output to dashboard to provide details on
system status and configuration
7.12.2 Improvements
• PMM-6647: MongoDB dashboards: RocksDB Details removed, MMAPv1 & Cluster Summary changed
• PMM-6536: Query Analytics Dashboard: Improved filter/time search message when no results
• PMM-6336: Suppress sensitive data: honor pmm-admin flag --disable-queryexamples when used in
conjunction with --query-source=perfschema
• PMM-6244: MySQL InnoDB Details Dashboard: Inverted color scheme on “BP Write Buffering” panel
• PMM-6294: Query Analytics Dashboard doesn’t resize well for some screen resolutions (Thanks to user
debug for reporting this issue)
• PMM-5701: Home Dashboard: Incorrect metric for DB uptime (Thanks to user hubi_oediv for
reporting this issue)
• PMM-6427: Query Analytics dashboard: Examples broken when switching from MongoDB to MySQL
query
• PMM-5684: Use actual data from INFORMATION_SCHEMA vs relying on cached data (which can be 24 hrs
old by default)
• PMM-6440: MongoDB ReplSet Summary Dashboard: Primary shows more lag than replicas
• PMM-6436: Query Analytics Dashboard: Styles updated to conform with upgrade to Grafana 7.x
• PMM-6415: Node Summary Dashboard: Redirection to database’s Instance Summary dashboard omits
Service Name
• PMM-6324: Query Analytics Dashboard: Showing stale data while fetching updated data for query
details section
• PMM-6276: PMM Inventory: Long lists unclear; poor contrast & column headings scroll out of view
• PMM-6643: High CPU usage for new MongoDB exporter (fixed in Percona Monitoring and Management
2.10.1)
7.13.1 Improvements
• PMM-6300: Query Analytics Dashboard: Column sorting arrows made easier to use (Thanks to user
debug for reporting this issue)
• PMM-6208: Security Threat Tool: Temporarily silence viewed but un-actioned alerts
• PMM-6274: MySQL User Details Dashboard: View selected user’s queries in Query Analytics Dashboard
• PMM-6266: Query Analytics Dashboard: Pagination device menu lists 25, 50 or 100 items per page
• PMM-6262: PostgreSQL Instance Summary Dashboard: Descriptions for all ‘Temp Files’ views
• PMM-6211: Query Analytics Dashboard: Loading activity spinner added to Example, Explain and Tables
tabs
• PMM-5783: Bulk failure of SHOW ALL SLAVES STATUS scraping on PS/MySQL distributions triggers
errors
• PMM-6294: Query Analytics Dashboard doesn’t resize well for some screen resolutions (Thanks to user
debug for reporting this issue)
• PMM-6319: Query Analytics Dashboard: Query scrolls out of view when selected
• PMM-6256: Query Analytics Dashboard: InvalidNamespace EXPLAIN error with some MongoDB queries
• PMM-6259: Query Analytics Dashboard: Slow appearance of query time distribution graph for some
queries
• PMM-6189: Disk Details Dashboard: Disk IO Size chart larger by factor of 512
• PMM-6269: Query Analytics Dashboard: Metrics drop-down list obscured when opened
• PMM-6247: Query Analytics Dashboard: Overview table not resizing on window size change
7.14.1 Highlights
This release brings a major rework of the Query Analytics (QAN) component, completing the migration from
Angular to React, and adding new UI functionality and features.
• PMM-6124: New dashboards: MongoDB Replica Set Summary and MongoDB Cluster Summary
• PMM-5563: Per-Service and per-Node Annotations (This completes the work on improvements to the
Annotation functionality.)
7.14.3 Improvements
• PMM-6114: Sort Agents, Nodes, and Services alphabetically by name in Inventory page (Thanks to user
debug for reporting this issue)
• PMM-5800: QAN explain and tables tabs not working after removing MySQL metrics agent
• PMM-6191: Incorrect computation for Prometheus Process CPU Usage panel values in Prometheus
dashboard
• PMM-6175: Node Overview dashboard shows unit for unit-less value ‘Top I/O Load’
7.15.1 Improvements
• PMM-544: Agents, Services and Nodes can now be removed via the ‘PMM Inventory’ page
• PMM-5365: Client fails to send non-UTF-8 query analytics content to server (Thanks to user romulus
for reporting this issue)
• PMM-5920: Incorrect metric used in formula for “Top Users by Rows Fetched/Read” graph
In this release, we have updated Grafana to version 6.7.4 to fix CVE-2020-13379. We recommend updating
to the latest version of PMM as soon as possible.
• PMM-5257, PMM-5256, & PMM-5243: pmm-admin socket option ( --socket ) to specify a UNIX socket path
for connecting to MongoDB, PostgreSQL, and ProxySQL instances
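The new socket option can be sketched as follows for PostgreSQL; the socket path, credentials, and service name are illustrative assumptions:

```shell
# Hypothetical example: add a PostgreSQL service via its UNIX socket instead of
# a TCP address. Socket path, credentials, and service name are illustrative.
pmm-admin add postgresql --socket=/var/run/postgresql \
  --username=pmm --password=secret my-postgres
```

The same `--socket` option applies to MongoDB and ProxySQL instances, per the bullet above.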
7.16.2 Improvements
• PMM-2244: pmm-admin status command output shows both pmm-admin and pmm-agent versions
• PMM-5946: MySQL Table Details dashboard filter on Service Name prevents display of services without
data
• PMM-6004: MySQL exporter reporting wrong values for cluster status ( wsrep_cluster_status )
• PMM-5949: Unwanted filters applied when moving from QAN to Add Instance page
• PMM-5870: MySQL Table Details dashboard not showing separate service names for tables
• PMM-5839: PostgreSQL metrics disparity between query time and block read/write time
• PMM-5348: Inventory page has inaccessible tabs that need reload to access
• PMM-5348: Incorrect access control vulnerability fix (CVE-2020-13379) by upgrading Grafana to 6.7.4
7.17.1 Improvements
• PMM-5936: Improved Summary dashboard for Security Threat Tool ‘Failed Checks’
• PMM-5937: Improved Details dashboard for Security Threat Tool ‘Failed Database Checks’
• PMM-5924: Alertmanager not running after PMM Server upgrade via Docker
• PMM-5915: supervisord not restarting after restart of PMM Server virtual appliances (OVF/AMI)
• PMM-5870: MySQL Table Details dashboard not showing separate service names for tables
• PMM-5728: Technical preview of External Services monitoring feature. A new command provides
integration with hundreds of third-party systems (https://prometheus.io/docs/instrumenting/
exporters/) via the Prometheus protocol so that you can monitor external services on a node where
PMM agent is installed.
• PMM-5822: PMM now includes a Security Threat Tool to help users avoid the most common database
security issues. Read more here.
• PMM-5559: Global annotations can now be set with the pmm-admin annotate command.
• PMM-4931: PMM now checks Docker environment variables and warns about invalid ones.
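A global annotation (PMM-5559) marks a point in time across all dashboards; a minimal sketch, with the annotation text being an example:

```shell
# Record a global annotation visible on PMM dashboards
pmm-admin annotate "Started load test"
```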
7.18.2 Improvements
• PMM-1962: The PMM Server API (via /v1/readyz ) now also returns Grafana status information in
addition to that for Prometheus.
• PMM-5854: The Service Details dashboards were cleaned up and some unused selectors were removed.
• PMM-5775: It is now clearer which nodes are Primary and which are Secondary on MongoDB Instance
dashboards.
• PMM-5393: There’s a new ‘Node Summary’ row in the services Summary and Details dashboards
summarizing the system uptime, load average, RAM and memory.
• PMM-5734: Temporary files activity and utilization charts ( rate & irate ) were added to the
PostgreSQL Instance overview.
• PMM-5695: The error message shown when the --socket option is used incorrectly is now clearer.
• PMM-4829: The MongoDB Exporter wasn’t able to collect metrics from hidden nodes without either the
latest driver or using the connect-direct parameter.
• PMM-5056: The average values for Query time in the Details and Profile sections were different.
• PMM-2717: Updating MongoDB Exporter resolves an error ( Failed to execute find query on
'config.locks': not found. ) when used with shardedCluster 3.6.4.
• PMM-4541: MongoDB exporter metrics collection was including system collections from collStats and
indexStats , causing “log bloat”.
• PMM-5903: When applying a filter the QAN Overview was being refreshed twice.
• PMM-5821: The Compare button was missing from HA Dashboard main menus.
• PMM-5687: Cumulative charts for Disk Details were not showing any data if metrics were returning
NaN results.
• PMM-5663: The ‘version’ value was not being refreshed in various MySQL dashboards.
• PMM-5643: Advanced Data Exploration charts were showing ‘N/A’ for Metric Resolution and ‘No data to
show’ in the Metric Data Table.
• PMM-4562: MongoDB and MySQL instances registered with empty cluster labels ( --environment=<label> )
were not visible in the dashboard despite having been added.
• PMM-4906: The MongoDB exporter for MongoDB 4.0 and above was causing a “log bloat” condition.
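The readiness endpoint mentioned above (PMM-1962) can be queried directly; the hostname here is a placeholder for your PMM Server address:

```shell
# Check PMM Server readiness; the response now includes Grafana status
# in addition to Prometheus status
curl -k https://pmm-server.example.com/v1/readyz
```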
Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.
• PMM-5042 and PMM-5272: PMM can now connect to MySQL instances by specifying a UNIX socket. This
can be done with a new --socket option of the pmm-admin add mysql command. (Note: Updates to
both PMM Client and PMM Server were done to allow UNIX socket connections.)
• PMM-4145: Amazon RDS instance metrics can now be independently enabled/disabled for Basic and/or
Enhanced metrics.
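The UNIX socket connection for MySQL (PMM-5042/PMM-5272) might be used like this; the socket path and credentials are placeholders:

```shell
# Add a MySQL instance via its UNIX socket instead of TCP
pmm-admin add mysql --socket=/var/lib/mysql/mysql.sock \
  --username=pmm --password=secret
```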
7.19.2 Improvements
• PMM-5581: PMM Server Grafana plugins can now be updated on the command line with the grafana-
cli command-line utility.
• PMM-5536: Three Grafana plugins were updated to the latest versions: vertamedia-clickhouse-
datasource to 1.9.5, grafana-polystat-panel to 1.1.0, and grafana-piechart-panel to 1.4.0.
• PMM-4252: The resolution of the PMM Server favicon image has been improved.
• PMM-5547: PMM dashboards were failing when presenting data from more than 100 monitored
instances (error message proxy error: context canceled ).
• PMM-5624: Empty charts were being shown in some Node Temperature dashboards.
• PMM-5637: The Data retention value in Settings was incorrectly showing the value as minutes instead
of days.
• PMM-5613: Sorting data by Query Time was not working properly in Query Analytics.
• PMM-5554: Totals in charts were inconsistently plotted with different colors across charts.
• PMM-4919: The force option ( --force ) in pmm-admin config was not always working.
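A Grafana plugin update via grafana-cli (PMM-5581) would be run inside the PMM Server container; the container name here is an assumption:

```shell
# Inside the PMM Server container: update a bundled Grafana plugin
docker exec -it pmm-server grafana-cli plugins update grafana-piechart-panel
```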
• PMM-3387: Prometheus custom configuration is now supported by PMM Server. The feature is targeted
at experienced users and is done by adding the base configuration file into the PMM Server container to
be parsed and included into the managed Prometheus configuration.
• PMM-5186: Including the --pprof option in the pmm-admin summary command adds pprof debug profiles
to the diagnostics data archive
• PMM-5102: The new “Node Details” dashboard now displays data from the hardware monitoring
sensors in hwmon . The new dashboard is based on the hwmon collector data from the node_exporter .
Please note that data may be unavailable for some nodes because of the configuration or virtualization
parameters.
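Collecting the extended diagnostics (PMM-5186) is a single command:

```shell
# Create a diagnostics archive that also includes pprof debug profiles
pmm-admin summary --pprof
```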
7.20.2 Improvements
• PMM-4915: The Query Analytics dashboard now shows Time Metrics in the Profile Section as “AVG per
query” instead of “AVG per second”
• PMM-5470: ClickHouse query optimized for Query Analytics to improve its speed and reduce the load
on the back-end
• PMM-5448: The default high and medium metrics resolutions were changed to 1-5-30 and 5-10-60 sec.
To reduce the effect of this change on existing installations, systems having the “old” high resolution
chosen on the PMM Settings page (5-5-60 sec.) will be automatically re-configured to the medium one
during an upgrade. If the resolution was changed to some custom values via API, it will not be affected
• PMM-5531: A health check indicator was implemented for the PMM Server Docker image, based on the
Docker HEALTHCHECK instruction.
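A minimal way to read the health status reported by this check, assuming the container is named pmm-server:

```shell
# Show the container health as reported by the Docker HEALTHCHECK
docker inspect --format '{{.State.Health.Status}}' pmm-server
```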
• PMM-5489: The “Total” line in all charts is now drawn with the same red color for better consistency
• PMM-5461: Memory graphs on the node-related dashboards were adjusted to have fixed colors that
are more distinguishable from each other
• PMM-5329: Prometheus in PMM Server was updated to version 2.16.0. This update has brought several
improvements. Among them are significantly reduced memory footprint of the loaded TSDB blocks,
lower memory footprint for the compaction process (caused by the more balanced choice of what to
buffer during compaction), and improved query performance for the queries that only touch the most
recent 2 hours of data.
• PMM-5210: Data Retention is now specified in days instead of seconds on the PMM Settings page.
Please note this is a UI-only change, so the actual data retention precision is not changed
• PMM-5182: The logs.zip archive available on the PMM Settings page now includes additional self-
monitoring information in a separate client subfolder. This subfolder contains information collected
on the PMM Server and is equivalent to the one collected on a node by the pmm-admin summary
command.
• PMM-5112: The Inventory API List requests can now be filtered by the Node/Service/Agent type
• PMM-5178: Query Detail Section of the Query Analytics dashboard didn’t show tables definitions and
indexes for the internal PostgreSQL database
• PMM-5465: MySQL Instance related dashboards had row names not always matching the actual
contents. To fix this, elements were re-ordered and additional rows were added for better matching of
the row name and the corresponding elements
• PMM-5455: Dashboards from the Insight menu were fixed to work correctly when the low resolution is
set on the PMM Settings page
• PMM-5446: A number of the Compare Dashboards were fixed to work correctly when the low resolution
is set on the PMM Settings page
• PMM-5430: MySQL Exporter section on the Prometheus Exporter Status dashboard now collapsed by
default to be consistent with other database-related sections
• PMM-5445, PMM-5439, PMM-5427, PMM-5426, PMM-5419: Labels change (which occurs e.g. when the
metrics resolution is changed on the PMM Settings page) was breaking dashboards
• PMM-5347: Selecting queries on the Query Analytics dashboard was generating errors in the browser
console
• PMM-5305: Some applied filters on the Query Analytics dashboard were not preserved after changing
the time range
• PMM-5267: The Refresh button was not working on the Query Analytics dashboard
• PMM-5003: pmm-admin list and status use different JSON naming for the same data
For PMM install instructions, see Installing PMM Server and Installing PMM Client.
Important PMM 2 is designed to be used as a new installation — please don’t try to upgrade your existing
PMM 1 environment.
• PMM-5064 and PMM-5065: Starting from this release, users will be able to integrate PMM with an
external Alertmanager by specifying the Alertmanager URL and the Alert Rules to be executed inside the
PMM server (This feature is for advanced users only at this point)
• PMM-4954: Query Analytics dashboard now shows units both in the list of queries in a summary table
and in the Details section to ease understanding of the presented data
• PMM-5179: Relations between metrics are now specified in the Query Analytics Details section
• PMM-5115: The CPU frequency and temperature graphs were added to the CPU Utilization dashboard
• PMM-5394: A special treatment for the node-related dashboards was implemented for the situations
when the data resolution change causes new metrics to be generated for existing nodes and services,
to make graphs show continuous lines of the same colors
• PMM-4620: The high CPU usage by the pmm-agent process related to MongoDB Query Analytics was
fixed
• PMM-5377: singlestats showing percentage had sparklines scaled vertically along with the graph
swing, which made it difficult to visually notice the difference between neighboring singlestats .
• PMM-5204: Changing resolution on the PMM settings page was breaking some singlestats on the
Home and MySQL Overview dashboards
• PMM-5251: Vertical scroll bars on the graph elements were not allowed to do a full scroll, making last
rows of the legend unavailable for some graphs
• PMM-5410: The “Available Downtime before SST Required” chart on the PXC/Galera Node Summary
dashboard was not showing data because it was unable to use metrics available with different scraping
intervals
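The external Alertmanager integration (PMM-5064/PMM-5065) is configured through the PMM Server settings API; the hostname, credentials, and Alertmanager URL below are all placeholders:

```shell
# Point PMM Server at an external Alertmanager (advanced users only)
curl -k -u admin:admin -X POST \
  https://pmm-server.example.com/v1/Settings/Change \
  -d '{"alert_manager_url": "http://alertmanager.example.com:9093"}'
```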
• PMM-5321: Optimization of the Query Analytics parser code for PostgreSQL queries reduced memory
consumption by 1-5%, and the parsing time of an individual query by 30 to 40%
• PMM-5184: The pmm-admin summary command has gained a new --skip-server flag which makes it
operate in a local-only mode, creating the summary file without contacting the PMM Server
• PMM-5340: The Scraping Time Drift graph on the Prometheus dashboard was showing wrong values
because the actual metrics resolution wasn’t taken into account
• PMM-5060: Query Analytics Dashboard did not show the row with the last query of the first page, if the
number of queries to display was 11
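The new local-only mode (PMM-5184) can be invoked like this:

```shell
# Create a summary archive locally, without contacting PMM Server
pmm-admin summary --skip-server
```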
PMM Server version 2.2.0 suffered an unauthenticated denial of service vulnerability (CVE-2020-7920). Any
other PMM versions do not carry the same code logic, and are thus unaffected by this issue. Users who
have already deployed PMM Server 2.2.0 are advised to upgrade to version 2.2.1 which resolves
this issue.
• PMM-5229: The new RDS Exporter section added to the Prometheus Exporter Status dashboard shows
singlestats and charts related to the rds_exporter
• PMM-5228 and PMM-5238: The Prometheus dashboard and the Exporters Overview dashboard were
updated to include the rds_exporter metrics in their charts, allowing better understanding of the
impacts of monitoring RDS instances
• PMM-4830: The consistency of the applied filters between the Query Analytics and the Overview
dashboards was implemented, and now filters selected in QAN will continue to be active after the switch
to any of the Overview dashboards available in the Services menu
• PMM-5235: The DB uptime singlestats in node rows on the Home dashboard were changed to show
minimal values instead of average ones to be consistent with the top row
• PMM-5127: The “Search by” bar on the Query Analytics dashboard was renamed to “Filter by” to make
its purpose more clear
• PMM-5131: The Filter panel on the Query Analytics dashboard now shows the total number of available
Labels within the “See all” link, which appears if the Filter panel section shows only top 5 of its Labels
• PMM-5232: The pmm-managed component of PMM Server 2.2.0 was vulnerable to DoS attacks that
could be carried out by anyone who knew the PMM Server IP address (CVE-2020-7920). Versions
other than 2.2.0 are not affected.
• PMM-5226: The handlebars package was updated to version 4.5.3 because of the Prototype Pollution
vulnerability in it (CVE-2019-19919). Please note PMM versions were not affected by this vulnerability,
as handlebars package is used as a build dependency only.
• PMM-5206: Switching to the Settings dashboard was breaking the visual style of some elements on the
Home dashboard
• PMM-5139: The breadcrumb panel, which shows all dashboards visited within one session starting
from the root, was unable to fully show breadcrumbs longer than one line
• PMM-5212: The explanatory text was added to the Download PMM Server Logs button in the Diagnostic
section of the PMM Settings dashboard, and a link to it was added to the Prometheus dashboard which
was the previous place to download logs
• PMM-5215: The unneeded mariadb-libs package was removed from the PMM Server 2.2.0 OVF image,
resulting in both faster updating with the yum update command and avoiding dependency conflict
messages in the update logs
• PMM-5216: PMM Server Upgrade to 2.2.0 was showing Grafana Update Error page with the Refresh
button which had to be clicked to start using the updated version
• PMM-5211: The “Where do I get the security credentials for my Amazon RDS DB instance” link in the
Add AWS RDS MySQL or Aurora MySQL instance dialog was not targeted at the appropriate instruction
• PMM-5217: PMM 2.x OVF image memory size was increased from 1 GB to 4 GB with an additional 1
GB of swap space, because the previous amount was barely enough to run PMM Server and was
insufficient in some cases, such as performing an upgrade
• PMM-5271: LVM logical volumes were wrongly resized on AWS deployment, resulting in “no space left
on device” errors
• PMM-5295: InnoDB Transaction Rollback Rate values on the MySQL InnoDB Details dashboard were
calculated incorrectly
• PMM-5270: PXC/Galera Cluster Summary dashboard was showing empty Cluster drop-down list,
making it impossible to choose the cluster name
• PMM-4769: The wrongly named “Timeout value used for retransmitting” singlestat on the Network
Details dashboard was renamed to “The algorithm used to determine the timeout value” and updated
to show the algorithm name instead of a digital code
• PMM-5260: pmm-agent consumed excessive resources when Query Analytics was used with
PostgreSQL; this was fixed by a number of code optimizations, reducing memory usage by about
four times
• PMM-5261: CPU usage charts on all dashboards which contain them have undergone colors update to
make softIRQ and Steal curves better differentiated
• PMM-5244: High memory consumption in the PMM Server with a large number of agents sending data
simultaneously was fixed by improving bulk data insertion to the ClickHouse database
Percona Monitoring and Management (PMM) is a free and open-source platform for managing and
monitoring MySQL, MongoDB, and PostgreSQL performance. You can run PMM in your own environment for
maximum security and reliability. It provides thorough time-based analysis for MySQL, MongoDB, and
PostgreSQL servers to ensure that your data works as efficiently as possible.
• Alternative installation methods available for PMM 1.x are re-implemented for PMM 2: now PMM Server
can be installed as a virtual appliance, or run using AWS Marketplace
• AWS RDS and remote instances monitoring re-added in this release include AWS RDS MySQL / Aurora
MySQL instances, and remote PostgreSQL, MySQL, MongoDB, and ProxySQL ones
• The new Settings dashboard allows configuring PMM Server via the graphical interface
• PMM-4575: The new PMM Settings dashboard allows users to configure various PMM Server options:
setting metrics resolution and data retention, enabling or disabling send usage data statistics back to
Percona and checking for updates; this dashboard is now the proper place to upload your public key
for the SSH login and to download PMM Server logs for diagnostics
• PMM-4907 and PMM-4767: The user’s AMI Instance ID is now used to setup running PMM Server using
AWS Marketplace as an additional verification on the user, based on the Amazon Marketplace rules
• PMM-4950 and PMM-3094: Alternative AWS partitions are now supported when adding an AWS RDS
MySQL or Aurora MySQL Instance to PMM
• PMM-4976: Home dashboard clean-up: “Systems under monitoring” and “Network IO” singlestats
were refined to be based on the host variable; also avoiding using color as an indicator of state; “All”
row elements were relinked to the “Nodes Overview” dashboard with regards to the selected host.
• PMM-4800: The pmm-admin add mysql command has been modified to make its help text more descriptive:
when you enable tablestats you now get more detail on whether they're enabled for your environment
and where you stand with respect to the auto-disable limit
• PMM-5053: A tooltip was added to the Head Block graph on the Prometheus dashboard
• PMM-5068: Drill-down links were added to the Node Summary dashboard graphs
• PMM-5050: Drill-down links were added to the graphs on all Services Compare dashboards
• PMM-5037: Drill-down links were added to all graphs on the Services Overview dashboards
• PMM-4988: Filtering in Query Analytics has undergone improvements to make group selection more
intuitive: Labels unavailable under the current selection are shown as gray/disabled, and the
percentage values are dynamically recalculated to reflect Labels available within the currently applied
filters
• PMM-4966: All passwords are now substituted with asterisk signs in the exporter logs for security
reasons when not in debug mode
• PMM-3198: Instead of showing all graphs for all services by default, the MySQL Command/Handler
Counters Compare dashboard now shows a predefined set of the ten most informative ones, to reduce
load on PMM Server when it is first opened
• PMM-4978: The “Top MySQL Questions” singlestat on the MySQL Instances Overview dashboard was
changed to show ops instead of percentage
• PMM-4917: The “Systems under monitoring” and “Monitored DB Instances” singlestats on the Home
dashboard now have a sparkline to make recently shut down nodes/instances easier to spot
• PMM-4979: Set decimal precision 2 for all the elements, including charts and singlestats , on all
dashboards
• PMM-4980: Fix “Load Average” singlestat on the Node Summary dashboard to show decimal value
instead of percent
• PMM-4941: Some charts were incorrectly showing empty fragments with high time resolution turned on
• PMM-5022: Fix outdated drill-down links on the Prometheus Exporters Overview and Nodes Overview
dashboards
• PMM-5023: Make the All instances uptime singlestat on the Home dashboard show Min values
instead of Avg
• PMM-5029: Option to upload dashboard snapshot to Percona was disappearing after upgrade to 2.1.x
• PMM-4946: Rename singlestats on the Home dashboard for better clarity: “Systems under monitoring”
to “Nodes under monitoring” and “Monitored DB Instances” to “Monitored DB Services”, and make the
latter also count remote DB instances
• PMM-5015: Fix format of Disk Page Buffers singlestat on the Compare dashboard for PostgreSQL to
have two digits precision for the consistency with other singlestats
• PMM-5014: LVM logical volumes were wrongly sized on a new AWS deployment, resulting in “no space
left on device” errors.
• PMM-4804: Incorrect parameters validation required both service-name and service-id parameters
of the pmm-admin remove command to be presented, while the command itself demanded only one of
them to identify the service.
• PMM-3298: Panic errors were present in the rds_exporter log after adding an RDS instance from the
second AWS account
• PMM-5089: The serialize-javascript package was updated to version 2.1.1 because of the possibility
of regular expressions cross-site scripting vulnerability in it (CVE-2019-16769). Please note PMM
versions were not affected by this vulnerability, as serialize-javascript package is used as a build
dependency only.
• PMM-5149: Disk Space singlestat was unable to show data for RDS instances because of not taking
into account sources with unknown file system type
• PMM-4063: Update QAN filter panel to show only labels available for selection under currently applied
filters
• PMM-815: Latency Detail graph added to the MongoDB Instance Summary dashboard
• PMM-4768: Disable heavy-load collectors automatically when there are too many tables
• PMM-4733: Add more log and configuration files to the downloadable logs.zip archive
• PMM-4616: Rename column in the Query Details section in QAN from Total to Sum
• PMM-4918: Update Grafana plugins to newer versions, including the clickhouse-datasource plugin
• PMM-4935: Wrong instance name displayed on the MySQL Instance Summary dashboard due to the
incorrect string crop
• PMM-4916: Wrong values are shown when changing the time range for the Node Summary Dashboard
in case of remote instances
• PMM-4895 and PMM-4814: The update process reported completion before it was actually done, and
therefore some dashboards and other components might not have been updated
• PMM-4876: PMM Server access credentials are shown by the pmm-admin status command instead of
hiding them for security reasons
• PMM-4875: PostgreSQL error log gets flooded with warnings when pg_stat_statements extension is
not installed in the database used by PMM Server or when PostgreSQL user is unable to connect to it
• PMM-4852: Node name has an incorrect value if the Home dashboard opened after QAN
• PMM-4847: Drill-downs from the Environment Overview dashboard didn't show data for the
preselected host
• PMM-4819: When only one host was monitored, its uptime was shown as a smaller value than the all-
hosts uptime due to inaccurate rounding
• PMM-4816: Set equal thresholds to avoid confusing singlestat color differences on a Home dashboard
• PMM-4718: Labels are not fully displayed in the filter panel of the Query Details section in QAN
• PMM-4545: Long queries are not fully visible in the Query Examples section in QAN
7.26.1 Improvements
• PMM-4444: Return “what’s new” URL with the information extracted from the pmm-update package
change log
• PMM-4749: Navigation from Dashboards to QAN when some Node or Service was selected now applies
filtering by them in QAN
• PMM-4734: A fix was made to the node_name collection formula on the MySQL Replication Summary
dashboard
• PMM-4640: It was not possible to add MongoDB remotely if password contained a # symbol
PMM 2 introduces a number of enhancements and additional feature improvements, including:
• Detailed query analytics and filtering technologies which enable you to identify issues faster than ever
before.
• A better user experience: Service-level dashboards give you immediate access to the data you need.
• Our new API allows you to extend and interact with third-party tools.
More details about new and improved features available within the release can be found in the
corresponding blog post.