Prometheus PromQL for Humans
PromQL Cheatsheet
Basics
Instant Vectors
http_requests_total
This returns every http_requests_total series, but there are 2 issues.
1. There are too many data points to decipher what's going on.
2. http_requests_total only ever goes up, because it's a counter. Counters are
common in Prometheus, but their raw values are not useful to graph.
http_requests_total{job="prometheus", code="200"}
You can match label values against a regex using =~. Regex matches are fully
anchored, so "2.*" matches any value that starts with 2.
http_requests_total{status_code=~"2.*"}
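The negated matchers work the same way. Here's a small sketch of them; the label names (code, status_code, handler) are assumptions about how your metrics are labeled, so swap in whatever your exporter actually exposes.

```promql
# Negated exact match: every series whose code label is not "200"
http_requests_total{job="prometheus", code!="200"}

# Negated regex: everything that is not a 2xx or 3xx status
http_requests_total{status_code!~"2.*|3.*"}

# Regexes are anchored, so pad with .* to match a substring
# (handler is a hypothetical label used only for illustration)
http_requests_total{handler=~".*api.*"}
```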
Range Vectors
http_requests_total[5m]
The [5m] selects the last 5 minutes of samples for each series. Durations can use
the units s, m, h, d, w, and y (seconds, minutes, hours, days, weeks, and years,
respectively).
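For illustration, here are the same selector with a few different windows. The window sizes are arbitrary; in practice you'd pick one that covers at least a few scrape intervals.

```promql
# Last 30 seconds of samples per series
http_requests_total[30s]

# Last hour of samples per series
http_requests_total[1h]

# Last day of samples per series
http_requests_total[1d]
```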
Important Functions
For Range Vectors
You'll notice that we're able to graph the results of all these functions. Only
Instant Vectors can be graphed, so each function takes a Range Vector as a
parameter and returns an Instant Vector.
rate(http_requests_total[5m])
Irate
irate looks only at the 2 most recent samples in the range (here, up to 5 minutes
in the past), rather than averaging over the whole range like rate.
irate(http_requests_total[5m])
It's best to use rate when alerting, because averaging the data over a period of
time produces a smoother graph. Spiky graphs can repeatedly cross thresholds,
causing alert overload, fatigue, and bad times for all.
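As a rough sketch of what that looks like in an alert condition: the threshold of 1 request per second and the status_code label are assumptions for illustration, not recommended values.

```promql
# Smoothed per-second rate over 5 minutes: a steadier signal for a threshold
rate(http_requests_total{status_code=~"5.*"}[5m]) > 1

# irate only uses the last two samples, so it suits zoomed-in graphs
# better than alert conditions
irate(http_requests_total{status_code=~"5.*"}[5m])
```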
increase(http_requests_total[1h])
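increase returns how much the counter grew over the window. It is effectively rate multiplied by the number of seconds in the window, so the two queries in this sketch should give the same result. Because Prometheus extrapolates to the window edges, the value can be fractional even for a counter that only takes integer steps.

```promql
# Growth of the counter over the last hour
increase(http_requests_total[1h])

# The same value expressed with rate: per-second rate times 3600 seconds
rate(http_requests_total[1h]) * 3600
```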
These are a small fraction of the available functions, just the ones we found most
popular. You can find the rest in the Prometheus documentation on query functions.
sum(rate(http_requests_total[5m]))
You can also use min , max , avg , count , and quantile similarly.
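A quick sketch of the other aggregators. Note that quantile takes the quantile as its first argument; the 0.95 here is just an example value.

```promql
# Average per-second request rate across all series
avg(rate(http_requests_total[5m]))

# Number of series that currently have a value
count(rate(http_requests_total[5m]))

# 95th percentile of the per-series request rates
quantile(0.95, rate(http_requests_total[5m]))
```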
This query tells you the total rate of HTTP requests across all series, but isn't
directly useful in deciphering issues in your system. I'll show you some ways to
break it down that give more insight into your system.
Aggregating by a label, for example the status code, lets you see the difference
between each status code. You can also use without rather than by to keep every
label except those passed as parameters to without.
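Here is a minimal sketch of both forms. It assumes the metric carries status_code and instance labels, which may be named differently in your setup.

```promql
# One result series per status code
sum by (status_code) (rate(http_requests_total[5m]))

# Drop only the instance label and keep every other label
sum without (instance) (rate(http_requests_total[5m]))
```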
Offset
You can use offset to change the time for Instant and Range Vectors. This can
be helpful for comparing current usage to past usage when determining the
conditions of an alert.
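A short sketch of offset in use. The one-week offset is an arbitrary choice for illustration. Note that offset goes directly after the selector, inside the function call.

```promql
# Value of the counter one week ago
http_requests_total offset 1w

# Per-second rate over a 5-minute window, one week ago
rate(http_requests_total[5m] offset 1w)

# Current rate relative to the same time last week (1 means unchanged)
rate(http_requests_total[5m]) / rate(http_requests_total[5m] offset 1w)
```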
Operators
Operators can be used between scalars, vectors, or a mix of the two. Operations
between vectors expect to find matching elements for each side (also known as
one-to-one matching), unless otherwise specified.
There are Arithmetic (+, -, *, /, %, ^), Comparison (==, !=, >, <, >=, <=) and Logical
(and, or, unless) operators.
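To make that concrete, here is one example of each kind of operator. The thresholds are made-up numbers.

```promql
# Arithmetic with a scalar: requests per minute instead of per second
rate(http_requests_total[5m]) * 60

# Comparison: keep only series above 10 requests per second
rate(http_requests_total[5m]) > 10

# Logical: series that are above 10 over 5 minutes and also above 5 over the last hour
rate(http_requests_total[5m]) > 10 and rate(http_requests_total[1h]) > 5
```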
Vector Matching
One-to-One
You can use on to match using only certain labels, or ignoring to match on all
labels except the ones listed.
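Here's a sketch of each, computing the share of requests that returned a 500. It assumes job, instance, and status_code labels exist on the metric. Each side has at most one series per job and instance, so the match is one-to-one.

```promql
# Match only on job and instance; the extra status_code label on the left is not compared
sum by (job, instance, status_code) (rate(http_requests_total{status_code="500"}[5m]))
  / on(job, instance)
sum by (job, instance) (rate(http_requests_total[5m]))

# The same match, written by naming the one label to leave out
sum by (job, instance, status_code) (rate(http_requests_total{status_code="500"}[5m]))
  / ignoring(status_code)
sum by (job, instance) (rate(http_requests_total[5m]))
```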
Many-to-One
It's possible to use comparison and arithmetic operations where an element on one
side can be matched with many elements on the other side. You must explicitly tell
Prometheus what to do with the extra dimensions.
You can use group_left if the left side has a higher cardinality, else use
group_right.
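A sketch of many-to-one matching, again assuming job, instance, and status_code labels. The left side has one series per status code, the right side has one total per job and instance, so group_left tells Prometheus that many left-hand series may share one right-hand series.

```promql
# Share of traffic for each status code, per job and instance
sum by (job, instance, status_code) (rate(http_requests_total[5m]))
  / ignoring(status_code) group_left
sum by (job, instance) (rate(http_requests_total[5m]))
```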
Examples
Disclaimer: We've hidden some of the information in the pictures using the Legend
Format for privacy reasons.
Disk Space
node_filesystem_avail{fstype!~"tmpfs|fuse.lxcfs|squashfs"} / node_filesystem_size{fstype!~"tmpfs|fuse.lxcfs|squashfs"}
Fraction of disk space still available, by instance. We take the available space,
ignoring filesystems that have tmpfs, fuse.lxcfs, or squashfs as their fstype,
and divide it by their total size.
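If you'd rather graph the percentage used, one way to sketch it is to subtract the available fraction from 1 and scale by 100. This keeps the metric names used above; newer node_exporter releases (0.16 and later) add a _bytes suffix, so you may need node_filesystem_avail_bytes and node_filesystem_size_bytes instead.

```promql
# Percentage of disk space used per filesystem
100 * (1 - node_filesystem_avail{fstype!~"tmpfs|fuse.lxcfs|squashfs"} / node_filesystem_size{fstype!~"tmpfs|fuse.lxcfs|squashfs"})
```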
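sum(rate(http_requests_total{status_code=~"5.*"}[5m])) / sum(rate(http_requests_total[5m]))
Fraction of HTTP requests that returned a 5xx status code. Both sides are summed
so the division is not matched series by series, which would divide each 5xx
series by itself and always return 1.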
3 Pillars Of Observability
It's important to understand where metrics fit in when it comes to observing your
application. I recommend you take a look at the 3 pillars of observability principle.
Metrics are an important part of your observability stack, but logs and tracing are
equally so.