Intro To Prometheus Workshop - Grafana
https://www.linkedin.com/in/willie-engelbrecht/
Aengus Rooney, Principal Solutions Engineer
- Enjoys cycling and swimming
- Can load a Pez dispenser in one go
@aengusrooney
https://www.linkedin.com/in/aengusrooney/
Nabeel Saad, Principal Solutions Engineer
- Sci-Fi and gaming aficionado
- Enjoys ultimate frisbee, trapezing, and walks on the beach
@saadnabs
www.linkedin.com/in/nabeelsaad
Emil A. Siemes, Principal Solutions Engineer
- Running
- Baking sourdough bread
https://www.linkedin.com/in/emil-andreas-siemes-a793926/
Prometheus Introduction
Hands-on breakout
Prometheus UI & Queries (20 min)
Alerting and HA
Meet your team
Introduce yourself to your breakout group!
● Conducive to mathematical functions, capacity planning, predictions, and alerting
Prometheus Data Model
Timeseries
tns_response_message_bytes_count
Metric name
Prometheus Data Model
tns_response_message_bytes_count{job="tns-app", status_code="200"}
tns_response_message_bytes_count{job="tns-app", status_code="404"}
tns_response_message_bytes_count{job="tns-app", status_code="500"}
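The three lines above share one metric name but are three separate time series, because each distinct label set is its own series. A minimal Python sketch of that idea follows. The `parse_series` helper is hypothetical (not part of any Prometheus library) and ignores escaped quotes inside label values.

```python
import re


def parse_series(text: str) -> tuple[str, dict[str, str]]:
    """Split 'name{k="v", ...}' into the metric name and its label dict.

    Hypothetical helper for illustration; it does not handle escaped quotes.
    """
    match = re.fullmatch(r'\s*([a-zA-Z_:][a-zA-Z0-9_:]*)\s*(?:\{(.*)\})?\s*', text)
    if match is None:
        raise ValueError(f"not a series identifier: {text!r}")
    name, body = match.group(1), match.group(2) or ""
    labels = dict(re.findall(r'([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"([^"]*)"', body))
    return name, labels


series = [
    'tns_response_message_bytes_count{job="tns-app", status_code="200"}',
    'tns_response_message_bytes_count{job="tns-app", status_code="404"}',
    'tns_response_message_bytes_count{job="tns-app", status_code="500"}',
]

# A series identity is the metric name plus the full set of label pairs.
identities = set()
for text in series:
    name, labels = parse_series(text)
    identities.add((name, frozenset(labels.items())))

print(len(identities), "distinct series for one metric name")
```

Changing any label value, here `status_code`, yields a new series even though the metric name stays the same.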
Targets
● web app (clientlib)
● API server (clientlib)
● Linux VM (exporter)
● mysql (exporter)
● Windows VM (exporter)
Prometheus Architecture
[Architecture diagram, built up over three slides]
● Instrumentation & Exposition: targets expose metrics through a client library (web app, API server) or an exporter (Linux VM, mysqld, cgroups)
● Collection, Storage & Processing: Prometheus with its TSDB; Service Discovery (DNS, Kubernetes, AWS, Consul, custom...) supplies the targets
● Querying, Dashboards: Grafana and the Prometheus Web UI
Poll Q
Installing Prometheus
prometheus.io
Get and Untar
Start Node Exporter
prometheus.yml
Prometheus UI Breakout
See PDF in chat window
Poll Q
PromQL Data Types
● Scalars
● Instant vectors
● Range vectors
PromQL Data Types
● Instant vectors:
prometheus_http_requests_total
● Range vectors
prometheus_http_requests_total{code="200"}[5m]
Prometheus Metric Types
● Counters
● Gauges
● Histograms
● Summary
Counters
● rate()
● Range vectors
prometheus_http_requests_total{code="200"}[5m]
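To make the link between counters, range vectors, and rate() concrete, here is a simplified Python sketch of a per-second rate computed over the samples in a window. It is an approximation under stated assumptions: the real rate() also extrapolates to the window boundaries, which this sketch skips. The function name `simple_rate` and the sample data are hypothetical.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase of a counter over (timestamp_seconds, value) samples.

    Simplified: handles counter resets, but omits the boundary
    extrapolation that Prometheus applies in rate().
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole
        # new value counts as increase since the reset.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical window with a 15s scrape interval and a reset before t=45.
window = [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0), (45.0, 30.0), (60.0, 60.0)]
print(simple_rate(window))
```

The reset handling is why a counter should be wrapped in rate() before it is graphed or aggregated: the raw value drops whenever the process restarts, while the rate stays meaningful.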
Gauges
Histogram
# TYPE prometheus_http_request_duration_seconds histogram
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.1"} 100
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.2"} 200
prometheus_http_request_duration_seconds_bucket{handler="/",le="0.4"} 300
prometheus_http_request_duration_seconds_bucket{handler="/",le="1"} 400
prometheus_http_request_duration_seconds_bucket{handler="/",le="+Inf"} 1000
prometheus_http_request_duration_seconds_sum{handler="/"} 1847.000596540000001
prometheus_http_request_duration_seconds_count{handler="/"} 1000
histogram_quantile(0.95, rate(prometheus_http_request_duration_seconds_bucket[1m]))
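The bucket lines above are cumulative: each `le` bucket counts every observation at or below its upper bound. The Python sketch below shows how a quantile can be estimated from such buckets by linear interpolation, which is the general approach histogram_quantile() takes. It is applied here to the raw counts from the slide rather than to rate() output, and the function name `bucket_quantile` is hypothetical.

```python
import math


def bucket_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q (0 < q <= 1) from cumulative (upper_bound, count) buckets.

    Assumes a non-empty histogram that includes a +Inf bucket.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket counts every observation
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The rank falls in the open-ended bucket, so the best available
        # answer is the highest finite upper bound.
        return buckets[-2][0]
    # The lowest bucket is assumed to start at 0.
    lower, below = buckets[i - 1] if i > 0 else (0.0, 0.0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Cumulative counts copied from the exposition sample above.
slide_buckets = [(0.1, 100.0), (0.2, 200.0), (0.4, 300.0), (1.0, 400.0), (math.inf, 1000.0)]

print(bucket_quantile(0.25, slide_buckets))  # interpolated inside the 0.2 to 0.4 bucket
print(bucket_quantile(0.95, slide_buckets))  # capped at the highest finite bound
```

With these counts, 600 of the 1000 observations are slower than the largest finite bucket (1 second), so any quantile above 0.4 is reported as 1. Bucket boundaries need to cover the latencies you care about.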
Summary
# HELP prometheus_rule_evaluation_duration_seconds The duration for a rule to execute.
# TYPE prometheus_rule_evaluation_duration_seconds summary
prometheus_rule_evaluation_duration_seconds{quantile="0.5"} 6.4853e-05
prometheus_rule_evaluation_duration_seconds{quantile="0.9"} 0.00010102
prometheus_rule_evaluation_duration_seconds{quantile="0.99"} 0.000177367
prometheus_rule_evaluation_duration_seconds_sum 1.623860968846092e+06
prometheus_rule_evaluation_duration_seconds_count 1.112293682e+09
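Besides the precomputed quantiles, a summary exposes `_sum` and `_count`, and dividing one by the other gives the average observation. A small Python sketch using the values from the sample above (the variable names are illustrative):

```python
# Values copied from the exposition sample above.
duration_sum = 1.623860968846092e+06   # total seconds spent evaluating rules
duration_count = 1.112293682e+09       # number of rule evaluations observed

# Average duration over the lifetime of the process.
mean_seconds = duration_sum / duration_count
print(f"mean rule evaluation time: {mean_seconds * 1000:.2f} ms")
```

This works out to roughly 1.46 ms per evaluation. In PromQL, the same division is usually done on the rate() of both series so that the average covers a recent window rather than the whole process lifetime.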
PromQL: Binary Operators
Arithmetic
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
% (modulo)
^ (exponentiation)
Operators
PromQL: Binary Operators
● Arithmetic
● Comparison
● Binary logic
The Four Golden Signals
USE and RED Methods
Utilization (U): The proportion of the resource that is used
Saturation (S): A measure of how “full” a service is, often measured by latency.
Errors (E): The count of error events or rate of failed requests.
Alertmanager
● Separate component that sits alongside Prometheus
● Handles alerts received from Prometheus (built-in alerting)
What Does Alertmanager Do?
● 📨 Routes them
○ Determines who should receive an alert
○ Sends them along to a notification channel
■ E.g. email, Slack, PagerDuty, etc.
■ Webhooks
What Does Alertmanager Do?
● 📥 Deduplication
Example rule
- alert: KubernetesPodCrashLooping
  expr: increase(kube_pod_container_status_restarts_total[1m]) > 3
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: Kubernetes pod crash looping (instance {{ $labels.instance }})
    description: "Pod {{ $labels.pod }} is crash looping\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
https://awesome-prometheus-alerts.grep.to/rules#kubernetes
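The `for: 2m` clause in the example rule means the expression must stay true for two minutes before the alert fires; until then the alert is pending. The Python sketch below models that state machine. It assumes a 60-second rule evaluation interval, which is not stated on the slide, and the function name `alert_states` is hypothetical.

```python
def alert_states(condition: list[bool], interval_s: int, for_s: int) -> list[str]:
    """Return the alert state after each rule evaluation.

    condition[i] is whether the rule expression returned a result at
    evaluation i; evaluations are assumed to be interval_s seconds apart.
    """
    states = []
    active_since = None  # time at which the condition first became true
    for i, is_true in enumerate(condition):
        now = i * interval_s
        if not is_true:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held_for = now - active_since
        states.append("firing" if held_for >= for_s else "pending")
    return states


# Assumed: evaluations every 60s, `for: 2m` as in the example rule.
evaluations = [False, True, True, True, False, True]
print(alert_states(evaluations, interval_s=60, for_s=120))
```

A single false evaluation drops the alert back to inactive, so a longer `for` value trades slower notification for fewer alerts on brief spikes.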
Scaling
Prometheus HA
Federation or…
Global Federation
Mimir
Grafana Cloud
Wrap Up
AMA
Thank You