Prometheus Concepts

Uploaded by

saiakkina

Prometheus is used for automated monitoring and alerting

1. memory usage

One server's memory usage has been above 70% for more than an hour and keeps
increasing.

Notify the admin to resolve the issue.

2. logs unavailable

Elasticsearch stops accepting new logs because the disk space or storage limit
allocated to it has completely run out.

A threshold of about 50% is a reasonable point for notifications, considering an
organization's bureaucratic approval processes.

3. App slows down

One of the services breaks down and starts sending error messages continuously,
consuming all the network bandwidth and slowing down the other services.

Continuous monitoring and timely alerts could fix the issue before it gets out of
hand.

Components of Prometheus

Prometheus Server
    1. Time Series Database (Storage) - stores metrics data like CPU,
       memory, disk space utilization, number of requests, exceptions, etc.
    2. Data Retrieval Worker (Retrieval) - pulls the metrics from
       applications, services, servers and other target resources and stores them
       in the Storage database
    3. HTTP Web Server - accepts PromQL queries for the stored data through the
       Server API and displays the results on the Prometheus Web UI / Dashboard
       or other data visualization tools like Grafana

Prometheus can monitor a wide range of items like Windows / Linux servers, an
Apache server, a single application, services like a database, etc.
These are called Prometheus Targets.

Each target has units that can be monitored.

For Servers
CPU Utilization
Memory Usage
Disk Space Consumption

For Applications and Services
Requests Count
Exceptions Count
Request Processing Duration

The unit that we want to monitor for a target is called a metric, and these
metrics are stored in the Prometheus Storage database.

Targets expose these metrics in a human-readable text format.


Each metric entry has 2 attributes
    - TYPE - there are mainly 3 types
        - counter: how many times did something happen? eg. number of exceptions,
          number of requests
        - gauge: what is the current value of something? eg. CPU utilization,
          memory consumption, disk space utilization, etc.
        - histogram: how long did something take, or how big was it? eg. request
          processing duration
    - HELP - description of what the metric is about
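
As a rough illustration of the TYPE and HELP attributes, here is a minimal Python sketch that renders one metric in the text format. The function name render_metric and the sample values are made up for this example, and label value escaping is left out to keep it short.

```python
def render_metric(name, metric_type, help_text, samples):
    """Render one metric in the Prometheus text format.

    samples is a list of (labels_dict, value) pairs.
    Simplification: label values are not escaped.
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample values for a counter with a "status" label.
text = render_metric(
    "http_requests_total",
    "counter",
    "Total number of HTTP requests.",
    [({"status": "200"}, 1027), ({"status": "500"}, 3)],
)
print(text)
```

The printed text starts with the HELP line and the TYPE line, followed by one line per labelled sample.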

Metrics are collected from targets over HTTP endpoints.

Linux Server - hostname/metrics

And for that to work,


1. the target should expose the /metrics endpoint and
2. the data must be in the correct format that Prometheus understands.

Some targets / servers expose the metrics endpoint by default.

Other targets, which do not expose the endpoint by default, need another component
called an Exporter.

An exporter is a script or service that does the following

1. fetches the metrics from the target
2. converts them to the correct format
3. exposes them at its own /metrics endpoint so that the Data Retrieval Worker can
scrape them
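
The three steps above can be sketched with only the Python standard library. This is a toy example and not how the real exporters are written: the metric name demo_disk_used_bytes, the port 8000 and the function names are assumptions made up for illustration.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics(path="."):
    """Steps 1 and 2: fetch a value and convert it to the text format."""
    usage = shutil.disk_usage(path)  # 1. fetch the raw number
    return (                         # 2. convert to the text format
        "# HELP demo_disk_used_bytes Used bytes on the filesystem holding the path.\n"
        "# TYPE demo_disk_used_bytes gauge\n"
        f"demo_disk_used_bytes {usage.used}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Step 3: expose the converted text at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving /metrics on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint. A scrape job pointing at localhost:8000 would then be able to read the gauge on every scrape.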

There are various exporters available for MySQL, Elasticsearch, Linux Servers,
Build Tools, Cloud Platforms and so on

- For Linux server: download the node exporter tar file from the Prometheus
repository

Exporters are also available as docker images

- For MySQL: an exporter image is available that can run as a sidecar

To monitor our own application metrics, we can use the client libraries provided by
Prometheus to expose a /metrics endpoint for scraping.

In push-based monitoring systems, eg. AWS CloudWatch and New Relic, applications
push their metrics to the monitoring system, and this can flood the network with
traffic.

Letting the monitoring tool pull the metrics is often the better approach, and
Prometheus has this advantage.

Short-lived services can push their metrics to the Pushgateway, from where
Prometheus pulls those metrics.
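
A short-lived job could push its metrics roughly as sketched below. The gateway address localhost:9091, the job name nightly_backup and the metric name are assumptions for illustration. The sketch only builds the HTTP request, so it can be read and run without a Pushgateway.

```python
import urllib.request


def build_push_request(gateway, job, body):
    """Build (but do not send) a request that pushes metrics for one job."""
    url = f"http://{gateway}/metrics/job/{job}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this job
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical metric reported by a batch job when it finishes.
payload = (
    "# TYPE demo_batch_last_success_timestamp_seconds gauge\n"
    "demo_batch_last_success_timestamp_seconds 1700000000\n"
)
req = build_push_request("localhost:9091", "nightly_backup", payload)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would actually send it to a running Pushgateway.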

Prometheus is configured using a YAML file, by default named prometheus.yml.

It configures
1. what to scrape (which targets) and
2. when to scrape (at what interval).

Prometheus can use service discovery to find those target endpoints, or they can
be listed statically in the config file.


global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  # - "first.rules"
  # - "second.rules"

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']

scrape_interval defines how often Prometheus scrapes its targets
rule_files lists the files containing the rules, eg. the thresholds at which
alerts are created, like 50% disk space utilization
evaluation_interval defines how often the above rules are evaluated
scrape_configs defines what resources Prometheus monitors

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: node_exporter
    scrape_interval: 1m
    scrape_timeout: 1m
    static_configs:
      - targets: ['localhost:9100']

The job named node_exporter scrapes the metrics from localhost:9100 and overrides
the global scrape_interval with 1m.

Below are defaults for all jobs and we do not need to provide them, unless we want
to change them.

metrics_path: "/metrics"
scheme: "http"
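
A tiny Python sketch of how these two defaults combine with a target to form the address that gets scraped. The helper name scrape_url is made up for this example.

```python
def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Combine a target with the per-job scheme and metrics_path settings."""
    return f"{scheme}://{target}{metrics_path}"


# With the defaults, the node_exporter target resolves to this address.
print(scrape_url("localhost:9100"))  # http://localhost:9100/metrics
```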

Prometheus reads the rules defined in the rule files, and when a condition is met
it pushes the respective alert to another component called Alertmanager, which can
notify users or other systems via email, Slack, etc.
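
To make "when a condition is met" concrete, here is a simplified Python sketch of the memory usage scenario from the start of these notes: usage above 70% for an hour. It is not how Prometheus evaluates rules internally. The function name alert_state and the sample data are assumptions, while the state names inactive, pending and firing are the ones Prometheus uses for alerts.

```python
def alert_state(samples, threshold, hold_seconds):
    """Return 'inactive', 'pending' or 'firing' for the latest sample.

    samples is a list of (timestamp_seconds, value) pairs, oldest first,
    one per rule evaluation.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # condition just became true
        else:
            breach_start = None    # condition cleared, start over
    if breach_start is None:
        return "inactive"
    last_ts = samples[-1][0]
    return "firing" if last_ts - breach_start >= hold_seconds else "pending"


# Hypothetical memory usage in percent, sampled every 15 minutes.
memory = [(0, 65), (900, 72), (1800, 75), (2700, 78), (3600, 80), (4500, 83)]
print(alert_state(memory[:1], 70, 3600))  # inactive
print(alert_state(memory[:5], 70, 3600))  # pending
print(alert_state(memory, 70, 3600))      # firing
```

Only an alert that reaches the firing state would be handed over to Alertmanager.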

Prometheus stores the metrics data on disk in a time series database. The storage
can be local, and it can also integrate with remote storage systems.

It uses a custom time series format, so the data cannot be written directly to a
relational database.

We can query Prometheus with PromQL through the Server API, eg. from the
Prometheus Web UI.

More powerful data visualization tools like Grafana can also query Prometheus
using PromQL.
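
As a sketch of what querying through the Server API looks like, the Python below builds the URL for an instant query against the /api/v1/query endpoint. The address http://localhost:9090 and the helper name are assumptions, and nothing is sent over the network here.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


# The PromQL expression is URL-encoded into the "query" parameter.
url = instant_query_url("http://localhost:9090", 'http_requests_total{status!~"4.."}')
print(url)
```

Fetching this URL from a running server returns a JSON document with the query result.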

PromQL examples

http_requests_total{status!~"4.."} - Selects the http_requests_total series for all
HTTP status codes except 4xx ones
rate(http_requests_total[5m])[30m:] - Returns the 5 minute rate of the
http_requests_total metric for the past 30 minutes
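
The two ideas in these examples, a negative regex matcher and a rate, can be imitated in plain Python to see what they compute. This is only a sketch under simplifying assumptions: the real rate() also handles counter resets and extrapolation, and PromQL uses RE2 regular expressions instead of Python's re module. The function names and sample data are made up.

```python
import re


def select(series, label, pattern, negate=False):
    """Filter series by a regex label matcher (=~, or !~ when negate=True).

    The pattern must match the whole label value, and a missing label is
    treated as an empty string.
    """
    regex = re.compile(pattern)
    kept = []
    for labels, value in series:
        matched = regex.fullmatch(labels.get(label, "")) is not None
        if matched != negate:
            kept.append((labels, value))
    return kept


def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value)."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical current values of http_requests_total per status code.
series = [({"status": "200"}, 1027), ({"status": "404"}, 12), ({"status": "500"}, 3)]
print(select(series, "status", "4..", negate=True))  # keeps 200 and 500

# Hypothetical counter samples over a 5 minute (300 second) window.
print(simple_rate([(0, 100), (150, 160), (300, 250)]))  # 0.5
```
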

The tasks below are fairly complex, and the available documentation can feel
limited

- Learning PromQL
- Configuring Prometheus YAML Files
- Creating Grafana Dashboards

Prometheus is designed to be
- reliable
- standalone and self-contained
- able to work well even if other parts of the infrastructure are broken
- usable without extensive setup
- simple to run, since a single node is less complex

Limitations of Prometheus
- difficult to scale
- there is a limit on the number of metrics a single server can monitor, and if
required we need one of the workarounds below
    - either increase the Prometheus server capacity
    - or limit the number of metrics to monitor

Prometheus is fully compatible with Docker and Kubernetes

Prometheus components are available as docker images which helps to deploy them in
Kubernetes or other containerized clusters

It integrates very well with Kubernetes infrastructure. Once it is deployed on
Kubernetes with Kubernetes service discovery configured, which common deployment
packages set up, it can gather metrics data from each node without the targets
being listed by hand.
