Documentation - Prometheus
Table of Contents
Introduction
Installations
Setting up Docker and Docker Compose with Prometheus
Setting up the Prometheus Data Generator
Visualizing in Grafana
References
I. Introduction
Prometheus is a monitoring tool that scrapes and processes metric data from sources ranging from a single local machine to hundreds or thousands of machines, such as the nodes of a Kubernetes cluster. Prometheus also supports complex querying and processing of the data it scrapes.
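For instance, Prometheus's query language, PromQL, can aggregate and transform scraped series. The query below is a generic illustration (the metric and label names are standard textbook examples, not from this project):

```promql
# Per-instance average rate of HTTP requests over the last 5 minutes
avg by (instance) (rate(http_requests_total[5m]))
```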
For the purposes of this project, we created a custom Prometheus setup through Docker that generates both backfilled and live, randomly generated metric data. This data is shaped by values set in a configuration file.
Docker is a program that allows for local or cloud-based containerization. It is a powerful platform that lets images of tools like Grafana and Prometheus be deployed with very flexible automation. One such automation tool is Docker Compose, which can bring up an entire environment, with custom volumes, commands, and configuration values, across multiple images operating in harmony.
After creating our custom Docker Compose configuration and our Prometheus Data Generator image, we created a custom Grafana dashboard that Docker Compose provisions into Grafana. Grafana is a metrics and data visualization tool that supports a myriad of data sources, visual outputs, and graphs. The steps below outline how preset backfill data and live data flow into Prometheus and then into Grafana as a preconfigured data source. A separate file detailing how to create and customize a Grafana dashboard like this one can be found here.
II. Installations
1. Install Docker Desktop from here.
2. Install Git from here if it is not already locally installed.
3. From the home user folder in the command line, run this command:
4. The remainder of the tools and platforms, including Grafana, Prometheus, and the Prometheus Data
Generator, are automatically pulled as Docker images through Docker Compose.
III. Setting up Docker and Docker Compose with Prometheus
volumes:
  grafana-data: {}
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.time=1y
      - --storage.tsdb.allow-overlapping-blocks
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - 3000:3000
    volumes:
      - ./provisioning/dashboards:/etc/grafana/provisioning/dashboards
      - ./provisioning/datasources:/etc/grafana/provisioning/datasources
      - grafana-data:/var/lib/grafana
      - ./FSMD.json:/var/lib/grafana/dashboards/FSMD.json
    restart: always
  prometheus-data-generator:
    image: joshfireforever/prometheus-data-generator:1.0
    container_name: prometheus-data-generator
    ports:
      - 9000:9000
    volumes:
      - ./config.yml:/home/appuser/config.yml
    privileged: true
● The volumes configuration at the top of the file creates persistent storage that containers can use.
The "grafana-data" volume here is an empty volume that persistently stores Grafana's state, so the
dashboard can be iterated on continuously across container restarts.
● All programs launched by Docker Compose are listed under services, with the first line of each
entry giving the name of the service.
● The image name for each service determines which Docker image will be used, and the container_name
sets the name of the container that represents an actual instance of the image running in Docker.
● Any ports that are used by the container will be delineated under the ports section.
● The command section passes the listed arguments to the container's process when it starts. Arguments
preceded by two dashes set configuration values in the container.
○ In this case, the commands instruct the Prometheus container to use the prometheus.yml
YAML config file mounted into the container. We also direct the time-series database
(storage.tsdb) within Prometheus to retain data for up to one year, and to allow
overlapping blocks. This makes importing and working with backfilled data in Prometheus
possible.
● Each container also has its own local volume section:
○ The Prometheus volume mounts a local copy of the prometheus.yml config file into the
container (read-only), and the --config.file command directs Prometheus to use that file
instead of the image's default config file.
○ The Grafana-specific volumes make copies of the local dashboard and data source files
available to Grafana, so the tool comes preconfigured when Docker Compose launches. This
eliminates the need to manually configure Grafana to visualize data coming from Prometheus.
○ The Prometheus Data Generator local volume makes a copy of the local config file that this
program uses available to the container. This means that local changes to the duplicate config file
in the ~/FSMD folder are copied over and overwrite the default config.yml file stored within the
program files and within the Docker image.
● Other values set on these services include "restart" and "privileged". Setting these to "always"
and "true", respectively, causes the container to restart automatically and grants it extended
privileges without explicit approval from the user on each run.
Once Docker Engine is up, run this command from the FSMD folder in the command line:
docker-compose up
● It’s recommended that you be logged into a Docker account in the Docker desktop application.
● The FSMD folder must be in the home directory.
● Whenever stopping this Docker Compose project, if Live mode is set to True in the config file,
press Ctrl+C multiple times in the command-line window you started docker-compose from, to force
all threads to stop. This is because the data generator threads sleep on an interval while waiting
for the next simulated forecast. If the project gets stuck stopping because of these threads, see
the instructions to force Docker containers to quit here.
IV. Setting up the Prometheus Data Generator
In addition, an entirely new file, backfill.py, was written and added to this program; it generates random data according to new config.yml parameters. This data is generated in bulk in OpenMetrics format and written to a text file that is ready for ingestion by Prometheus.
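The internals of backfill.py are not reproduced here, but as a rough, hypothetical sketch of the idea, generating clamped random gauge samples in OpenMetrics text format might look like this (the function and metric names are illustrative only):

```python
import random
import time

def generate_backfill(metric, median, std_dev, minimum, maximum,
                      range_hours, interval_hours):
    """Emit OpenMetrics text for one gauge metric, one sample per interval."""
    lines = [
        f"# HELP {metric} Randomly generated backfill data.",
        f"# TYPE {metric} gauge",
    ]
    now = int(time.time())
    start = now - range_hours * 3600
    for ts in range(start, now, interval_hours * 3600):
        # Random value around the median, clamped to [minimum, maximum].
        value = max(minimum, min(maximum, random.gauss(median, std_dev)))
        lines.append(f"{metric} {value} {ts}")
    lines.append("# EOF")  # an OpenMetrics file must end with "# EOF"
    return "\n".join(lines)

# Example: one day of samples, every 6 hours, for a hypothetical metric.
print(generate_backfill("simulated_forecast", median=50, std_dev=10,
                        minimum=0, maximum=100,
                        range_hours=24, interval_hours=6))
```

Each sample line carries an explicit timestamp, which is what lets promtool later turn the file into historical TSDB blocks.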
The configuration file at FSMD/config.yml is set by Docker Compose to overwrite the duplicated default config
file that is present in the prometheus-data-generator Docker image. This controls the generation of both
backfilled and live metrics.
Overall Parameters
● In config.yml, each “- name:” field defines a metric, along with its description and metric type. The
metric type is always set to gauge due to how the generator's source code is designed.
● Each metric can have multiple sequences that generate data with different parameters. However,
we use a single sequence per metric for simplicity.
● bad_data_rate - the rate, as a decimal (0.0-1.0), at which bad data (within 2σ) is generated for a
regular metric.
● median, standard_deviation, minimum, maximum - used in a formula to generate both backfilled and
live metric data. The min and max control the range that the randomly generated values can fall in.
● You can add in new metrics, but keep in mind that you will need to manually add a new panel to the
Grafana dashboard for each.
● After making any changes to config.yml, run these commands from the FSMD folder to apply them
to live data:
● live_mode - can be set to True or False in the config file. If set to False, the program and the
container quit after generating backfill data. You can restart the container from Docker to instantly
create new backfill data, if backfill_mode is enabled.
● eval_time - how long each sequence runs. We set this to one second for now, as we intend a
single value per forecast.
● interval - number of seconds between each live data generation (a.k.a. a new forecast).
○ By default this is set to 5 seconds to prevent the threads from sleeping forever, and allow for
easier testing in Grafana and stopping of the containers/project in Docker. This causes new live
metrics to be generated every 5 seconds, which is much less than the specified 6 hours but is
much more interesting for testing and Grafana design purposes.
○ To represent the actual 6-hour interval between forecasts, change this interval value to 21600
(seconds), or set it to whatever suits your testing; this is entirely up to preference for now, but
note that the threads will sleep for exactly the time set. If the project gets stuck stopping
because of these threads, see the instructions to force Docker containers to quit here.
● backfill_range_hours - controls how many hours back each backfilled data metric goes.
● backfill_interval_hours - controls the amount of time between each backfilled metric.
● backfill_starting_hour - controls which hour of the day each backfilled metric initiates on.
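Pulling these parameters together, a hypothetical config.yml entry might look roughly like the sketch below. The field names come from the descriptions above, but the exact nesting in the modified generator's real schema may differ, and the metric name and numbers are made up:

```yaml
live_mode: True
backfill_mode: True
backfill_range_hours: 168     # one week of backfilled data
backfill_interval_hours: 6    # one backfilled sample every 6 hours
backfill_starting_hour: 0
config:
  - name: simulated_forecast
    description: Randomly generated forecast value
    type: gauge               # always gauge (see above)
    sequence:                 # a single sequence per metric, for simplicity
      - eval_time: 1          # each sequence runs for one second
        interval: 5           # seconds between live samples
        median: 50
        standard_deviation: 10
        minimum: 0
        maximum: 100
        bad_data_rate: 0.05
```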
The prometheus-data-generator container creates backfilled data at runtime automatically. If you want to
recreate the backfill.txt file, such as after changing config.yml, run this command:
In either case, if you want to insert this backfilled data into Prometheus, follow the steps below:
macOS Commands
Copy and paste this whole block to run all of the commands in the command line, in a different window
than the one you ran docker-compose from:
docker cp prometheus-data-generator:/home/appuser/backfill.txt
~/FSMD/backfill.txt && docker cp ~/FSMD/backfill.txt
prometheus:/prometheus/backfill.txt && docker exec -it prometheus promtool tsdb
create-blocks-from openmetrics /prometheus/backfill.txt && docker restart
prometheus
Windows Commands
● Run these commands one at a time in the command line, in a different window than the one you ran
docker-compose from:
You should now be able to see backfilled metric data and newly generated live data by searching for them by name at the local Prometheus URL (http://localhost:9090, the port mapped in the Docker Compose file).
V. Visualizing in Grafana
As mentioned above, the Grafana dashboard data, the dashboard itself, and the data source configuration file are linked to the Grafana container through the Docker Compose config file. Thus, an initially provisioned Grafana dashboard should already be loaded at the local Grafana URL (http://localhost:3000). You should immediately see backfill data from Prometheus coming in over the intervals set in config.yml, as well as new live data coming in from prometheus-data-generator.
If you make a change to this dashboard, the change cannot be saved to the provisioned dashboard, as Grafana forces it to be read-only. However, you can save a copy. If you wish, you can export that copy and replace FSMD.json in the FSMD root folder, which is where the initial dashboard is provisioned from when you recreate the Grafana container through Docker Compose.
A separate file detailing how to create and customize a Grafana Dashboard like this can be found here.
VI. References
TypeOfNaN tutorial on stopping Docker containers: How to Stop All Docker Containers | TypeOfNaN
Docker Desktop download: Docker Desktop
Git downloads: Git - Downloads
FSMD github: joshfireforever/FSMD
Grafana Documentation: Documentation - Grafana
Original source for Prometheus Data Generator: little-angry-clouds/prometheus-data-generator
Cloned and modified Prometheus Data Generator: joshfireforever/prometheus-data-generator