Hpcsa Block Monitoring Tutorial

This tutorial provides a step-by-step guide for installing and configuring a monitoring stack on a cloud server, specifically using InfluxDB, Telegraf, and Grafana. Participants will learn to set up InfluxDB, configure it, and create dashboards to visualize data. The tutorial includes detailed instructions for installation, configuration, and optional tasks for further exploration.

GWDG Tutorial 1 / March 1, 2023

AG Computing HPC System Administration / WiSe 2022/23

Marcus Merz 75 Minutes Total

Learning Objectives

The learning objectives of this tutorial are:

• Install a monitoring stack on the cloud server
• Understand how the components of the monitoring stack work together
• Create panels with plots that display data collected by the monitoring stack

Tools

• InfluxDB, Telegraf
• Grafana
• CentOS 8 server
• an editor (vim, nano, ...)
• bash
• browser

Contents

Installing InfluxDB 1: Tutorial (20 min)

1.1 Install the packages
1.2 Configuring InfluxDB
1.3 Change port of InfluxDB

Installing Telegraf 2: Tutorial (20 min)

2.1 Install package
2.2 Configure telegraf
2.3 Smoke test for Telegraf-InfluxDB connection

Installing Grafana and setting up a simple dashboard 3: Tutorial (35 min)

3.1 Installing Grafana package
3.2 First admin login
3.3 Create datasource
3.4 Create simple dashboard

Optional: Explore data and create own queries 4: Tutorial (0 min)

Optional: Install another Telegraf Plugin 5: Tutorial (0 min)

Optional: Install telegraf on the worker nodes 6: Tutorial (0 min)

Optional: Setup Database and Grafana with specific users 7: Tutorial (0 min)

Optional: Central setup for telegraf 8: Tutorial (0 min)

Installing InfluxDB 1: Tutorial (20 min)

The goal of this task is to set up InfluxDB v2.x, which is the foundation for the rest of the tutorial. Afterwards,
InfluxDB should be running and the following information should be available for the later tasks:
• <influxip> - IP of the InfluxDB server (not the floating IP)
• <influxport> - port of InfluxDB: the port InfluxDB listens on
• <org> - organization name: the organization of the database within InfluxDB
• <bucket> - bucket name: the name of the database within InfluxDB
• <influxuser> - the main user for the bucket
• <influxpassword> - password of the user for the bucket
• <token> - the access token for the bucket
• <grafanaadminpass> - the password for admin access to the Grafana server (application)
• <frontendfloatingip> - the floating IP of the frontend server
The term bucket is explained in more detail later.
Store the values listed above somewhere, e.g. in a pad or in a local editor, for easy access.
The <frontendfloatingip> can be found in the cloud administration tool at cloud.gwdg.de.

1.1 Install the packages

The database that stores the metrics provided by the agents on the monitored systems has to be set
up first. This is done on the frontend server.
• Log in to the server via ssh.
• $ ifconfig
• note down the inet IP address of eth0 as <influxip> without the netmask: e.g. if the inet is 10.254.1.9/24,
<influxip> is 10.254.1.9
• note down “8086” as <influxport> (it will be changed later; note it down just in case)
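Stripping the netmask by hand is easy to get wrong when copying values around. The following is a minimal sketch of that step; the function name strip_netmask is hypothetical and not part of the tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tutorial): remove the netmask suffix
# from an address in CIDR notation, e.g. 10.254.1.9/24 -> 10.254.1.9.
strip_netmask() {
    local cidr="$1"
    # ${var%%/*} deletes the first "/" and everything after it.
    printf '%s\n' "${cidr%%/*}"
}

# Example with the address used in the text above.
strip_netmask "10.254.1.9/24"
```

Assuming the interface is called eth0 and carries a single IPv4 address, the helper could be fed from the ip tool, e.g. strip_netmask "$(ip -4 -o addr show dev eth0 | awk '{print $4}')".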
The standard Red Hat/CentOS package manager yum is used to install InfluxDB. The InfluxDB repository is not yet
in the repository list of yum and has to be added. Copy the following code block into the bash shell and execute
it:

cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF

HPCSA – Tutorial 1 2/18

The InfluxDB repository has now been added. InfluxDB and its command-line tools can
be installed:
• $ sudo yum install influxdb2 influxdb2-cli --nogpgcheck
• confirm the installation of the two packages
To set up InfluxDB 2, the corresponding service has to be started:
$ sudo systemctl start influxdb
This gives no feedback. To check whether the service is running, query its status:
$ sudo systemctl status influxdb
The output should be similar to this and should not contain errors or a message that the service did not start:
influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-02-15 20:59:20 CET; 1h 16min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 1777 (influxd)
    Tasks: 10 (limit: 11167)
   Memory: 22.0M
   CGroup: /system.slice/influxdb.service
           1777 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Feb 15 21:04:12 influx.novalocal influxd-systemd-start.sh[1777]: ts=2023-02-15T20:04:12.537837Z lvl=info msg="Executing query" log_id=0g1fPRtW000 service=query query="SHOW DATABASES"
Feb 15 21:04:12 influx.novalocal influxd-systemd-start.sh[1777]: [httpd] ::1 - - [15/Feb/2023:21:04:12 +0100] "GET /query?q=SHOW+DATABASES HTTP/1.1" 200 109 "-" "curl/7.61.1" ea7acf28-ad6b-11ed-8003-fa163e2e1076 332
Feb 15 21:18:13 influx.novalocal influxd-systemd-start.sh[1777]: [httpd] ::1 - - [15/Feb/2023:21:18:13 +0100] "POST /api/v2/query?org=hpcsa HTTP/1.1" 403 100 "-" "influx/2.6.1 (linux) Sha/61c5b4d Date/2022-12-29T15:41:09Z" dfa73235-ad6d-11ed-8004-fa163e2e1076 107
Feb 15 21:18:24 influx.novalocal influxd-systemd-start.sh[1777]: [httpd] ::1 - - [15/Feb/2023:21:18:24 +0100] "POST /api/v2/query?org=hpcsa HTTP/1.1" 403 100 "-" "influx/2.6.1 (linux) Sha/61c5b4d Date/2022-12-29T15:41:09Z" e6389dcb-ad6d-11ed-8005-fa163e2e1076 89
Feb 15 21:29:19 influx.novalocal influxd-systemd-start.sh[1777]: ts=2023-02-15T20:29:19.633746Z lvl=info msg="Retention policy deletion check (start)" log_id=0g1fPRtW000 service=retention trace_id=0g1h7J_G000 op_name=retention_delete_check op_event=start
Feb 15 21:29:19 influx.novalocal influxd-systemd-start.sh[1777]: ts=2023-02-15T20:29:19.633798Z lvl=info msg="Retention policy deletion check (end)" log_id=0g1fPRtW000 service=retention trace_id=0g1h7J_G000 op_name=retention_delete_check op_event=end op_elapsed=0.065ms
Feb 15 21:43:24 influx.novalocal influxd-systemd-start.sh[1777]: [httpd] ::1 - - [15/Feb/2023:21:43:24 +0100] "GET /health HTTP/1.1" 200 107 "-" "influx/2.6.1 (linux) Sha/61c5b4d Date/2022-12-29T15:41:09Z" 647410e2-ad71-11ed-8006-fa163e2e1076 110
Feb 15 21:59:19 influx.novalocal influxd-systemd-start.sh[1777]: ts=2023-02-15T20:59:19.633744Z lvl=info msg="Retention policy deletion check (start)" log_id=0g1fPRtW000 service=retention trace_id=0g1iqApG000 op_name=retention_delete_check op_event=start
Feb 15 21:59:19 influx.novalocal influxd-systemd-start.sh[1777]: ts=2023-02-15T20:59:19.633849Z lvl=info msg="Retention policy deletion check (end)" log_id=0g1fPRtW000 service=retention trace_id=0g1iqApG000 op_name=retention_delete_check op_event=end op_elapsed=0.116ms
Feb 15 22:13:52 influx.novalocal influxd-systemd-start.sh[1777]: [httpd] ::1 - - [15/Feb/2023:22:13:52 +0100] "GET /health HTTP/1.1" 200 107 "-" "influx/2.6.1 (linux) Sha/61c5b4d Date/2022-12-29T15:41:09Z" a61e6767-ad75-11ed-8007-fa163e2e1076 64

To ensure that this service is started after a boot, it needs to be enabled permanently:
$ sudo systemctl enable influxdb

Hints

• The option --nogpgcheck is used because of an issue with the GPG keys that confirm the identity of the
binary packages installed via yum. Usually the package provider updates these keys when building new
versions, but this does not seem to be the case here. Installing packages that fail the GPG check by
simply passing this option is generally not recommended. The repositories and the situation should be
checked before doing so.
• If a service does not start and the status from systemctl is not helpful, check the log files, e.g.
$ journalctl -eu influxdb

1.2 Configuring InfluxDB

Once the InfluxDB service is installed and started, it can be configured via its command-line tools.
To set up an initial database, a username, a password for this user and a bucket name (the database name)
have to be defined. Then the following command can be executed to set up InfluxDB, with the placeholders
replaced by the chosen values. The general form of the setup command is:
$ influx setup --username <influxuser> --password <influxpassword> --bucket <bucket>
Make sure the values are noted down and accessible for later use.
The setup process will ask for <org>: enter it and write it down. Enter “0” for the retention period. An
example run of the setup command:
influx setup --username hpcuser --password hpcsa_user --bucket hpcsa
? Please type your primary organization name gwdg
? Please type your retention period in hours, or 0 for infinite 0
? Setup with these parameters? --- confirm with 'y'

The output should be similar to this:

> Welcome to InfluxDB 2.0!
? Please type your primary organization name gwdg
? Please type your retention period in hours, or 0 for infinite 0
? Setup with these parameters?
  Username: hpcuser
  Organization: gwdg
  Bucket: hpcsa
  Retention Period: infinite
 Yes
User     Organization  Bucket
hpcuser  gwdg          hpcsa
This command performs two tasks:

• The initial database for the organization is created, together with the user that accesses it. The database is
called a bucket in InfluxDB v2.x because it combines the database and the retention policy.
• A profile for accessing the freshly created database is created for the user. This allows easy access without
having to provide the access token manually every time the influx command is used to access a
database. The file created is .influxdbv2/configs in the user's home directory.
Now it is possible to check the authentication information stored in the DB:
$ influx auth list
ID Description Token User Name User ID Permissions
0ac1bf2d2a40b000 hpcuser's Token slgaGixZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXw== hpcuser 0ac1bf2d0580b000 [read:/authorizations write:/authorizations read:/buckets write:/buckets read:/dashboards write:/dashboards read:/orgs write:/orgs read:/sources write:/sources read:/tasks write:/tasks read:/telegrafs write:/telegrafs read:/users write:/users read:/variables write:/variables read:/scrapers write:/scrapers read:/secrets write:/secrets read:/labels write:/labels read:/views write:/views read:/documents write:/documents read:/notificationRules write:/notificationRules read:/notificationEndpoints write:/notificationEndpoints read:/checks write:/checks read:/dbrp write:/dbrp read:/notebooks write:/notebooks read:/annotations write:/annotations read:/remotes write:/remotes read:/replications write:/replications]

The token will be important later: it allows access to the database via the port. Note the token down as
<token>.
In this example output the token is
“slgaGixZXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXw==”
As the token is different for every config and installation, the example token shown here cannot be used.
Make sure to use the token from your own output.

1.3 Change port of InfluxDB

If InfluxDB has to use another port, it can be changed by editing the config files. In this tutorial
that is required to allow later access to the InfluxDB web interface. The change is made by appending the
new address to the InfluxDB server config and by adapting the config of the influx CLI tool accordingly.
Execute the following line by line.

echo http-bind-address = \":8009\" | sudo tee -a /etc/influxdb/config.toml

sudo systemctl restart influxdb
sed -i s/8086/8009/g ~/.influxdbv2/configs

Important!
The tee command with the -a option appends a line to an existing file. If the given command is executed
multiple times, e.g. to configure another port for the http-bind-address, the config file will end up with
multiple lines defining that option. Restarting the InfluxDB service will then fail due to a malformed config,
and the log output will not explain the cause. In that case the wrong http-bind-address lines in the config file
/etc/influxdb/config.toml have to be removed.
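The duplicate-line problem described above can be avoided by removing any existing definition before appending a new one. The following is a minimal sketch under that assumption; the function name set_bind_address is hypothetical, and the sketch relies on GNU sed (as shipped with CentOS) and on the option being written at the start of a line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tutorial): make sure a config file
# contains exactly one http-bind-address line, however often it is called.
set_bind_address() {
    local file="$1" port="$2"
    # Delete every existing http-bind-address definition (GNU sed, in place).
    sed -i '/^[[:space:]]*http-bind-address[[:space:]]*=/d' "$file"
    # Append a single fresh definition, e.g. http-bind-address = ":8009".
    printf 'http-bind-address = ":%s"\n' "$port" >> "$file"
}
```

Run from a root shell, a call such as set_bind_address /etc/influxdb/config.toml 8009 can be repeated with a different port without leaving stale lines behind. The service still has to be restarted afterwards.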

To check if the InfluxDB has been restarted with the correct port:
$ sudo systemctl status influxdb | grep port
The output should show that InfluxDB uses port 8009 now:
Feb 16 21:58:42 worker.novalocal influxd-systemd-start.sh[11730]: ts=2023-02-16T20:58:42.601641Z lvl=info msg=Listening log_id=0g30CKUG000 service=tcp-listener transport=http addr=:8009 port=8009

The <influxport> value in the notes has to be changed to “8009”, too.

Installing Telegraf 2: Tutorial (20 min)

To fill the database, at least one node agent has to run and provide metrics to InfluxDB. This
agent is set up in this tutorial.

2.1 Install package

The first Telegraf agent is installed on the frontend, as this is the only machine with internet access,
which is required for an easy download and installation of the Telegraf package.
Telegraf is distributed through the same repository as InfluxDB. Since that repository has already been added
to the repository list of the frontend server, Telegraf can be installed directly:
$ sudo yum install telegraf --nogpgcheck
The service is now installed but not yet started. Telegraf will not run with the default configuration; it has
to be modified first.

2.2 Configure telegraf

To run Telegraf it has to be configured properly. At least one input and one output plugin have to be
configured. The standard input plugin is already configured, but the output is not.
The information needed in the following steps is the <influxip>, <influxport>,
<bucket>, <org> and <token> from the InfluxDB setup. The Telegraf agent uses this information
to contact InfluxDB and access the database (bucket) that has been defined.

Steps

1. open the file /etc/telegraf/telegraf.conf with an editor using sudo (e.g. sudo nano)
• if nano is not installed, install it via $ sudo yum install nano
2. search for the section [[outputs.influxdb_v2]]
3. uncomment (remove the leading #) and modify the following entries according to the data collected during
the InfluxDB configuration:
• # [[outputs.influxdb_v2]] → just uncomment
• # urls = ["http://127.0.0.1:8086"] → urls = ["http://<influxip>:<influxport>"]
• # token = "" → token = "<token>"
• # organization = "" → organization = "<org>"
• # bucket = "" → bucket = "<bucket>"
4. save the file and leave the editor (for nano: CTRL-O, Return, CTRL-X)
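To double-check the edited section, it can help to print what the result of the steps above should look like with the noted values filled in. The following is a small sketch for that purpose; the function name render_influx_output is hypothetical, and the values passed in the example are placeholders taken from the examples in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tutorial): print the
# [[outputs.influxdb_v2]] section as it should look after the edit.
render_influx_output() {
    local ip="$1" port="$2" token="$3" org="$4" bucket="$5"
    cat <<EOF
[[outputs.influxdb_v2]]
  urls = ["http://${ip}:${port}"]
  token = "${token}"
  organization = "${org}"
  bucket = "${bucket}"
EOF
}

# Placeholder values for illustration only; substitute your own notes.
render_influx_output "10.254.1.9" "8009" "<token>" "gwdg" "hpcsa"
```

The printed section can then be compared by eye with the uncommented lines in /etc/telegraf/telegraf.conf.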

To check if the configuration is correct:

$ telegraf
If there are no errors in the Telegraf output the config is ok:
2023-02-16T06:07:26Z I! Using config file: /etc/telegraf/telegraf.conf
2023-02-16T06:07:26Z I! Starting Telegraf 1.25.2
2023-02-16T06:07:26Z I! Available plugins: 228 inputs, 9 aggregators, 26 processors, 21 parsers, 57 outputs, 2 secret-stores
2023-02-16T06:07:26Z I! Loaded inputs: cpu disk diskio kernel mem processes swap system
2023-02-16T06:07:26Z I! Loaded aggregators:
2023-02-16T06:07:26Z I! Loaded processors:
2023-02-16T06:07:26Z I! Loaded secretstores:
2023-02-16T06:07:26Z I! Loaded outputs: influxdb_v2
2023-02-16T06:07:26Z I! Tags enabled: host=worker2.novalocal
2023-02-16T06:07:26Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"worker2.novalocal", Flush Interval:10s
^C2023-02-16T06:07:43Z I! [agent] Hang on, flushing any cached metrics before shutdown

This output shows which plugins are loaded for input and output.
To stop telegraf:
$ CTRL-C
Now the telegraf service can be started and enabled:
$ sudo systemctl start telegraf
$ sudo systemctl enable telegraf
Check if telegraf has been started:
$ sudo systemctl status telegraf

Hints

• the pre-configured standard plugin configuration can be found in the section [[inputs.cpu]] of the
/etc/telegraf/telegraf.conf file. More details on the settings/metrics of this plugin can be found on the
“CPU Input Plugin” webpage.

2.3 Smoke test for Telegraf-InfluxDB connection

The influx command-line tools can be used to check whether data is arriving in InfluxDB. On the frontend run:
$ influx query
This opens a query prompt in which the influx tool waits for a query. Copy this query into the shell with the
query prompt:

from(bucket: "<bucket>") |> range(start: -10m)

Press Ctrl-d, Ctrl-d to execute the query.

This should produce output with table info, system info and data. Depending on the amount of data it will
look like:
1 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
2 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time _value:int
3 ------------------------------ ------------------------------ ----------------------
,→ ---------------------- ---------------------- ---------------------- ----------------------
,→ ---------------------- ---------------------- ------------------------------
,→ --------------------------
4 2023-02-16T06:05:22.994420188Z 2023-02-16T06:15:22.994420188Z free disk vda1 ext4 worker2.novalocal rw /
,→ 2023-02-16T06:07:30.000000000Z 18264272896
5 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
6 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time _value:int
7 ------------------------------ ------------------------------ ----------------------
,→ ---------------------- ---------------------- ---------------------- ----------------------
,→ ---------------------- ---------------------- ------------------------------
,→ --------------------------
8 2023-02-16T06:05:22.994420188Z 2023-02-16T06:15:22.994420188Z inodes_free disk vda1 ext4 worker2.novalocal
,→ rw / 2023-02-16T06:07:30.000000000Z 1274475
9 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
10 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time : _value:int
11 ------------------------------ ------------------------------ ----------------------
,→ ---------------------- ---------------------- ---------------------- ----------------------
,→ ---------------------- ---------------------- ------------------------------
,→ --------------------------
12 2023-02-16T06:05:22.994420188Z 2023-02-16T06:15:22.994420188Z inodes_total disk vda1 ext4
,→ worker2.novalocal rw / 2023-02-16T06:07:30.000000000Z 1310720
13 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
14 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time _value:int
15 ------------------------------ ------------------------------ ----------------------
,→ ---------------------- ---------------------- ---------------------- ----------------------
,→ ---------------------- ---------------------- ------------------------------
,→ --------------------------
16 2023-02-16T06:05:22.994420188Z 2023-02-16T06:15:22.994420188Z inodes_used disk vda1 ext4 worker2.novalocal
,→ rw / 2023-02-16T06:07:30.000000000Z 36245
17 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
18 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time _value:int

HPCSA – Tutorial 1 7/18

19 ------------------------------ ------------------------------ ----------------------
,→ ---------------------- ---------------------- ---------------------- ----------------------
,→ ---------------------- ---------------------- ------------------------------
,→ --------------------------
20 2023-02-16T06:05:22.994420188Z 2023-02-16T06:15:22.994420188Z total disk vda1 ext4 worker2.novalocal rw /
,→ 2023-02-16T06:07:30.000000000Z 21046689792
21 Table: keys: [_start, _stop, _field, _measurement, device, fstype, host, mode, path]
22 _start:time _stop:time _field:string _measurement:string device:string fstype:string
,→ host:string mode:string path:string _time:time _value:int
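
The smoke test can also be run without the interactive prompt, which is convenient when repeating it. The following sketch only builds the Flux query string; the function name smoke_query is hypothetical, and the limit(n: 5) step is an addition that is not in the tutorial query, used here to keep the output short.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tutorial): build the smoke-test
# Flux query for a bucket and an optional look-back window (default 10m).
smoke_query() {
    local bucket="$1" window="${2:-10m}"
    # limit(n: 5) is an addition to keep each result table short.
    printf 'from(bucket: "%s") |> range(start: -%s) |> limit(n: 5)\n' "$bucket" "$window"
}

# Print the query for the example bucket from section 1.2.
smoke_query "hpcsa"
```

Assuming the installed influx CLI accepts the query as an argument, and the profile created in section 1.2 is active, the query could be executed with influx query "$(smoke_query hpcsa)".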

Installing Grafana and setting up a simple dashboard 3: Tutorial (35 min)

Grafana will be used to display data collected by the Telegraf agent. In the following, the Grafana service/server
is set up and configured. At the end a simple dashboard is created.

3.1 Installing Grafana package

Grafana is the outward-facing user interface used to display plots of time-series data.
Therefore it needs to be installed on the frontend:
$ sudo yum install grafana
This will install two packages.
Before starting Grafana the port has to be adjusted to avoid conflicts:

• open /etc/grafana/grafana.ini in an editor with sudo rights (e.g. sudo nano; install it if necessary)
• search for the option http_port
• set the value to 8000 (remove a leading ; if the line is commented out)
• save the file and exit
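The same edit can be scripted instead of done in an editor. The following is a minimal sketch; the function name set_grafana_port is hypothetical, and it assumes GNU sed and a stock grafana.ini in which the option appears once, possibly commented out with a leading semicolon.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tutorial): set http_port in a
# grafana.ini-style file, whether or not the line is commented out with ';'.
set_grafana_port() {
    local file="$1" port="$2"
    # Replace the whole http_port line with an active definition (GNU sed).
    sed -i "s/^[;[:space:]]*http_port[[:space:]]*=.*/http_port = ${port}/" "$file"
}
```

Run from a root shell, set_grafana_port /etc/grafana/grafana.ini 8000 performs the change; checking the result with grep http_port /etc/grafana/grafana.ini before starting the server is advisable.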
Now the Grafana server can be started:
$ sudo systemctl start grafana-server
The status should be checked as before with InfluxDB and Telegraf.
$ sudo systemctl status grafana-server
If the server is running it should be checked if the port is set correctly to “8000”:
$ sudo systemctl status grafana-server | grep address
The output should show the correct port to be used by grafana-server:
Feb 17 06:57:16 frontend.novalocal grafana-server[27906]: t=2023-02-17T06:57:16+0100 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:8000 protocol=http subUrl= socket=

If everything is ok the service can be enabled permanently:

$ sudo systemctl enable grafana-server

3.2 First admin login

Grafana is set up via its web interface, for which it provides its own web server. First the “admin”
account has to be set up. This happens automatically when logging in to Grafana for the first time.

Steps

1. open a browser
2. URL to use: <frontendfloatingip>:8000
3. enter “admin” as user name
4. enter “admin” as password
5. Grafana asks for a new password and its confirmation: enter a password of your choice
6. note the password down as <grafanaadminpass>, just in case

Hints

• If you lose the Grafana admin password it can be reset on the frontend:
$ sudo grafana-cli --homepath "/usr/share/grafana" admin reset-admin-password <new password>

3.3 Create datasource

To display data, Grafana needs to know where and how to retrieve it. This information is called a
“data source”. It is possible to define multiple data sources, but in this case
only one is created.
As “admin” on the Grafana server select “Configuration – Data Sources” from the toolbar on the left:

To add a datasource select the “Add data source” button:

Now the data gathered before has to be provided in the form:
Select the option “InfluxDB” in the section time series db:

Fill out the given form with the collected information, check the image and the text below for hints:

• Name: select any name
• Query Language: select “Flux”
• Password for <influxuser> is <influxpassword>
• Token: <token>

When done, press the “Save and test” button at the bottom of the form. If everything is set up correctly there
will be green feedback:

Grafana is now able to connect to the given database and retrieve metrics to display.

3.4 Create simple dashboard

Now that the data source is set up, it is possible to create panels showing plots of metrics from this source. In the
following, a simple dashboard is created using the Flux query language. Flux is a topic of its own, so the
required query is provided.
This can be done as user “admin” in Grafana, as no other user has been created yet.
Select the menu “Dashboards – Manage” from the toolbar on the left:

Create a new dashboard by pressing “New dashboard”:

The dashboard creation interface shows up. Select the “Add an empty panel” option in the “Add panel” field:

The panel editor is now presented. It is already set to the correct data source, as it is the only one created
so far.
Enter a panel title in the top right:

On the bottom is the query editor:

Copy and paste this Flux query into the editor and modify the bucket to match <bucket> from the InfluxDB
config:

from(bucket: "<bucket>")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "cpu")
|> filter(fn: (r) => r["_field"] == "usage_user")
|> filter(fn: (r) => r["cpu"] == "cpu-total" or r["cpu"] == "cpu0" or r["cpu"] == "cpu1")
|> yield(name: "mean")

This query requests data from the given bucket within the given time range (this is a dynamic time range).
InfluxDB first selects all data in the given timespan and then applies the filters to it. In this case the query
filters for the user CPU usage (usage_user) on the systems that provide the corresponding measurements.
Press apply in the top right corner:

The panel is shown, but it is empty - the timeframe has to be selected in the dropdown menu first

Now there should be data displayed in the panel

To save this dashboard for future use press the name/title of the dashboard and select “edit”. The editor
shows up again. Save it with the according button in the top right.

Optional: Explore data and create own queries 4: Tutorial (0 min)

The dashboard created is quite simple. The difficulty in creating more interesting panels is knowing the available
metrics and the syntax. InfluxDB provides a web interface for data exploration to make this easier for
users.
This web interface can be reached via web browser at <frontendfloatingip>:<influxport>.
Login is <influxuser> and <influxpassword>.
Select the Data Explorer from the toolbar on the left:

This opens the Data Explorer. At the bottom are columns that let users select the bucket, the measurements and the
corresponding metrics, which makes it easy to drill down through the data in InfluxDB.

To execute a query press the “Submit” button. This will build the graph accordingly.
To get the script for this query for use in other tools, e.g. Grafana, open the script editor
by pressing “Script Editor”.

To use the script displayed there in Grafana just mark the script and copy/paste it to the query editor in the
panel creation in Grafana.

Optional: Install another Telegraf Plugin 5: Tutorial (0 min)

This requires a working TIG Stack as it has been installed in the previous tasks.
Check the available plugins on the webpage https://docs.influxdata.com/telegraf/v1.20/plugins/,
select one of them, then install and configure it.

Optional: Install telegraf on the worker nodes 6: Tutorial (0 min)

Requires at least the PXE-booting worker nodes set up previously.

Two worker nodes have been set up during previous sessions. Install telegraf on those machines and integrate
them into a dashboard.

Optional: Setup Database and Grafana with specific users 7: Tutorial (0 min)

Requires a basic TIG stack running as set up in the first 3 tasks.

Recap the previous installation. Just one main user has been used to access the database and the Grafana
dashboard. This is usually considered bad practice with regard to security. Set up users for InfluxDB and
Grafana and modify all parts of the stack accordingly.

Hints

• Think about the following: How many users would be useful for InfluxDB and for Grafana? What
rights should they have? Why?
• Do not forget to change the relevant parts of the telegraf configuration.

Optional: Central setup for telegraf 8: Tutorial (0 min)

Requires the working TIG stack from this tutorial and telegraf already integrated on the worker nodes.
In a previous lecture slurm was set up using a central installation and configuration. Is this possible for
telegraf, too? Or does something have to be considered and adapted for telegraf? Implement your solution.

Hints

• Hint for tutor: you may have different hardware installed, which requires different plugins. Therefore at
least the configuration has to be local to the specific machine types/hardware.

HPCSA – Tutorial 1 18/18

