Age | Commit message | Author |
|
|
This is the same approach as for the PSQL variable (the hard-coded
$NO_PSQL_OPTION variable *MUST* be set to 0 to use them).
PGBINDIR can be set in the environment, hard-coded in check_postgres.pl, or set
via a command-line argument.
PGCONTROLDATA and PSQL can still be used but should be deprecated.
If NO_PSQL_OPTION=1, the logic is:
* PGBINDIR can be set via an environment variable, but not via the config
file or a command-line argument.
* PSQL cannot be set explicitly; it is derived from PGBINDIR.
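The rules above can be sketched as a small lookup function. This is an illustrative Python sketch, not the Perl code in check_postgres.pl: the function name `resolve_psql`, its parameters, and the precedence order (command line, then environment, then hard-coded value) are assumptions made for the example.

```python
import posixpath


def resolve_psql(no_psql_option, env, cli_pgbindir=None, cli_psql=None,
                 hardcoded_pgbindir=""):
    """Return the psql path to use (hypothetical helper, assumed precedence)."""
    if no_psql_option:
        # Locked-down mode: nothing may come from the command line.
        if cli_pgbindir or cli_psql:
            raise ValueError("PGBINDIR and PSQL cannot be set via options")
        bindir = env.get("PGBINDIR") or hardcoded_pgbindir
    else:
        # An explicit psql path wins when options are allowed.
        if cli_psql:
            return cli_psql
        bindir = cli_pgbindir or env.get("PGBINDIR") or hardcoded_pgbindir
    # psql is derived from the bin directory; fall back to a PATH lookup.
    return posixpath.join(bindir, "psql") if bindir else "psql"
```

With `no_psql_option` true, only the environment (or the hard-coded value) decides where psql is found, which matches the intent of keeping the binary location out of reach of command-line callers.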
|
|
Add check for pgagent jobs (David E. Wheeler)
|
|
From: "David E. Wheeler" <[email protected]>
This patch adds support for checking for failed pgAgent jobs within a specified
period of time. You can specify either --critical or --warning as a period of
time, and it will report on failures within that period of time previous to the
current time. Job failures are determined by a non-0 status in a job step
record.
Using this test obviously requires that the pgAgent schema be installed. I've
also included a bunch of unit tests to make sure it works the way I would expect
(the test will create a schema for testing) and documentation.
As part of this, I've introduced the `any_warning` argument to
`validate_range()`. The `pgagent_jobs` test does not care if you specify a
warning value greater than the critical value (indeed, I expect that if one used
both at all, the warning would be much longer). So this new argument prevents
the `range-warnbigtime` or `range-warnbigsize` failures from being triggered.
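The effect of `any_warning` can be sketched as follows. This is a simplified Python stand-in, not the real `validate_range()` from check_postgres.pl: the signature, the use of plain seconds, and the exception type are assumptions; only the error name `range-warnbigtime` comes from the message above.

```python
def validate_range(warning, critical, any_warning=False):
    """Check a warning/critical pair given in seconds (simplified stand-in)."""
    both_given = warning is not None and critical is not None
    # Normally a warning threshold larger than the critical one is rejected.
    # With any_warning set, that ordering check is skipped entirely.
    if both_given and not any_warning and warning > critical:
        raise ValueError("range-warnbigtime")
    return warning, critical
```

For a check like `pgagent_jobs`, a call such as `validate_range(86400, 3600, any_warning=True)` would then be accepted, while the same values without the flag would be rejected.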
Cedric: I sorted the POD and added the action_info so that t/05_docs.t is ok.
I also built and pushed the new .html
|
|
notation.
|
|
See http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html
|
|
No more late night coding for me. #easilybrokenpromises
|
|
Thanks to Emmanuel Lesouef for the bug report and help in tracking this down.
There are probably other incorrect inner joins to pg_user in the code.
|
|
This is based on --assume-standby-mode. Shorten the option name per suggestion
from Greg (but I kept the original one for standby mode).
The option is only used in check_postgres_checkpoint; it confirms that the
server is in the expected mode and emits a critical if it is not.
Note: this can be used in other places, and maybe improved (to reduce the
number of open_controldata calls).
TODO/FIXME:
* I found that GetOpt treats --assume-p or --assume-s as the longer
version of the option. A bug?
* The original code to call pg_controldata does not work in French (because of
regex/locale). Why not use LANG=C in those checks, where there is no point in
using the locale and the regexes are error-prone?
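The locale idea in the last TODO item could look roughly like this. It is a Python sketch of the general technique (forcing the C locale for a child process so its messages stay untranslated), not the code in check_postgres.pl; the helper names `c_locale_env` and `run_pg_controldata` are made up for the example.

```python
import os
import subprocess


def c_locale_env(base_env=None):
    """Return a copy of the environment that forces the C locale."""
    env = dict(os.environ if base_env is None else base_env)
    # LC_ALL takes precedence over LANG and the individual LC_* categories.
    env["LC_ALL"] = "C"
    env["LANG"] = "C"
    return env


def run_pg_controldata(data_dir, binary="pg_controldata"):
    """Run pg_controldata on data_dir and return its output as text."""
    result = subprocess.run([binary, data_dir], env=c_locale_env(),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Because the caller's own environment is copied rather than modified, only the child process sees the C locale, and any regex applied to its output can rely on the English labels.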
|
|
Use open_controldata where pg_controldata was used previously.
Also split the code for make_sure_standby_mode to reduce the code needed for
the future option make_sure_prod.
|
|
This check confirms that the Database System Identifier found
by pg_controldata is the one expected.
Warning and critical are allowed (like check_postgres_checksum), and it must be
run on the PostgreSQL server (like check_postgres_checkpoint).
While here, I created a new function, open_controldata, which can be used in
other places where pg_controldata is used.
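The comparison described above amounts to pulling one line out of the pg_controldata output and matching it against an expected value. The following Python sketch shows that idea only; the function names are hypothetical, the sample values in the usage note are invented, and the regex assumes untranslated (C locale) output with the label "Database system identifier:".

```python
import re


def system_identifier(controldata_text):
    """Extract the system identifier from pg_controldata output, or None."""
    match = re.search(r"^Database system identifier:\s+(\d+)\s*$",
                      controldata_text, re.MULTILINE)
    return match.group(1) if match else None


def identifier_matches(controldata_text, expected):
    """True when the identifier in the output equals the expected string."""
    return system_identifier(controldata_text) == str(expected)
```

The identifier is kept as a string rather than converted to a number, so it can be compared directly with whatever value was passed as the warning or critical threshold.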
|
|
dbservice)
|
|
Thanks to Cindy Wise for the bug report.
|
|
When using the --dbservice option, $db->{dbname} is not set, so the $whodunit
variable initialization should first check whether $db->{dbname} is set.
Actually, we only initialize it for the MRTG output, as it is not important in
the Nagios output.
|
|
The query_time action works before 8.1 if the query doesn't include the
xact_start column. But we need this column for the txn_time action. So, I
changed the query so that query_time works with 8.1 and upwards, and
txn_time works with 8.3 and upwards.
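The resulting version gating can be pictured as a small table of minimum server versions per action. This Python sketch only restates the two thresholds given above; the names `MIN_SERVER_VERSION` and `action_supported` are invented for the illustration and do not come from check_postgres.pl.

```python
# Minimum (major, minor) server version per action, as stated above.
MIN_SERVER_VERSION = {
    "query_time": (8, 1),
    "txn_time": (8, 3),  # needs the xact_start column
}


def action_supported(action, server_version):
    """True when server_version (a (major, minor) tuple) is new enough."""
    # Tuples compare element by element, so (8, 3) >= (8, 1) is True.
    return server_version >= MIN_SERVER_VERSION[action]
```

On an 8.2 server, for example, this would allow query_time but not txn_time.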
|
|
We still stop at the first error.
Per request from Aziz Boultabi.
|
|
* backends test issue
Critical and warning values were wrong for the negative number check.
And the output message for the --include check was wrong too.
* check_replicate_row issue
The UPDATE must be executed on the first server only.
* fsm_pages and fsm_relations test issue
The version test (i.e. max_fsm_* is not available on 8.4 and later releases)
must be done earlier.
* doc test issue
check_standby_mode was renamed to make_sure_standby_mode because all check_*
functions are expected to have documentation, but check_standby_mode is an
internal function.
* another doc test issue
pgbouncer_checksum documentation wasn't at the right location.
* drop_schema_if_exists issue
There was an unexpected return in the middle of the function, and so the
schema was never dropped.
|
|
discussion on the mailing list.
|