Description
Hi Alexander,
I've done a review and a bit of testing of the extension today, and I've run into some strange issues in high-concurrency environments. Essentially, I have two pgbench tests running at the same time:
- a regular pgbench with 72 clients, using the standard workload (so "pgbench -c 72 ...")
- a pgbench with 16 clients reading the collected wait data, essentially running this custom SQL script:
select count(*) from pg_wait_sampling_current;
select count(*) from pg_wait_sampling_history;
select count(*) from pg_wait_sampling_profile;
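For anyone who wants to try reproducing this, the setup can be sketched roughly as follows. This is only a sketch under assumptions: the database name "bench", the script file name, and the 600-second duration are placeholders rather than the exact values from my runs, and run_repro is a hypothetical helper name.

```shell
#!/bin/sh
# Placeholder names (assumptions, not the exact values from the report).
DB=bench
SCRIPT=wait_sampling_read.sql

# Custom script for the reader pgbench: the three queries listed above.
cat > "$SCRIPT" <<'EOF'
select count(*) from pg_wait_sampling_current;
select count(*) from pg_wait_sampling_history;
select count(*) from pg_wait_sampling_profile;
EOF

# Hypothetical helper: starts both workloads at the same time.
run_repro() {
    # Standard pgbench workload with 72 clients, in the background.
    pgbench -c 72 -T 600 "$DB" &
    # Reader workload with 16 clients; -n skips vacuuming the standard tables.
    pgbench -n -c 16 -T 600 -f "$SCRIPT" "$DB"
    wait
}
```

Calling run_repro requires a running cluster with the extension loaded and a database initialized with "pgbench -i".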
After a short while, I get these errors in the second pgbench:
client 13 aborted in state 1: ERROR: Error reading mq.
client 4 aborted in state 1: ERROR: Error reading mq.
What's worse, running "pg_ctl restart" on the cluster then times out. There is no CPU or I/O activity, so the cluster should restart without any issue; I suppose there is some locking problem caused by the mq read failures.
Regarding the code: I'm not sure what the purpose of setup_gucs() is. Why not simply define the GUC variables? If anything, get_guc_variables() is only meant to be used in help_config.c (per the comment in guc.c).
Also, should the bgworker main method really do proc_exit(1) instead of proc_exit(0)? At least that's what the other workers I've seen do.