POC: track vacuum/analyze cumulative time per relation

Lists: pgsql-hackers
From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-02 18:24:06
Message-ID: CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

After a recent question regarding tracking vacuum start_time in
pg_stat_all_tables [1], it dawned on me that this view is missing
an important cumulative metric, which is how much time is spent
performing vacuum per table.

Currently, the only way a user can get this
information is if they enable autovacuum logging or track timing
for manual vacuums. Even then, if a user wants to trend
the time spent vacuuming over time, they must store the
timing data somewhere and perform the calculations.

Also, unless autovacuum logging is enabled for all a/v
operations, they could have gaps in their analysis.

Having the total (auto)vacuum elapsed time
along side the existing (auto)vaccum_count
allows a user to track the average time an
operating overtime and to find vacuum tuning
opportunities.

The same can also be said for (auto)analyze.

attached a patch ( without doc changes)
that adds 4 new columns:

total_autovacuum_time
total_vacuum_time
total_autoanalyze_time
total_analyze_time

Below is an example of output and how it
can be used to derive the average vacuum
operation time.

postgres=# select
relname,
autovacuum_count,
total_autovacuum_time,
total_autovacuum_time/NULLIF(autovacuum_count,0) average_autovac_time,
vacuum_count,
total_vacuum_time,
total_vacuum_time/NULLIF(vacuum_count,0) average_vac_time
from pg_catalog.pg_stat_all_tables
where relname = 'pgbench_history';
-[ RECORD 1 ]---------+-----------------
relname | pgbench_history
autovacuum_count | 3
total_autovacuum_time | 1689
average_autovac_time | 563
vacuum_count | 1
total_vacuum_time | 1
average_vac_time | 1

It should be noted that the timing is only tracked
when the vacuum or analyze operation completes,
as is the case with the other metrics.

Also, there is another discussion in-flight [2] regarding
tracking vacuum run history in a view, but this serves a
different purpose as this will provide all the metrics
that are other wise exposed in vacuum logging
via sql. This history will also be required to drop
entries using some criteria to keep the cache from
growing infinitely.

Feedback for the attached patch is appreciated!

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1] https://www.postgresql.org/message-id/flat/CAGjGUAKQ4UBNdkjunH2qLsdUVG-3F9gCuG0Kb0hToo%2BuMmSteQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/b68ab452-c41f-4d04-893f-eaab84f1855b%40vondra.me

Attachment Content-Type Size
POC-track-vacuum-analyze-cumulative-time-per-table.patch application/octet-stream 13.2 KB

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-04 03:41:24
Message-ID: CAGjGUAJEhvsjni3q5y0iGAptb6s5UpxKJiLj3A_1CUwGdrujdA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi Sami
Thank you for your path,it seems some path monitor vacuum status,Can we
synthesize their good ideas together。

On Fri, 3 Jan 2025 at 02:24, Sami Imseih <samimseih(at)gmail(dot)com> wrote:

> Hi,
>
> After a recent question regarding tracking vacuum start_time in
> pg_stat_all_tables [1], it dawned on me that this view is missing
> an important cumulative metric, which is how much time is spent
> performing vacuum per table.
>
> Currently, the only way a user can get this
> information is if they enable autovacuum logging or track timing
> for manual vacuums. Even then, if a user wants to trend
> the time spent vacuuming over time, they must store the
> timing data somewhere and perform the calculations.
>
> Also, unless autovacuum logging is enabled for all a/v
> operations, they could have gaps in their analysis.
>
> Having the total (auto)vacuum elapsed time
> along side the existing (auto)vaccum_count
> allows a user to track the average time an
> operating overtime and to find vacuum tuning
> opportunities.
>
> The same can also be said for (auto)analyze.
>
> attached a patch ( without doc changes)
> that adds 4 new columns:
>
> total_autovacuum_time
> total_vacuum_time
> total_autoanalyze_time
> total_analyze_time
>
> Below is an example of output and how it
> can be used to derive the average vacuum
> operation time.
>
> postgres=# select
> relname,
> autovacuum_count,
> total_autovacuum_time,
> total_autovacuum_time/NULLIF(autovacuum_count,0) average_autovac_time,
> vacuum_count,
> total_vacuum_time,
> total_vacuum_time/NULLIF(vacuum_count,0) average_vac_time
> from pg_catalog.pg_stat_all_tables
> where relname = 'pgbench_history';
> -[ RECORD 1 ]---------+-----------------
> relname | pgbench_history
> autovacuum_count | 3
> total_autovacuum_time | 1689
> average_autovac_time | 563
> vacuum_count | 1
> total_vacuum_time | 1
> average_vac_time | 1
>
> It should be noted that the timing is only tracked
> when the vacuum or analyze operation completes,
> as is the case with the other metrics.
>
> Also, there is another discussion in-flight [2] regarding
> tracking vacuum run history in a view, but this serves a
> different purpose as this will provide all the metrics
> that are other wise exposed in vacuum logging
> via sql. This history will also be required to drop
> entries using some criteria to keep the cache from
> growing infinitely.
>
> Feedback for the attached patch is appreciated!
>
> Regards,
>
> Sami Imseih
> Amazon Web Services (AWS)
>
> [1]
> https://www.postgresql.org/message-id/flat/CAGjGUAKQ4UBNdkjunH2qLsdUVG-3F9gCuG0Kb0hToo%2BuMmSteQ%40mail.gmail.com
> [2]
> https://www.postgresql.org/message-id/flat/b68ab452-c41f-4d04-893f-eaab84f1855b%40vondra.me
>


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 14:37:26
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 04.01.2025 06:41, wenhui qiu wrote:
> Hi Sami
>  Thank you for your path,it seems some path monitor vacuum status,Can
> we synthesize their good ideas together。
>
I've been working on a patch that collects vacuum statistics since May
[0]. It includes heap and index relation vacuum statistics, and of
course, database vacuum statistics. We welcome reviews and any feedback!

It collects a bit more information than total_time, such as delay_time,
buffer statistics, wal statistics, number of processed, skipped pages
and tuples during vacuum operations, so
this information gives a complete picture of vacuum efficiency, because
analyzing only total_time statistics does not give a complete picture of
what vacuum did: was it cleaning almost huge index, cleaning tables or
just sleeping.

[0] https://commitfest.postgresql.org/51/5012/

--
Regards,
Alena Rybakina
Postgres Professional


From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 15:09:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Thu, Jan 02, 2025 at 12:24:06PM -0600, Sami Imseih wrote:
> Having the total (auto)vacuum elapsed time
> along side the existing (auto)vaccum_count
> allows a user to track the average time an
> operating overtime and to find vacuum tuning
> opportunities.
>
> The same can also be said for (auto)analyze.

I think that makes sense to expose those metrics through SQL as you're proposing
here. The more we expose through SQL the better I think.

> attached a patch ( without doc changes)
> that adds 4 new columns:

Thanks!

A few random comments:

=== 1

+ endtime = GetCurrentTimestamp();
pgstat_report_vacuum(RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(vacrel->new_live_tuples, 0),
vacrel->recently_dead_tuples +
- vacrel->missed_dead_tuples);
+ vacrel->missed_dead_tuples,
+ starttime,
+ endtime);
pgstat_progress_end_command();

if (instrument)
{
- TimestampTz endtime = GetCurrentTimestamp();

What about keeping the endtime assignment after the pgstat_progress_end_command()
call? I think that it makes more sense that way.

=== 2

pgstat_report_vacuum(RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(vacrel->new_live_tuples, 0),
vacrel->recently_dead_tuples +
- vacrel->missed_dead_tuples);
+ vacrel->missed_dead_tuples,
+ starttime,
+ endtime);

What about doing the elapsedtime computation prior the this call and passed
it as a parameter (then remove the starttime one and keep the endtime as it
is needed)?

=== 3

pgstat_report_analyze(onerel, totalrows, totaldeadrows,
- (va_cols == NIL));
+ (va_cols == NIL), starttime, endtime);

Same as 2 for pgstat_report_analyze() except that the endtime could be removed
too.

=== 4

+/* pg_stat_get_vacuum_time */
+PG_STAT_GET_RELENTRY_INT64(total_vacuum_time)
+
+/* pg_stat_get_autovacuum_time */
+PG_STAT_GET_RELENTRY_INT64(total_autovacuum_time)
+
+/* pg_stat_get_analyze_time */
+PG_STAT_GET_RELENTRY_INT64(total_analyze_time)
+
+/* pg_stat_get_autoanalyze_time */
+PG_STAT_GET_RELENTRY_INT64(total_autoanalyze_time)

I wonder if it wouldn't be better to use FLOAT8 here (to match things like
pg_stat_get_checkpointer_write_time(), pg_stat_get_checkpointer_sync_time() among
others).

=== 5

+
+ PgStat_Counter total_vacuum_time; /* user initiated vacuum */
+ PgStat_Counter total_autovacuum_time; /* autovacuum initiated */
+ PgStat_Counter total_analyze_time; /* user initiated vacuum */
+ PgStat_Counter total_autoanalyze_time; /* autovacuum initiated */

Those comments look weird to me.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 16:07:43
Message-ID: CAA5RZ0tBsAa-CY=-z_VPjFvMRWnKLzkt0QAUQxUx+CO1-YiS8g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> this information gives a complete picture of vacuum efficiency, because
> analyzing only total_time statistics does not give a complete picture of
> what vacuum did: was it cleaning almost huge index, cleaning tables or
> just sleeping.

The purpose of total_time is to be able to calculate the average
time of a vacuum per relation from pg_stat_all_tables (by
total_time/vacuum_count) This is useful on its own to be able to
trend slow downs of vacuum ( or improvements after vacuum tuning )
for specific relations.

The work going into [1] is more detailed and will be useful
for deep dive investigations.In short, I think both proposals
have merit.

[1] https://commitfest.postgresql.org/51/5012/

Regards,

Sami


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 16:21:56
Message-ID: CAA5RZ0uVgHa2JLSEsq8jDMRdUmyfhz9gg0UHhx0g2zoB7h_faA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> > this information gives a complete picture of vacuum efficiency, because
> > analyzing only total_time statistics does not give a complete picture of
> > what vacuum did: was it cleaning almost huge index, cleaning tables or
> > just sleeping.
>
> The purpose of total_time is to be able to calculate the average
> time of a vacuum per relation from pg_stat_all_tables (by
> total_time/vacuum_count) This is useful on its own to be able to
> trend slow downs of vacuum ( or improvements after vacuum tuning )
> for specific relations.
>
> The work going into [1] is more detailed and will be useful
> for deep dive investigations.In short, I think both proposals
> have merit.

Missed this point in my last reply. This patch also tracks
(auto)analyze per relation cumulative timing which can
be useful along with (auto)analyze_count to derive
the average.

Regards,

Sami


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-10 22:26:05
Message-ID: CAA5RZ0tP6my7Mbr_syP7hRAiFANC_bG8SyoQF=-vfV9VawpV=g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Thanks for the review!

> === 1
>
> + endtime = GetCurrentTimestamp();
> pgstat_report_vacuum(RelationGetRelid(rel),
> rel->rd_rel->relisshared,
> Max(vacrel->new_live_tuples, 0),
> vacrel->recently_dead_tuples +
> - vacrel->missed_dead_tuples);
> + vacrel->missed_dead_tuples,
> + starttime,
> + endtime);
> pgstat_progress_end_command();
>
> if (instrument)
> {
> - TimestampTz endtime = GetCurrentTimestamp();
>
> What about keeping the endtime assignment after the pgstat_progress_end_command()
> call? I think that it makes more sense that way.

Yes, that can be done. I see a few more examples in
which we report stats after end_command, i.e.
AbortTransaction

> What about doing the elapsedtime computation prior the this call and passed
> it as a parameter (then remove the starttime one and keep the endtime as it
> is needed)?

Yes, makes sense. It will keep the changes to pgstat_report_ to a minimum.

> === 4
>
> +/* pg_stat_get_vacuum_time */
> +PG_STAT_GET_RELENTRY_INT64(total_vacuum_time)
> +
> +/* pg_stat_get_autovacuum_time */
> +PG_STAT_GET_RELENTRY_INT64(total_autovacuum_time)
> +
> +/* pg_stat_get_analyze_time */
> +PG_STAT_GET_RELENTRY_INT64(total_analyze_time)
> +
> +/* pg_stat_get_autoanalyze_time */
> +PG_STAT_GET_RELENTRY_INT64(total_autoanalyze_time)
>
> I wonder if it wouldn't be better to use FLOAT8 here (to match things like
> pg_stat_get_checkpointer_write_time(), pg_stat_get_checkpointer_sync_time() among
> others).
>
> === 5

I see the checkpointer write_time for example is converted
to double "for presentation". I am not sure why, expect for maybe
historical reasons. I will do the same for this new cumulative
time field for the sake of consistency. There is no macro for
relentry to return a stat as FLOAT8, so adding it as part of this
patch.

Datum
pg_stat_get_checkpointer_write_time(PG_FUNCTION_ARGS)
{
/* time is already in msec, just convert to double for presentation */
PG_RETURN_FLOAT8((double)
pgstat_fetch_stat_checkpointer()->write_time);
}

> +
> + PgStat_Counter total_vacuum_time; /* user initiated vacuum */
> + PgStat_Counter total_autovacuum_time; /* autovacuum initiated */
> + PgStat_Counter total_analyze_time; /* user initiated vacuum */
> + PgStat_Counter total_autoanalyze_time; /* autovacuum initiated */
>
> Those comments look weird to me.
>

Ok, Will remove these.

I also updated the comments for the instrument code path to reflect the
fact starttime is now set for all cases.Also, added documentation.

See the attached v2.

Regards,

Sami

Attachment Content-Type Size
v2-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.4 KB

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-13 15:55:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Fri, Jan 10, 2025 at 04:26:05PM -0600, Sami Imseih wrote:
> {
> /* time is already in msec, just convert to double for presentation */
> PG_RETURN_FLOAT8((double)
> pgstat_fetch_stat_checkpointer()->write_time);
> }
>
> > +
> > + PgStat_Counter total_vacuum_time; /* user initiated vacuum */
> > + PgStat_Counter total_autovacuum_time; /* autovacuum initiated */
> > + PgStat_Counter total_analyze_time; /* user initiated vacuum */
> > + PgStat_Counter total_autoanalyze_time; /* autovacuum initiated */
> >
> > Those comments look weird to me.
> >
>
> Ok, Will remove these.
>
> I also updated the comments for the instrument code path to reflect the
> fact starttime is now set for all cases.Also, added documentation.
>
> See the attached v2.

Thanks for the patch update!

A few random comments:

=== 1

+/* pg_stat_get_vacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_vacuum_time)
+
+/* pg_stat_get_autovacuum_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autovacuum_time)
+
+/* pg_stat_get_analyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_analyze_time)
+
+/* pg_stat_get_autoanalyze_time */
+PG_STAT_GET_RELENTRY_FLOAT8(total_autoanalyze_time)
+

The comments do not reflect the function names ("total" is missing to give
pg_stat_get_total_vacuum_time() and such).

=== 2

+#define PG_STAT_GET_RELENTRY_FLOAT8(stat)
+Datum
+CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)
+{
+ Oid relid = PG_GETARG_OID(0);
+ int64 result;
+ PgStat_StatTabEntry *tabentry;
+
+ if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+ result = 0;
+ else
+ result = (float8) (tabentry->stat);
+
+ PG_RETURN_FLOAT8(result);
+}
+
/* pg_stat_get_analyze_count */
PG_STAT_GET_RELENTRY_INT64(analyze_count

I think it's better to define the macro just before its first usage (meaning
just after pg_stat_get_vacuum_count()): that would be consistent with the places
the other macros are defined.

=== 3

+ int64 result;

s/int64/double/?

=== 4

+ Total time this table has spent in manual vacuum
+ </para></entry>

Mention the unit?

=== 5

+ /*
+ * When verbose or autovacuum logging is used, initialize a resource usage
+ * snapshot and optionally track I/O timing.
+ */
if (instrument)
{

Out of curiosity, why this extra comment? To be somehow consistent with
do_analyze_rel()?

=== 6

@@ -343,6 +349,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
RelationGetRelid(rel));

+ starttime = GetCurrentTimestamp();

I wonder if it wouldn't make more sense to put the GetCurrentTimestamp() call
before the pgstat_progress_start_command() call. That would be aligned with the
"endtime" being after the pgstat_progress_end_command() and where it was before
the patch.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-14 23:01:52
Message-ID: CAA5RZ0v9060p7hjtLPY9i0Me6SoaRKYwxATqt_DnZr8dGSnUhQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

thanks for the review!

> The comments do not reflect the function names ("total" is missing to give
> pg_stat_get_total_vacuum_time() and such).

fixed

> I think it's better to define the macro just before its first usage (meaning
> just after pg_stat_get_vacuum_count()): that would be consistent with the places
> the other macros are defined.

correct. I fixed this to match the placement of similar macros.

> s/int64/double/?

fixed

> Mention the unit?

fixed

> Out of curiosity, why this extra comment? To be somehow consistent with
> do_analyze_rel()?

Yes, I added it for consistency, but since the comment was not there
before, I just decided to remove it.

> I wonder if it wouldn't make more sense to put the GetCurrentTimestamp() call
> before the pgstat_progress_start_command() call. That would be aligned with the
> "endtime" being after the pgstat_progress_end_command() and where it was before
> the patch.

I agree. Fixed.

Please see the attached v3.

Regards,

Sami

Attachment Content-Type Size
v3-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.0 KB

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-15 13:11:26
Message-ID: Z4ez/[email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Tue, Jan 14, 2025 at 05:01:52PM -0600, Sami Imseih wrote:
> Please see the attached v3.

Thanks!

A few comments:

=== 1

+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>total_vacuum_time</structfield> <type>bigint</type>
+ </para>

Those new fields should be documented as "double precision".

=== 2

+#define PG_STAT_GET_RELENTRY_FLOAT8(stat) \
+Datum \
+CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS) \
+{ \
+ Oid relid = PG_GETARG_OID(0); \
+ float8 result; \
+ PgStat_StatTabEntry *tabentry; \
+ \
+ if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL) \
+ result = 0; \
+ else \
+ result = (float8) (tabentry->stat); \
+ \
+ PG_RETURN_FLOAT8(result); \
+}

I did propose "double" up-thread to be consistent with the code around. That
would mean to also cast to "double". That's just for consistency purpose. What do
you think?

Appart from the above that LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-15 17:09:00
Message-ID: CAA5RZ0s8=wd-X4GgMcFQO1WbwPtw4HjTtMcDahsnegZ8p6tYPQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> Appart from the above that LGTM.

thanks! I took care of your comments. I was not sure
about the need to cast "double" but as you mention this
is consistent with other parts of pgstatfuncs.c

v4 attached.

Regards,

Sami

Attachment Content-Type Size
v4-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.2 KB

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-17 13:42:52
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Wed, Jan 15, 2025 at 11:09:00AM -0600, Sami Imseih wrote:
> > Appart from the above that LGTM.
>
> thanks! I took care of your comments.

Thanks!

> I was not sure
> about the need to cast "double" but as you mention this
> is consistent with other parts of pgstatfuncs.c

Yup.

> v4 attached.

One comment:

+ PG_RETURN_FLOAT8(result); \

The "\" indentation looks wrong for this line (was ok in v3).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-17 15:42:13
Message-ID: CAA5RZ0vKWKULGZ_V8DOOtFHXtsOLMVLCpkJ+guniB66T-o_eZA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> One comment:
>
> + PG_RETURN_FLOAT8(result); \
>
> The "\" indentation looks wrong for this line (was ok in v3).
>
> Regards,
>

Indeed. Corrected in v5

Thanks!

Sami

Attachment Content-Type Size
v5-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.2 KB

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-20 03:25:17
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 17, 2025 at 09:42:13AM -0600, Sami Imseih wrote:
> Indeed. Corrected in v5

I can get behind the idea of the patch.

+ PgStat_Counter total_vacuum_time;
+ PgStat_Counter total_autovacuum_time;
+ PgStat_Counter total_analyze_time;
+ PgStat_Counter total_autoanalyze_time;
} PgStat_StatTabEntry;

These are time values in milliseconds. It would be good to document
that as a comment with these fields, like PgStat_CheckpointerStats and
others. Perhaps they should be grouped with their related counters
within PgStat_StatTabEntry? Or not.

The patch makes sense here in terms of passing down to
pgstat_report_vacuum() the elapsed time and the end time to avoid an
extra GetCurrentTimestamp() when calculating if the logging should
happen. However in analyze.c..

+++ b/src/backend/commands/analyze.c
[...]
+ TimestampTz endtime = 0;
[...]
- TimestampTz endtime = GetCurrentTimestamp();
-
if (verbose || params->log_min_duration == 0 ||
TimestampDifferenceExceeds(starttime, endtime,

This is incorrect, endtime is never set. I would give priority to the
symmetry of the code here, using for the analyze reporting the same
arguments as the vacuum bits, so as we don't call GetCurrentTimestamp
in the proposed changes for pgstat_report_analyze().

+ if (AmAutoVacuumWorkerProcess())
+ tabentry->total_autovacuum_time += elapsedtime;
+ else
+ tabentry->total_vacuum_time += elapsedtime;
[...]
+ if (AmAutoVacuumWorkerProcess())
+ tabentry->total_autoanalyze_time += elapsedtime;
+ else
+ tabentry->total_analyze_time += elapsedtime;

These could be grouped with their previous blocks.
--
Michael


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-20 17:04:40
Message-ID: CAA5RZ0su2Unc+iJkhL6QEfsbjj6-WXjRWyf2E0b4BotUH=o4FA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I can get behind the idea of the patch.

Thanks for the review!

> + PgStat_Counter total_vacuum_time;
> + PgStat_Counter total_autovacuum_time;
> + PgStat_Counter total_analyze_time;
> + PgStat_Counter total_autoanalyze_time;
> } PgStat_StatTabEntry;
> These are time values in milliseconds. It would be good to document
> that as a comment with these fields, like PgStat_CheckpointerStats and
> others. Perhaps they should be grouped with their related counters
> within PgStat_StatTabEntry? Or not.

done, including grouping all the vacuum counters together.

> This is incorrect, endtime is never set.

indeed. fixed.

> I would give priority to the
> symmetry of the code here, using for the analyze reporting the same
> arguments as the vacuum bits, so as we don't call GetCurrentTimestamp
> in the proposed changes for pgstat_report_analyze().

I agree.
pgstat_report_analyze and pgstat_report_vacuum match
will now take endtime and elapsedtime and we can remove
the GetCurrentTimestamp inside pgstat_report_analyze.

> + if (AmAutoVacuumWorkerProcess())
> + tabentry->total_autovacuum_time += elapsedtime;
> + else
> + tabentry->total_vacuum_time += elapsedtime;
> [...]
> + if (AmAutoVacuumWorkerProcess())
> + tabentry->total_autoanalyze_time += elapsedtime;
> + else
> + tabentry->total_analyze_time += elapsedtime;
>

Correct. fixed as well.

See the attached v6.

Thanks!

Regards,

Sami

Attachment Content-Type Size
v6-0001-Track-per-relation-cumulative-time-spent-in-vacuu.patch application/octet-stream 16.6 KB

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-24 07:32:45
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 20, 2025 at 11:04:40AM -0600, Sami Imseih wrote:
>> + PgStat_Counter total_vacuum_time;
>> + PgStat_Counter total_autovacuum_time;
>> + PgStat_Counter total_analyze_time;
>> + PgStat_Counter total_autoanalyze_time;
>> } PgStat_StatTabEntry;
>> These are time values in milliseconds. It would be good to document
>> that as a comment with these fields, like PgStat_CheckpointerStats and
>> others. Perhaps they should be grouped with their related counters
>> within PgStat_StatTabEntry? Or not.
>
> done, including grouping all the vacuum counters together.

I was referring to the order of the fields in the structure itself,
but that's no big deal one way or the other.

> See the attached v6.

+ PgStat_Counter total_vacuum_time; /* times in milliseconds */
+ PgStat_Counter total_autovacuum_time; /* times in milliseconds */
+ PgStat_Counter total_analyze_time; /* times in milliseconds */
+ PgStat_Counter total_autoanalyze_time; /* times in milliseconds */

This should be one comment for the whole block, or this should use the
singular for each comment.

instrument = (verbose || (AmAutoVacuumWorkerProcess() &&
params->log_min_duration >= 0));
+
if (instrument)

Some noise.

if (instrument)
{
- TimestampTz endtime = GetCurrentTimestamp();

On HEAD in the ANALYZE path, the endtime is calculated after
index_vacuum_cleanup(). Your patch calculates it before
index_vacuum_cleanup(), which would result in an incorrect calculation
if the index cleanup takes a long time with less logs generated, no?
Sorry for not noticing that earlier on the thread.. Perhaps it would
be just better to pass the start time to pgstat_report_vacuum() and
pgstat_report_analyze() then let their internals do the elapsed time
calculations. Consistency of the arguments for both functions is
something worth having, IMO, even if it means a bit more
GetCurrentTimestamp() in this case.
--
Michael


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-24 20:19:59
Message-ID: CAA5RZ0s9FSC8cU65D_XeV19-Zd3kFgJrBokGjq-=_Ccpjk6ZLw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I was referring to the order of the fields in the structure itself,
> but that's no big deal one way or the other.

I understand your point now. I will group them with the
related counters in the next rev and will use

> This should be one comment for the whole block, or this should use the
> singular for each comment.

I will use a singular "/* time in milliseconds */" comment for each
new field.

This existing write_time field is plural, but I will leave
that one alone for now.

PgStat_Counter write_time; /* times in milliseconds */

> On HEAD in the ANALYZE path, the endtime is calculated after
> index_vacuum_cleanup(). Your patch calculates it before
> index_vacuum_cleanup(), which would result in an incorrect calculation
> if the index cleanup takes a long time with less logs generated, no?
> Sorry for not noticing that earlier on the thread.. Perhaps it would
> be just better to pass the start time to pgstat_report_vacuum() and
> pgstat_report_analyze() then let their internals do the elapsed time
> calculations. Consistency of the arguments for both functions is
> something worth having, IMO, even if it means a bit more
> GetCurrentTimestamp() in this case.

So currently, the report of the last_(autoanalyze|analyze)_time
is set before the index_vacuum_cleanup, but for logging purposes
the elapsed time is calculated afterwards. Most users will not notice
this, but I think that is wrong as well.

I think we should calculate endtime and elapsedtime and call
pgstat_report_analyze after the index_vacuum_cleanup; and
before vac_close_indexes. This is more accurate and will
avoid incurring the extra GetCurrentTimestamp() call.

What do you think?

Regards,

Sami


From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-27 00:30:16
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 24, 2025 at 02:19:59PM -0600, Sami Imseih wrote:
> So currently, the report of the last_(autoanalyze|analyze)_time
> is set before the index_vacuum_cleanup, but for logging purposes
> the elapsed time is calculated afterwards. Most users will not notice
> this, but I think that is wrong as well.
>
> I think we should calculate endtime and elapsedtime and call
> pgstat_report_analyze after the index_vacuum_cleanup; and
> before vac_close_indexes. This is more accurate and will
> avoid incurring the extra GetCurrentTimestamp() call.
>
> What do you think?

I think that this is hiding a behavior change while adding new
counters, and both changes are independent. Another thing that is
slightly incorrect to do if we take the argument of only adding the
counters is moving around the call of pgstat_progress_end_command(),
because it's not really wrong as-is, either. I'd suggest to make all
that a separate discussion.

The addition of the extra GetCurrentTimestamp() in the report path
does not stress me much, FWIW, and keeping the tracking of the end
time within the report routines for [auto]analyze and [auto]vacuum has
the merit that the caller does not need to worry about more
TimestampDifferenceMilliseconds() calculations.

I have put my hands on this patch, and at the end I think that we
should just do the attached, which is simpler and addresses your
use-case. Note also that the end time is acquired while the entries
are not locked in the report routines, and some tweaks in the docs and
comments.
--
Michael

Attachment Content-Type Size
v7-0001-Track-per-relation-cumulative-time-spent-in-auto-.patch text/x-diff 15.2 KB

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-27 13:31:06
Message-ID: Z5eKmuxH/[email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Mon, Jan 27, 2025 at 09:30:16AM +0900, Michael Paquier wrote:
> The addition of the extra GetCurrentTimestamp() in the report path
> does not stress me much,

Reading at the previous messages I see how you reached this state. I also think
that makes sense and that's not an issue as we are not in a hot path here.

> should just do the attached, which is simpler and addresses your
> use-case. Note also that the end time is acquired while the entries
> are not locked in the report routines, and some tweaks in the docs and
> comments.

LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-27 17:22:16
Message-ID: CAA5RZ0vpAA+jXiCjjO5DGbLTUVmqe99ioCx0dkg1FKNc1h2biA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I think that this is hiding a behavior change while adding new
> counters, and both changes are independent.

That's a fair point.

> Another thing that is
> slightly incorrect to do if we take the argument of only adding the
> counters is moving around the call of pgstat_progress_end_command(),
> because it's not really wrong as-is, either. I'd suggest to make all
> that a separate discussion.

Yeah, we have inconsistency between when vacuum and analyze
command ends as well as inconsistency between the time
reported to pg_stats vs the logs. I will start a follow-up thread for
this.

> I have put my hands on this patch, and at the end I think that we
> should just do the attached, which is simpler and addresses your
> use-case. Note also that the end time is acquired while the entries
> are not locked in the report routines, and some tweaks in the docs and
> comments.

Thanks for the update. This LGTM.

Regards,

Sami


From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-01-28 00:59:50
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 27, 2025 at 11:22:16AM -0600, Sami Imseih wrote:
>> I have put my hands on this patch, and at the end I think that we
>> should just do the attached, which is simpler and addresses your
>> use-case. Note also that the end time is acquired while the entries
>> are not locked in the report routines, and some tweaks in the docs and
>> comments.
>
> Thanks for the update. This LGTM.

Okidoki. Done, then, with bumps for the catalog version and the stats
file version.
--
Michael


From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Sami Imseih <samimseih(at)gmail(dot)com>
Subject: Re: POC: track vacuum/analyze cumulative time per relation
Date: 2025-02-04 13:06:15
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi!

On 28.01.2025 03:59, Michael Paquier wrote:
> On Mon, Jan 27, 2025 at 11:22:16AM -0600, Sami Imseih wrote:
>>> I have put my hands on this patch, and at the end I think that we
>>> should just do the attached, which is simpler and addresses your
>>> use-case. Note also that the end time is acquired while the entries
>>> are not locked in the report routines, and some tweaks in the docs and
>>> comments.
>> Thanks for the update. This LGTM.
> Okidoki. Done, then, with bumps for the catalog version and the stats
> file version.

You might be interested in looking at my patch for collecting vacuum
statistics? Same issue, but it collects more information about the
vacuum procedure for relations and databases.

I mentioned it here [0] in the current thread.

[0]
https://www.postgresql.org/message-id/Z4U3hkWuEgByhGgJ%40ip-10-97-1-34.eu-west-3.compute.internal

--
Regards,
Alena Rybakina
Postgres Professional