This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit 0628670

Committed Apr 3, 2024
Invent SERIALIZE option for EXPLAIN.

EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about
the volume of data emitted by a query, as well as the time taken
to convert the data to the on-the-wire format. Previously there
was no way to investigate this without actually sending the data
to the client, in which case network transmission costs might
swamp what you wanted to see. In particular this feature allows
investigating the costs of de-TOASTing compressed or out-of-line
data during formatting.

Stepan Rutz and Matthias van de Meent,
reviewed by Tomas Vondra and myself

Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de

1 parent 97ce821 commit 0628670

10 files changed: +542 -11 lines changed


‎doc/src/sgml/perform.sgml

Lines changed: 18 additions & 6 deletions

@@ -144,11 +144,11 @@ EXPLAIN SELECT * FROM tenk1;
     It's important to understand that the cost of an upper-level node includes
     the cost of all its child nodes. It's also important to realize that
     the cost only reflects things that the planner cares about.
-    In particular, the cost does not consider the time spent transmitting
-    result rows to the client, which could be an important
-    factor in the real elapsed time; but the planner ignores it because
-    it cannot change it by altering the plan. (Every correct plan will
-    output the same row set, we trust.)
+    In particular, the cost does not consider the time spent to convert
+    output values to text form or to transmit them to the client, which
+    could be important factors in the real elapsed time; but the planner
+    ignores those costs because it cannot change them by altering the
+    plan. (Every correct plan will output the same row set, we trust.)
    </para>

    <para>
@@ -956,6 +956,17 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
     <command>EXPLAIN ANALYZE</command>.
    </para>

+   <para>
+    The time shown for the top-level node does not include any time needed
+    to convert the query's output data into displayable form or to send it
+    to the client. While <command>EXPLAIN ANALYZE</command> will never
+    send the data to the client, it can be told to convert the query's
+    output data to displayable form and measure the time needed for that,
+    by specifying the <literal>SERIALIZE</literal> option. That time will
+    be shown separately, and it's also included in the
+    total <literal>Execution time</literal>.
+   </para>
+
   </sect2>

   <sect2 id="using-explain-caveats">
@@ -965,7 +976,8 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
     There are two significant ways in which run times measured by
     <command>EXPLAIN ANALYZE</command> can deviate from normal execution of
     the same query. First, since no output rows are delivered to the client,
-    network transmission costs and I/O conversion costs are not included.
+    network transmission costs are not included. I/O conversion costs are
+    not included either unless <literal>SERIALIZE</literal> is specified.
     Second, the measurement overhead added by <command>EXPLAIN
     ANALYZE</command> can be significant, especially on machines with slow
     <function>gettimeofday()</function> operating-system calls. You can use the

‎doc/src/sgml/ref/explain.sgml

Lines changed: 29 additions & 0 deletions

@@ -41,6 +41,7 @@ EXPLAIN [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] <rep
     SETTINGS [ <replaceable class="parameter">boolean</replaceable> ]
     GENERIC_PLAN [ <replaceable class="parameter">boolean</replaceable> ]
     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+    SERIALIZE [ { NONE | TEXT | BINARY } ]
     WAL [ <replaceable class="parameter">boolean</replaceable> ]
     TIMING [ <replaceable class="parameter">boolean</replaceable> ]
     SUMMARY [ <replaceable class="parameter">boolean</replaceable> ]
@@ -206,6 +207,34 @@ ROLLBACK;
     </listitem>
    </varlistentry>

+   <varlistentry>
+    <term><literal>SERIALIZE</literal></term>
+    <listitem>
+     <para>
+      Include information on the cost
+      of <firstterm>serializing</firstterm> the query's output data, that
+      is converting it to text or binary format to send to the client.
+      This can be a significant part of the time required for regular
+      execution of the query, if the datatype output functions are
+      expensive or if <acronym>TOAST</acronym>ed values must be fetched
+      from out-of-line storage. <command>EXPLAIN</command>'s default
+      behavior, <literal>SERIALIZE NONE</literal>, does not perform these
+      conversions. If <literal>SERIALIZE TEXT</literal>
+      or <literal>SERIALIZE BINARY</literal> is specified, the appropriate
+      conversions are performed, and the time spent doing so is measured
+      (unless <literal>TIMING OFF</literal> is specified). If
+      the <literal>BUFFERS</literal> option is also specified, then any
+      buffer accesses involved in the conversions are counted too.
+      In no case, however, will <command>EXPLAIN</command> actually send
+      the resulting data to the client; hence network transmission costs
+      cannot be investigated this way.
+      Serialization may only be enabled when <literal>ANALYZE</literal> is
+      also enabled. If <literal>SERIALIZE</literal> is written without an
+      argument, <literal>TEXT</literal> is assumed.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WAL</literal></term>
     <listitem>
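
The option rules documented in this file (three settings, TEXT assumed when no argument is given, and ANALYZE required) can be illustrated with a small standalone sketch. This is not the parsing code from explain.c, whose diff is not rendered on this page; the type and function names below are hypothetical, and the sketch assumes the argument has already been lower-cased.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the three documented settings; all names here are hypothetical. */
typedef enum SketchSerialize
{
	SKETCH_SERIALIZE_NONE,
	SKETCH_SERIALIZE_TEXT,
	SKETCH_SERIALIZE_BINARY,
	SKETCH_SERIALIZE_INVALID	/* argument not recognized */
} SketchSerialize;

/*
 * Map the option's argument to a setting.  A NULL argument stands for
 * SERIALIZE written without an argument, which the documentation says
 * means TEXT.  Assumption: the caller passes a lower-cased string.
 */
SketchSerialize
sketch_parse_serialize(const char *arg)
{
	if (arg == NULL)
		return SKETCH_SERIALIZE_TEXT;
	if (strcmp(arg, "none") == 0)
		return SKETCH_SERIALIZE_NONE;
	if (strcmp(arg, "text") == 0)
		return SKETCH_SERIALIZE_TEXT;
	if (strcmp(arg, "binary") == 0)
		return SKETCH_SERIALIZE_BINARY;
	return SKETCH_SERIALIZE_INVALID;
}

/*
 * Check the documented restriction: serialization may only be enabled
 * when ANALYZE is also enabled.  SERIALIZE NONE is always acceptable.
 */
bool
sketch_options_valid(SketchSerialize serialize, bool analyze)
{
	assert(serialize != SKETCH_SERIALIZE_INVALID);
	return serialize == SKETCH_SERIALIZE_NONE || analyze;
}
```

Under these assumptions, `sketch_parse_serialize(NULL)` yields the TEXT setting, and `sketch_options_valid(SKETCH_SERIALIZE_TEXT, false)` is false, matching the rule that SERIALIZE needs ANALYZE.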

‎src/backend/access/common/printtup.c

Lines changed: 4 additions & 1 deletion

@@ -294,6 +294,9 @@ printtup_prepare_info(DR_printtup *myState, TupleDesc typeinfo, int numAttrs)

 /* ----------------
  *		printtup --- send a tuple to the client
+ *
+ * Note: if you change this function, see also serializeAnalyzeReceive
+ * in explain.c, which is meant to replicate the computations done here.
  * ----------------
  */
 static bool
@@ -317,7 +320,7 @@ printtup(TupleTableSlot *slot, DestReceiver *self)
 	oldcontext = MemoryContextSwitchTo(myState->tmpcontext);

 	/*
-	 * Prepare a DataRow message (note buffer is in per-row context)
+	 * Prepare a DataRow message (note buffer is in per-query context)
 	 */
 	pq_beginmessage_reuse(buf, 'D');

‎src/backend/commands/explain.c

Lines changed: 409 additions & 2 deletions
Large diffs are not rendered by default.
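
Because the explain.c diff is not shown here, the following standalone sketch only illustrates the general idea of a receiver that serializes each row, records how much output it produced, and then discards it. It is not the commit's serializeAnalyzeReceive. The struct and function names are hypothetical, the columns are restricted to 64-bit integers in text form, and the byte accounting is this sketch's own model of the protocol's DataRow layout (1-byte message type, 4-byte length, 2-byte column count, then a 4-byte length before each value). Whether the commit counts header bytes the same way is not visible on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical metrics holder; not the SerializeMetrics type from the commit. */
typedef struct SketchMetrics
{
	uint64_t	bytes_out;		/* bytes that would have gone to the client */
	uint64_t	rows;			/* rows processed so far */
} SketchMetrics;

/*
 * Convert one row of int64 columns to text in a scratch buffer, add the
 * modeled message size to the metrics, and drop the formatted data.
 */
void
sketch_receive_row(SketchMetrics *m, const int64_t *cols, size_t ncols)
{
	char		buf[32];		/* large enough for any int64 in decimal */

	assert(m != NULL);

	/* modeled per-row header: type byte + length word + column count */
	m->bytes_out += 1 + 4 + 2;

	for (size_t i = 0; i < ncols; i++)
	{
		int			n = snprintf(buf, sizeof(buf), "%lld", (long long) cols[i]);

		assert(n > 0 && (size_t) n < sizeof(buf));

		/* modeled per-column cost: length word + text bytes */
		m->bytes_out += 4 + (uint64_t) n;
	}

	/* buf goes out of scope here: the output is measured, never sent */
	m->rows++;
}
```

A caller would zero-initialize a `SketchMetrics`, invoke `sketch_receive_row` once per result row, and report `bytes_out` at the end, which is roughly what the `output=NkB` figure in the regression output summarizes.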

‎src/backend/tcop/dest.c

Lines changed: 7 additions & 0 deletions

@@ -33,6 +33,7 @@
 #include "access/xact.h"
 #include "commands/copy.h"
 #include "commands/createas.h"
+#include "commands/explain.h"
 #include "commands/matview.h"
 #include "executor/functions.h"
 #include "executor/tqueue.h"
@@ -151,6 +152,9 @@ CreateDestReceiver(CommandDest dest)

 		case DestTupleQueue:
 			return CreateTupleQueueDestReceiver(NULL);
+
+		case DestExplainSerialize:
+			return CreateExplainSerializeDestReceiver(NULL);
 	}

 	/* should never get here */
@@ -186,6 +190,7 @@ EndCommand(const QueryCompletion *qc, CommandDest dest, bool force_undecorated_o
 		case DestSQLFunction:
 		case DestTransientRel:
 		case DestTupleQueue:
+		case DestExplainSerialize:
 			break;
 	}
 }
@@ -231,6 +236,7 @@ NullCommand(CommandDest dest)
 		case DestSQLFunction:
 		case DestTransientRel:
 		case DestTupleQueue:
+		case DestExplainSerialize:
 			break;
 	}
 }
@@ -274,6 +280,7 @@ ReadyForQuery(CommandDest dest)
 		case DestSQLFunction:
 		case DestTransientRel:
 		case DestTupleQueue:
+		case DestExplainSerialize:
 			break;
 	}
 }

‎src/include/commands/explain.h

Lines changed: 10 additions & 0 deletions

@@ -17,6 +17,13 @@
 #include "lib/stringinfo.h"
 #include "parser/parse_node.h"

+typedef enum ExplainSerializeOption
+{
+	EXPLAIN_SERIALIZE_NONE,
+	EXPLAIN_SERIALIZE_TEXT,
+	EXPLAIN_SERIALIZE_BINARY,
+} ExplainSerializeOption;
+
 typedef enum ExplainFormat
 {
 	EXPLAIN_FORMAT_TEXT,
@@ -48,6 +55,7 @@ typedef struct ExplainState
 	bool		memory;			/* print planner's memory usage information */
 	bool		settings;		/* print modified settings */
 	bool		generic;		/* generate a generic plan */
+	ExplainSerializeOption serialize;	/* serialize the query's output? */
 	ExplainFormat format;		/* output format */
 	/* state for output formatting --- not reset for each new plan tree */
 	int			indent;			/* current indentation level */
@@ -132,4 +140,6 @@ extern void ExplainOpenGroup(const char *objtype, const char *labelname,
 extern void ExplainCloseGroup(const char *objtype, const char *labelname,
 							  bool labeled, ExplainState *es);

+extern DestReceiver *CreateExplainSerializeDestReceiver(ExplainState *es);
+
 #endif							/* EXPLAIN_H */

‎src/include/tcop/dest.h

Lines changed: 1 addition & 0 deletions

@@ -96,6 +96,7 @@ typedef enum
 	DestSQLFunction,			/* results sent to SQL-language func mgr */
 	DestTransientRel,			/* results sent to transient relation */
 	DestTupleQueue,				/* results sent to tuple queue */
+	DestExplainSerialize,		/* results are serialized and discarded */
 } CommandDest;

 /* ----------------

‎src/test/regress/expected/explain.out

Lines changed: 53 additions & 1 deletion

@@ -135,7 +135,7 @@ select explain_filter('explain (analyze, buffers, format xml) select * from int8
 </explain>
 (1 row)

-select explain_filter('explain (analyze, buffers, format yaml) select * from int8_tbl i8');
+select explain_filter('explain (analyze, serialize, buffers, format yaml) select * from int8_tbl i8');
 explain_filter
 -------------------------------
 - Plan: +
@@ -175,6 +175,20 @@ select explain_filter('explain (analyze, buffers, format yaml) select * from int
 Temp Written Blocks: N +
 Planning Time: N.N +
 Triggers: +
+Serialization: +
+Time: N.N +
+Output Volume: N +
+Format: "text" +
+Shared Hit Blocks: N +
+Shared Read Blocks: N +
+Shared Dirtied Blocks: N +
+Shared Written Blocks: N +
+Local Hit Blocks: N +
+Local Read Blocks: N +
+Local Dirtied Blocks: N +
+Local Written Blocks: N +
+Temp Read Blocks: N +
+Temp Written Blocks: N +
 Execution Time: N.N
 (1 row)

@@ -639,3 +653,41 @@ select explain_filter('explain (verbose) select * from int8_tbl i8');
 Query Identifier: N
 (3 rows)

+-- Test SERIALIZE option
+select explain_filter('explain (analyze,serialize) select * from int8_tbl i8');
+explain_filter
+-----------------------------------------------------------------------------------------------
+Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
+Planning Time: N.N ms
+Serialization: time=N.N ms output=NkB format=text
+Execution Time: N.N ms
+(4 rows)
+
+select explain_filter('explain (analyze,serialize text,buffers,timing off) select * from int8_tbl i8');
+explain_filter
+---------------------------------------------------------------------------------
+Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual rows=N loops=N)
+Planning Time: N.N ms
+Serialization: output=NkB format=text
+Execution Time: N.N ms
+(4 rows)
+
+select explain_filter('explain (analyze,serialize binary,buffers,timing) select * from int8_tbl i8');
+explain_filter
+-----------------------------------------------------------------------------------------------
+Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
+Planning Time: N.N ms
+Serialization: time=N.N ms output=NkB format=binary
+Execution Time: N.N ms
+(4 rows)
+
+-- this tests an edge case where we have no data to return
+select explain_filter('explain (analyze,serialize) create temp table explain_temp as select * from int8_tbl i8');
+explain_filter
+-----------------------------------------------------------------------------------------------
+Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
+Planning Time: N.N ms
+Serialization: time=N.N ms output=NkB format=text
+Execution Time: N.N ms
+(4 rows)
+

‎src/test/regress/sql/explain.sql

Lines changed: 8 additions & 1 deletion

@@ -66,7 +66,7 @@ select explain_filter('explain (analyze) select * from int8_tbl i8');
 select explain_filter('explain (analyze, verbose) select * from int8_tbl i8');
 select explain_filter('explain (analyze, buffers, format text) select * from int8_tbl i8');
 select explain_filter('explain (analyze, buffers, format xml) select * from int8_tbl i8');
-select explain_filter('explain (analyze, buffers, format yaml) select * from int8_tbl i8');
+select explain_filter('explain (analyze, serialize, buffers, format yaml) select * from int8_tbl i8');
 select explain_filter('explain (buffers, format text) select * from int8_tbl i8');
 select explain_filter('explain (buffers, format json) select * from int8_tbl i8');

@@ -162,3 +162,10 @@ select explain_filter('explain (verbose) select * from t1 where pg_temp.mysin(f1
 -- Test compute_query_id
 set compute_query_id = on;
 select explain_filter('explain (verbose) select * from int8_tbl i8');
+
+-- Test SERIALIZE option
+select explain_filter('explain (analyze,serialize) select * from int8_tbl i8');
+select explain_filter('explain (analyze,serialize text,buffers,timing off) select * from int8_tbl i8');
+select explain_filter('explain (analyze,serialize binary,buffers,timing) select * from int8_tbl i8');
+-- this tests an edge case where we have no data to return
+select explain_filter('explain (analyze,serialize) create temp table explain_temp as select * from int8_tbl i8');

‎src/tools/pgindent/typedefs.list

Lines changed: 3 additions & 0 deletions

@@ -713,6 +713,7 @@ ExplainForeignModify_function
 ExplainForeignScan_function
 ExplainFormat
 ExplainOneQuery_hook_type
+ExplainSerializeOption
 ExplainState
 ExplainStmt
 ExplainWorkersState
@@ -2536,6 +2537,8 @@ SerCommitSeqNo
 SerialControl
 SerialIOData
 SerializableXactHandle
+SerializeDestReceiver
+SerializeMetrics
 SerializedActiveRelMaps
 SerializedClientConnectionInfo
 SerializedRanges
