
Commit 115464b

doc: add section about heap-only tuples (HOT)
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 11
1 parent 50e088d commit 115464b

9 files changed, +86 -11 lines changed

doc/src/sgml/acronyms.sgml

+1 -3
@@ -299,9 +299,7 @@
 <term><acronym>HOT</acronym></term>
 <listitem>
 <para>
-<ulink
-url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD">Heap-Only
-Tuples</ulink>
+<link linkend="storage-hot">Heap-Only Tuples</link>
 </para>
 </listitem>
 </varlistentry>

doc/src/sgml/btree.sgml

+2 -1
@@ -639,7 +639,8 @@ options(<replaceable>relopts</replaceable> <type>local_relopts *</type>) returns
 accumulate and adversely affect query latency and throughput. This
 typically occurs with <command>UPDATE</command>-heavy workloads
 where most individual updates cannot apply the
-<acronym>HOT</acronym> optimization. Changing the value of only
+<link linkend="storage-hot"><acronym>HOT</acronym> optimization.</link>
+Changing the value of only
 one column covered by one index during an <command>UPDATE</command>
 <emphasis>always</emphasis> necessitates a new set of index tuples
 &mdash; one for <emphasis>each and every</emphasis> index on the

doc/src/sgml/catalogs.sgml

+1 -1
@@ -4381,7 +4381,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 <para>
 If true, queries must not use the index until the <structfield>xmin</structfield>
 of this <structname>pg_index</structname> row is below their <symbol>TransactionXmin</symbol>
-event horizon, because the table may contain broken HOT chains with
+event horizon, because the table may contain broken <link linkend="storage-hot">HOT chains</link> with
 incompatible rows that they can see
 </para></entry>
 </row>

doc/src/sgml/config.sgml

+2 -1
@@ -4491,7 +4491,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
 <listitem>
 <para>
 Specifies the number of transactions by which <command>VACUUM</command> and
-<acronym>HOT</acronym> updates will defer cleanup of dead row versions. The
+<link linkend="storage-hot"><acronym>HOT</acronym> updates</link>
+will defer cleanup of dead row versions. The
 default is zero transactions, meaning that dead row versions can be
 removed as soon as possible, that is, as soon as they are no longer
 visible to any open transaction. You may wish to set this to a

doc/src/sgml/indexam.sgml

+2 -1
@@ -45,7 +45,8 @@
 extant versions of the same logical row; to an index, each tuple is
 an independent object that needs its own index entry. Thus, an
 update of a row always creates all-new index entries for the row, even if
-the key values did not change. (HOT tuples are an exception to this
+the key values did not change. (<link linkend="storage-hot">HOT
+tuples</link> are an exception to this
 statement; but indexes do not deal with those, either.) Index entries for
 dead tuples are reclaimed (by vacuuming) when the dead tuples themselves
 are reclaimed.

doc/src/sgml/indices.sgml

+4 -2
@@ -103,7 +103,9 @@ CREATE INDEX test1_id_index ON test1 (id);

 <para>
 After an index is created, the system has to keep it synchronized with the
-table. This adds overhead to data manipulation operations.
+table. This adds overhead to data manipulation operations. Indexes can
+also prevent the creation of <link linkend="storage-hot">heap-only
+tuples</link>.
 Therefore indexes that are seldom or never used in queries
 should be removed.
 </para>
@@ -749,7 +751,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
 <para>
 Index expressions are relatively expensive to maintain, because the
 derived expression(s) must be computed for each row insertion
-and non-HOT update. However, the index expressions are
+and <link linkend="storage-hot">non-HOT update.</link> However, the index expressions are
 <emphasis>not</emphasis> recomputed during an indexed search, since they are
 already stored in the index. In both examples above, the system
 sees the query as just <literal>WHERE indexedcolumn = 'constant'</literal>

doc/src/sgml/monitoring.sgml

+1 -1
@@ -4426,7 +4426,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 <structfield>n_tup_upd</structfield> <type>bigint</type>
 </para>
 <para>
-Number of rows updated (includes HOT updated rows)
+Number of rows updated (includes <link linkend="storage-hot">HOT updated rows</link>)
 </para></entry>
 </row>
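Because n_tup_upd includes HOT-updated rows, the share of updates that took the HOT path can be derived from it together with the n_tup_hot_upd column of the same view. The following is a minimal sketch, assuming both counters have already been read for a single table; the function name hot_update_ratio is hypothetical and not part of PostgreSQL.

```python
from typing import Optional


def hot_update_ratio(n_tup_upd: int, n_tup_hot_upd: int) -> Optional[float]:
    """Return the fraction of row updates that were HOT updates.

    n_tup_upd already counts HOT updates, so it serves as the denominator.
    Returns None when the table has seen no updates at all.
    """
    if n_tup_upd < 0 or n_tup_hot_upd < 0:
        raise ValueError("counters must be non-negative")
    if n_tup_upd == 0:
        return None
    return n_tup_hot_upd / n_tup_upd


if __name__ == "__main__":
    # Hypothetical counter values, chosen only for illustration.
    print(hot_update_ratio(200, 150))
```

A low ratio on an update-heavy table may suggest that updates touch indexed columns or that pages lack free space, though the counters alone cannot say which.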

doc/src/sgml/ref/create_table.sgml

+3 -1
@@ -1435,7 +1435,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 to the indicated percentage; the remaining space on each page is
 reserved for updating rows on that page. This gives <command>UPDATE</command>
 a chance to place the updated copy of a row on the same page as the
-original, which is more efficient than placing it on a different page.
+original, which is more efficient than placing it on a different
+page, and makes <link linkend="storage-hot">heap-only tuple
+updates</link> more likely.
 For a table whose entries are never updated, complete packing is the
 best choice, but in heavily updated tables smaller fillfactors are
 appropriate. This parameter cannot be set for TOAST tables.
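To get a feel for how much room a given fillfactor leaves for same-page updates, the reserved space can be roughly estimated. This is only a back-of-the-envelope sketch: it assumes the default 8192-byte page size, ignores the page header and item identifiers, and the function name reserved_bytes_per_page is hypothetical.

```python
def reserved_bytes_per_page(fillfactor: int, page_size: int = 8192) -> int:
    """Approximate the bytes per page that inserts leave free for later updates.

    fillfactor is a percentage; tables accept values from 10 to 100.
    Page header and item identifier overhead are deliberately ignored.
    """
    if not 10 <= fillfactor <= 100:
        raise ValueError("table fillfactor must be between 10 and 100")
    return page_size * (100 - fillfactor) // 100


if __name__ == "__main__":
    for ff in (100, 90, 70):
        print(ff, reserved_bytes_per_page(ff))
```

Under these assumptions a fillfactor of 100 reserves nothing, so a HOT update on a freshly filled page depends on space freed by earlier cleanup.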

doc/src/sgml/storage.sgml

+70
@@ -1075,4 +1075,74 @@ data. Empty in ordinary tables.</entry>
 </sect2>
 </sect1>

+<sect1 id="storage-hot">
+
+<title>Heap-Only Tuples (<acronym>HOT</acronym>)</title>
+
+<para>
+To allow for high concurrency, <productname>PostgreSQL</productname>
+uses <link linkend="mvcc-intro">multiversion concurrency
+control</link> (<acronym>MVCC</acronym>) to store rows. However,
+<acronym>MVCC</acronym> has some downsides for update queries.
+Specifically, updates require new versions of rows to be added to
+tables. This can also require new index entries for each updated row,
+and removal of old versions of rows and their index entries can be
+expensive.
+</para>
+
+<para>
+To help reduce the overhead of updates,
+<productname>PostgreSQL</productname> has an optimization called
+heap-only tuples (<acronym>HOT</acronym>). This optimization is
+possible when:
+
+<itemizedlist>
+<listitem>
+<para>
+The update does not modify any columns referenced by the table's
+indexes, including expression and partial indexes.
+</para>
+</listitem>
+<listitem>
+<para>
+There is sufficient free space on the page containing the old row
+for the updated row.
+</para>
+</listitem>
+</itemizedlist>
+
+In such cases, heap-only tuples provide two optimizations:
+
+<itemizedlist>
+<listitem>
+<para>
+New index entries are not needed to represent updated rows.
+</para>
+</listitem>
+<listitem>
+<para>
+Old versions of updated rows can be completely removed during normal
+operation, including <command>SELECT</command>s, instead of requiring
+periodic vacuum operations. (This is possible because indexes
+do not reference their <link linkend="storage-page-layout">page
+item identifiers</link>.)
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+<para>
+In summary, heap-only tuple updates can only be created
+if columns used by indexes are not updated. You can
+increase the likelihood of sufficient page space for
+<acronym>HOT</acronym> updates by decreasing a table's <link
+linkend="sql-createtable"><literal>fillfactor</literal></link>.
+If you don't, <acronym>HOT</acronym> updates will still happen because
+new rows will naturally migrate to new pages and existing pages with
+sufficient free space for new row versions. The system view <link
+linkend="monitoring-pg-stat-all-tables-view">pg_stat_all_tables</link>
+allows monitoring of the occurrence of HOT and non-HOT updates.
+</para>
+</sect1>
+
 </chapter>
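The two conditions listed in the new storage.sgml section can be read as a single predicate. The sketch below is a toy model of that wording only, not PostgreSQL's actual implementation; the function name is_hot_eligible and all of its parameters are hypothetical, and indexed_columns is assumed to be the union of every column referenced by any index on the table, including columns used in index expressions and partial-index predicates.

```python
from typing import Iterable


def is_hot_eligible(
    changed_columns: Iterable[str],
    indexed_columns: Iterable[str],
    page_free_bytes: int,
    new_tuple_bytes: int,
) -> bool:
    """Toy model of the two HOT conditions described in the new section.

    Condition 1: the update modifies no column referenced by any index.
    Condition 2: the page holding the old row has room for the new version.
    """
    touches_index = bool(set(changed_columns) & set(indexed_columns))
    fits_on_page = new_tuple_bytes <= page_free_bytes
    return (not touches_index) and fits_on_page


if __name__ == "__main__":
    indexed = {"id", "email"}  # hypothetical indexed columns
    print(is_hot_eligible({"balance"}, indexed, 400, 120))  # no indexed column changed, fits
    print(is_hot_eligible({"email"}, indexed, 400, 120))    # indexed column changed
    print(is_hot_eligible({"balance"}, indexed, 64, 120))   # not enough free space
```

Both conditions must hold at once: an update that avoids every indexed column still cannot be HOT if the new row version has to go to another page.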
