Commit 3785f7e

Doc: move info for btree opclass implementors into main documentation.
Up to now, useful info for writing a new btree opclass has been buried in the backend's nbtree/README file. Let's move it into the SGML docs, in preparation for extending it with info about "in_range" functions in the upcoming window RANGE patch.

To do this, I chose to create a new chapter for btree indexes in Part VII (Internals), parallel to the chapters that exist for the newer index AMs. This is a pretty short chapter as-is. At some point somebody might care to flesh it out with more detail about btree internals, but that is beyond the scope of my ambition for today.

Discussion: https://postgr.es/m/[email protected]
1 parent f069c91 commit 3785f7e

5 files changed

+276
-61
lines changed

doc/src/sgml/btree.sgml

+267
@@ -0,0 +1,267 @@
1+
<!-- doc/src/sgml/btree.sgml -->
2+
3+
<chapter id="btree">
4+
<title>B-Tree Indexes</title>
5+
6+
<indexterm>
7+
<primary>index</primary>
8+
<secondary>B-Tree</secondary>
9+
</indexterm>
10+
11+
<sect1 id="btree-intro">
12+
<title>Introduction</title>
13+
14+
<para>
15+
<productname>PostgreSQL</productname> includes an implementation of the
16+
standard <acronym>btree</acronym> (multi-way balanced tree) index data
17+
structure. Any data type that can be sorted into a well-defined linear
18+
order can be indexed by a btree index. The only limitation is that an
19+
index entry cannot exceed approximately one-third of a page (after TOAST
20+
compression, if applicable).
21+
</para>
22+
23+
<para>
24+
Because each btree operator class imposes a sort order on its data type,
25+
btree operator classes (or, really, operator families) have come to be
26+
used as <productname>PostgreSQL</productname>'s general representation
27+
and understanding of sorting semantics. Therefore, they've acquired
28+
some features that go beyond what would be needed just to support btree
29+
indexes, and parts of the system that are quite distant from the
30+
btree AM make use of them.
31+
</para>
32+
33+
</sect1>
34+
35+
<sect1 id="btree-behavior">
36+
<title>Behavior of B-Tree Operator Classes</title>
37+
38+
<para>
39+
As shown in <xref linkend="xindex-btree-strat-table"/>, a btree operator
40+
class must provide five comparison operators,
41+
<literal>&lt;</literal>,
42+
<literal>&lt;=</literal>,
43+
<literal>=</literal>,
44+
<literal>&gt;=</literal> and
45+
<literal>&gt;</literal>.
46+
One might expect that <literal>&lt;&gt;</literal> should also be part of
47+
the operator class, but it is not, because it would almost never be
48+
useful to use a <literal>&lt;&gt;</literal> WHERE clause in an index
49+
search. (For some purposes, the planner treats <literal>&lt;&gt;</literal>
50+
as associated with a btree operator class; but it finds that operator via
51+
the <literal>=</literal> operator's negator link, rather than
52+
from <structname>pg_amop</structname>.)
53+
</para>
54+
55+
<para>
56+
When several data types share near-identical sorting semantics, their
57+
operator classes can be grouped into an operator family. Doing so is
58+
advantageous because it allows the planner to make deductions about
59+
cross-type comparisons. Each operator class within the family should
60+
contain the single-type operators (and associated support functions)
61+
for its input data type, while cross-type comparison operators and
62+
support functions are <quote>loose</quote> in the family. It is
63+
recommended that a complete set of cross-type operators be included
64+
in the family, thus ensuring that the planner can represent any
65+
comparison conditions that it deduces from transitivity.
66+
</para>
67+
68+
<para>
69+
There are some basic assumptions that a btree operator family must
70+
satisfy:
71+
</para>
72+
73+
<itemizedlist>
74+
<listitem>
75+
<para>
76+
An <literal>=</literal> operator must be an equivalence relation; that
77+
is, for all non-null values <replaceable>A</replaceable>,
78+
<replaceable>B</replaceable>, <replaceable>C</replaceable> of the
79+
data type:
80+
81+
<itemizedlist>
82+
<listitem>
83+
<para>
84+
<replaceable>A</replaceable> <literal>=</literal>
85+
<replaceable>A</replaceable> is true
86+
(<firstterm>reflexive law</firstterm>)
87+
</para>
88+
</listitem>
89+
<listitem>
90+
<para>
91+
if <replaceable>A</replaceable> <literal>=</literal>
92+
<replaceable>B</replaceable>,
93+
then <replaceable>B</replaceable> <literal>=</literal>
94+
<replaceable>A</replaceable>
95+
(<firstterm>symmetric law</firstterm>)
96+
</para>
97+
</listitem>
98+
<listitem>
99+
<para>
100+
if <replaceable>A</replaceable> <literal>=</literal>
101+
<replaceable>B</replaceable> and <replaceable>B</replaceable>
102+
<literal>=</literal> <replaceable>C</replaceable>,
103+
then <replaceable>A</replaceable> <literal>=</literal>
104+
<replaceable>C</replaceable>
105+
(<firstterm>transitive law</firstterm>)
106+
</para>
107+
</listitem>
108+
</itemizedlist>
109+
</para>
110+
</listitem>
111+
112+
<listitem>
113+
<para>
114+
A <literal>&lt;</literal> operator must be a strong ordering relation;
115+
that is, for all non-null values <replaceable>A</replaceable>,
116+
<replaceable>B</replaceable>, <replaceable>C</replaceable>:
117+
118+
<itemizedlist>
119+
<listitem>
120+
<para>
121+
<replaceable>A</replaceable> <literal>&lt;</literal>
122+
<replaceable>A</replaceable> is false
123+
(<firstterm>irreflexive law</firstterm>)
124+
</para>
125+
</listitem>
126+
<listitem>
127+
<para>
128+
if <replaceable>A</replaceable> <literal>&lt;</literal>
129+
<replaceable>B</replaceable>
130+
and <replaceable>B</replaceable> <literal>&lt;</literal>
131+
<replaceable>C</replaceable>,
132+
then <replaceable>A</replaceable> <literal>&lt;</literal>
133+
<replaceable>C</replaceable>
134+
(<firstterm>transitive law</firstterm>)
135+
</para>
136+
</listitem>
137+
</itemizedlist>
138+
</para>
139+
</listitem>
140+
141+
<listitem>
142+
<para>
143+
Furthermore, the ordering is total; that is, for all non-null
144+
values <replaceable>A</replaceable>, <replaceable>B</replaceable>:
145+
146+
<itemizedlist>
147+
<listitem>
148+
<para>
149+
exactly one of <replaceable>A</replaceable> <literal>&lt;</literal>
150+
<replaceable>B</replaceable>, <replaceable>A</replaceable>
151+
<literal>=</literal> <replaceable>B</replaceable>, and
152+
<replaceable>B</replaceable> <literal>&lt;</literal>
153+
<replaceable>A</replaceable> is true
154+
(<firstterm>trichotomy law</firstterm>)
155+
</para>
156+
</listitem>
157+
</itemizedlist>
158+
159+
(The trichotomy law justifies the definition of the comparison support
160+
function, of course.)
161+
</para>
162+
</listitem>
163+
</itemizedlist>
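Review note: the laws listed above can be checked mechanically over a sample of values. The following is only a minimal sketch in Python, not PostgreSQL code; `check_btree_laws` and `fuzzy_eq` are hypothetical names introduced here for illustration, and a passing check on a sample does not prove the laws hold for the whole data type.

```python
import operator
from itertools import product


def check_btree_laws(values, eq, lt):
    """Return a list of law violations found over the sample `values`.

    `eq` and `lt` stand in for the operator class's = and < operators.
    """
    violations = []
    for a in values:
        if not eq(a, a):
            violations.append(("reflexive", a))
        if lt(a, a):
            violations.append(("irreflexive", a))
    for a, b in product(values, repeat=2):
        if eq(a, b) and not eq(b, a):
            violations.append(("symmetric", a, b))
        # Trichotomy: exactly one of A < B, A = B, B < A is true.
        if [lt(a, b), eq(a, b), lt(b, a)].count(True) != 1:
            violations.append(("trichotomy", a, b))
    for a, b, c in product(values, repeat=3):
        if eq(a, b) and eq(b, c) and not eq(a, c):
            violations.append(("transitive =", a, b, c))
        if lt(a, b) and lt(b, c) and not lt(a, c):
            violations.append(("transitive <", a, b, c))
    return violations


# Ordinary integer comparison satisfies every law on this sample.
int_violations = check_btree_laws([1, 2, 3], operator.eq, operator.lt)


# A hypothetical "fuzzy" equality (equal if within 1) is not transitive:
# 0 = 1 and 1 = 2 hold, but 0 = 2 does not.
def fuzzy_eq(a, b):
    return abs(a - b) <= 1


fuzzy_violations = check_btree_laws([0, 1, 2], fuzzy_eq, operator.lt)
```

With the fuzzy equality, the checker reports both a transitivity failure and a trichotomy failure, because `0 < 1` and `0 = 1` are true at the same time.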
164+
165+
<para>
166+
The other three operators are defined in terms of <literal>=</literal>
167+
and <literal>&lt;</literal> in the obvious way, and must act consistently
168+
with them.
169+
</para>
170+
171+
<para>
172+
For an operator family supporting multiple data types, the above laws must
173+
hold when <replaceable>A</replaceable>, <replaceable>B</replaceable>,
174+
<replaceable>C</replaceable> are taken from any data types in the family.
175+
The transitive laws are the trickiest to ensure, as in cross-type
176+
situations they represent statements that the behaviors of two or three
177+
different operators are consistent.
178+
As an example, it would not work to put <type>float8</type>
179+
and <type>numeric</type> into the same operator family, at least not with
180+
the current semantics that <type>numeric</type> values are converted
181+
to <type>float8</type> for comparison to a <type>float8</type>. Because
182+
of the limited accuracy of <type>float8</type>, this means there are
183+
distinct <type>numeric</type> values that will compare equal to the
184+
same <type>float8</type> value, and thus the transitive law would fail.
185+
</para>
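Review note: the float8/numeric problem described above can be reproduced outside the database. The sketch below is an assumption-laden model, not PostgreSQL's implementation: Python's `Decimal` stands in for <type>numeric</type>, Python's `float` (an IEEE 754 double) stands in for <type>float8</type>, and `numeric_eq_float8` is a hypothetical helper that models the cross-type comparison as "convert to float8, then compare".

```python
from decimal import Decimal


def numeric_eq_float8(n: Decimal, f: float) -> bool:
    """Model cross-type '=': convert the numeric to float8, then compare."""
    return float(n) == f


# Two distinct numeric values that a double cannot tell apart:
# 2**53 + 1 is not representable and rounds to 2**53.
n1 = Decimal(2**53)
n2 = Decimal(2**53 + 1)
f = float(2**53)

a_eq_b = numeric_eq_float8(n1, f)   # numeric n1 = float8 f
b_eq_c = numeric_eq_float8(n2, f)   # float8 f = numeric n2
a_eq_c = (n1 == n2)                 # numeric n1 = numeric n2 (exact)
```

Here `a_eq_b` and `b_eq_c` are both true while `a_eq_c` is false, which is exactly the failure of the transitive law that rules out placing the two types in one operator family.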
186+
187+
<para>
188+
Another requirement for a multiple-data-type family is that any implicit
189+
or binary-coercion casts that are defined between data types included in
190+
the operator family must not change the associated sort ordering.
191+
</para>
192+
193+
<para>
194+
It should be fairly clear why a btree index requires these laws to hold
195+
within a single data type: without them there is no ordering to arrange
196+
the keys with. Also, index searches using a comparison key of a
197+
different data type require comparisons to behave sanely across two
198+
data types. The extensions to three or more data types within a family
199+
are not strictly required by the btree index mechanism itself, but the
200+
planner relies on them for optimization purposes.
201+
</para>
202+
203+
</sect1>
204+
205+
<sect1 id="btree-support-funcs">
206+
<title>B-Tree Support Functions</title>
207+
208+
<para>
209+
As shown in <xref linkend="xindex-btree-support-table"/>, btree defines
210+
one required and one optional support function.
211+
</para>
212+
213+
<para>
214+
For each combination of data types that a btree operator family provides
215+
comparison operators for, it must provide a comparison support function,
216+
registered in <structname>pg_amproc</structname> with support function
217+
number 1 and
218+
<structfield>amproclefttype</structfield>/<structfield>amprocrighttype</structfield>
219+
equal to the left and right data types for the comparison (i.e., the
220+
same data types that the matching operators are registered with
221+
in <structname>pg_amop</structname>).
222+
The comparison function must take two non-null values
223+
<replaceable>A</replaceable> and <replaceable>B</replaceable> and
224+
return an <type>int32</type> value that
225+
is <literal>&lt;</literal> <literal>0</literal>, <literal>0</literal>,
226+
or <literal>&gt;</literal> <literal>0</literal>
227+
when <replaceable>A</replaceable> <literal>&lt;</literal>
228+
<replaceable>B</replaceable>, <replaceable>A</replaceable>
229+
<literal>=</literal> <replaceable>B</replaceable>,
230+
or <replaceable>A</replaceable> <literal>&gt;</literal>
231+
<replaceable>B</replaceable>, respectively. The function must not
232+
return <literal>INT_MIN</literal> for the <replaceable>A</replaceable>
233+
<literal>&lt;</literal> <replaceable>B</replaceable> case,
234+
since the value may be negated before being tested for sign. A null
235+
result is disallowed, too.
236+
See <filename>src/backend/access/nbtree/nbtcompare.c</filename> for
237+
examples.
238+
</para>
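Review note: the ban on returning <literal>INT_MIN</literal> can be illustrated with a small model. This is a Python sketch, not the C code in <filename>nbtcompare.c</filename>; `negate_int32`, `cmp_int`, and `sign` are hypothetical helpers, and the 32-bit two's-complement wraparound is an assumption about typical hardware (in C, overflowing a signed negation is formally undefined behavior).

```python
INT_MIN = -2**31


def negate_int32(x: int) -> int:
    """Negate x with 32-bit two's-complement wraparound (modeled assumption)."""
    return ((-x + 2**31) % 2**32) - 2**31


def sign(x: int) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def cmp_int(a: int, b: int) -> int:
    """Three-way comparison that only ever returns -1, 0, or 1."""
    return (a > b) - (a < b)


# A caller that wants the reverse ordering may negate the result
# before testing its sign. That works for a well-behaved result:
reversed_ok = sign(negate_int32(cmp_int(1, 2)))

# But if a comparison function returned INT_MIN for "A < B", negation
# would wrap back to INT_MIN and the sign would stay negative.
reversed_bad = sign(negate_int32(INT_MIN))
```

In this model `reversed_ok` is positive as intended, while `reversed_bad` is still negative, so the reversed comparison would report the wrong order. Returning only small values such as -1, 0, and 1 sidesteps the problem.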
239+
240+
<para>
241+
If the compared values are of a collatable data type, the appropriate
242+
collation OID will be passed to the comparison support function, using
243+
the standard <function>PG_GET_COLLATION()</function> mechanism.
244+
</para>
245+
246+
<para>
247+
Optionally, a btree operator family may provide <firstterm>sort
248+
support</firstterm> function(s), registered under support function number
249+
2. These functions allow implementing comparisons for sorting purposes
250+
in a more efficient way than naively calling the comparison support
251+
function. The APIs involved in this are defined in
252+
<filename>src/include/utils/sortsupport.h</filename>.
253+
</para>
254+
255+
</sect1>
256+
257+
<sect1 id="btree-implementation">
258+
<title>Implementation</title>
259+
260+
<para>
261+
An introduction to the btree index implementation can be found in
262+
<filename>src/backend/access/nbtree/README</filename>.
263+
</para>
264+
265+
</sect1>
266+
267+
</chapter>

doc/src/sgml/filelist.sgml

+1
@@ -83,6 +83,7 @@
8383
<!ENTITY bki SYSTEM "bki.sgml">
8484
<!ENTITY catalogs SYSTEM "catalogs.sgml">
8585
<!ENTITY geqo SYSTEM "geqo.sgml">
86+
<!ENTITY btree SYSTEM "btree.sgml">
8687
<!ENTITY gist SYSTEM "gist.sgml">
8788
<!ENTITY spgist SYSTEM "spgist.sgml">
8889
<!ENTITY gin SYSTEM "gin.sgml">

doc/src/sgml/postgres.sgml

+1
@@ -252,6 +252,7 @@
252252
&geqo;
253253
&indexam;
254254
&generic-wal;
255+
&btree;
255256
&gist;
256257
&spgist;
257258
&gin;

doc/src/sgml/xindex.sgml

+7-8
@@ -35,7 +35,7 @@
3535
<productname>PostgreSQL</productname>, but all index methods are
3636
described in <classname>pg_am</classname>. It is possible to add a
3737
new index access method by writing the necessary code and
38-
then creating a row in <classname>pg_am</classname> &mdash; but that is
38+
then creating an entry in <classname>pg_am</classname> &mdash; but that is
3939
beyond the scope of this chapter (see <xref linkend="indexam"/>).
4040
</para>
4141

@@ -404,6 +404,8 @@
404404
B-trees require a single support function, and allow a second one to be
405405
supplied at the operator class author's option, as shown in <xref
406406
linkend="xindex-btree-support-table"/>.
407+
The requirements for these support functions are explained further in
408+
<xref linkend="btree-support-funcs"/>.
407409
</para>
408410

409411
<table tocentry="1" id="xindex-btree-support-table">
@@ -426,8 +428,8 @@
426428
</row>
427429
<row>
428430
<entry>
429-
Return the addresses of C-callable sort support function(s),
430-
as documented in <filename>utils/sortsupport.h</filename> (optional)
431+
Return the addresses of C-callable sort support function(s)
432+
(optional)
431433
</entry>
432434
<entry>2</entry>
433435
</row>
@@ -1056,11 +1058,8 @@ ALTER OPERATOR FAMILY integer_ops USING btree ADD
10561058

10571059
<para>
10581060
In a B-tree operator family, all the operators in the family must sort
1059-
compatibly, meaning that the transitive laws hold across all the data types
1060-
supported by the family: <quote>if A = B and B = C, then A = C</quote>,
1061-
and <quote>if A &lt; B and B &lt; C, then A &lt; C</quote>. Moreover, implicit
1062-
or binary coercion casts between types represented in the operator family
1063-
must not change the associated sort ordering. For each
1061+
compatibly, as is specified in detail in <xref linkend="btree-behavior"/>.
1062+
For each
10641063
operator in the family there must be a support function having the same
10651064
two input data types as the operator. It is recommended that a family be
10661065
complete, i.e., for each combination of data types, all operators are

src/backend/access/nbtree/README

-53
@@ -623,56 +623,3 @@ routines must treat it accordingly. The actual key stored in the
623623
item is irrelevant, and need not be stored at all. This arrangement
624624
corresponds to the fact that an L&Y non-leaf page has one more pointer
625625
than key.
626-
627-
Notes to Operator Class Implementors
628-
------------------------------------
629-
630-
With this implementation, we require each supported combination of
631-
datatypes to supply us with a comparison procedure via pg_amproc.
632-
This procedure must take two nonnull values A and B and return an int32 < 0,
633-
0, or > 0 if A < B, A = B, or A > B, respectively. The procedure must
634-
not return INT_MIN for "A < B", since the value may be negated before
635-
being tested for sign. A null result is disallowed, too. See nbtcompare.c
636-
for examples.
637-
638-
There are some basic assumptions that a btree operator family must satisfy:
639-
640-
An = operator must be an equivalence relation; that is, for all non-null
641-
values A,B,C of the datatype:
642-
643-
A = A is true reflexive law
644-
if A = B, then B = A symmetric law
645-
if A = B and B = C, then A = C transitive law
646-
647-
A < operator must be a strong ordering relation; that is, for all non-null
648-
values A,B,C:
649-
650-
A < A is false irreflexive law
651-
if A < B and B < C, then A < C transitive law
652-
653-
Furthermore, the ordering is total; that is, for all non-null values A,B:
654-
655-
exactly one of A < B, A = B, and B < A is true trichotomy law
656-
657-
(The trichotomy law justifies the definition of the comparison support
658-
procedure, of course.)
659-
660-
The other three operators are defined in terms of these two in the obvious way,
661-
and must act consistently with them.
662-
663-
For an operator family supporting multiple datatypes, the above laws must hold
664-
when A,B,C are taken from any datatypes in the family. The transitive laws
665-
are the trickiest to ensure, as in cross-type situations they represent
666-
statements that the behaviors of two or three different operators are
667-
consistent. As an example, it would not work to put float8 and numeric into
668-
an opfamily, at least not with the current semantics that numerics are
669-
converted to float8 for comparison to a float8. Because of the limited
670-
accuracy of float8, this means there are distinct numeric values that will
671-
compare equal to the same float8 value, and thus the transitive law fails.
672-
673-
It should be fairly clear why a btree index requires these laws to hold within
674-
a single datatype: without them there is no ordering to arrange the keys with.
675-
Also, index searches using a key of a different datatype require comparisons
676-
to behave sanely across two datatypes. The extensions to three or more
677-
datatypes within a family are not strictly required by the btree index
678-
mechanism itself, but the planner relies on them for optimization purposes.
