Commit 16fa9b2

Add support for building GiST index by sorting.

This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data to the order defined by the sortsupport's comparator function, and packing the tuples in that order to GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, up to 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space.

The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data.

As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates.

Author: Andrey Borodin
Reviewed-by: Pavel Borisov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
1 parent 089da3c commit 16fa9b2
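
The Z-order (Morton code) interleaving described in the commit message can be illustrated with a minimal sketch. It assumes the X and Y coordinates have already been mapped to unsigned 32-bit integers whose numeric order matches the intended coordinate order; the names `spread_bits32`, `morton_code` and `morton_cmp` are hypothetical and are not taken from the patch, which may differ in detail.

```c
#include <stdint.h>

/*
 * Spread the 32 bits of x over the even bit positions of a 64-bit word,
 * leaving a zero bit between each pair of original bits.
 */
static uint64_t
spread_bits32(uint32_t x)
{
    uint64_t n = x;

    n = (n | (n << 16)) & UINT64_C(0x0000FFFF0000FFFF);
    n = (n | (n << 8)) & UINT64_C(0x00FF00FF00FF00FF);
    n = (n | (n << 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    n = (n | (n << 2)) & UINT64_C(0x3333333333333333);
    n = (n | (n << 1)) & UINT64_C(0x5555555555555555);
    return n;
}

/* Interleave bits: x occupies the even positions, y the odd positions. */
static uint64_t
morton_code(uint32_t x, uint32_t y)
{
    return spread_bits32(x) | (spread_bits32(y) << 1);
}

/* Three-way comparison of two points by their Morton codes. */
static int
morton_cmp(uint32_t x1, uint32_t y1, uint32_t x2, uint32_t y2)
{
    uint64_t z1 = morton_code(x1, y1);
    uint64_t z2 = morton_code(x2, y2);

    return (z1 > z2) - (z1 < z2);
}
```

Points that are close in both coordinates tend to share high-order bits of the interleaved code, which is why sorting by it tends to keep neighbouring points on the same page. Comparing the 64-bit codes directly, rather than subtracting them, avoids overflow in the comparator's return value.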

17 files changed

+935
-106
lines changed

doc/src/sgml/gist.sgml

+70
@@ -259,6 +259,8 @@ CREATE INDEX ON my_table USING GIST (my_inet_column inet_ops);
 <function>compress</function> method is omitted. The optional tenth method
 <function>options</function> is needed if the operator class provides
 the user-specified parameters.
+The <function>sortsupport</function> method is also optional and is used to
+speed up building a <acronym>GiST</acronym> index.
 </para>
 
 <variablelist>
@@ -1065,6 +1067,74 @@ my_compress(PG_FUNCTION_ARGS)
 </para>
 </listitem>
 </varlistentry>
+
+<varlistentry>
+<term><function>sortsupport</function></term>
+<listitem>
+<para>
+Returns a comparator function to sort data in a way that preserves
+locality. It is used by <command>CREATE INDEX</command> and
+<command>REINDEX</command> commands. The quality of the created index
+depends on how well the sort order determined by the comparator function
+preserves locality of the inputs.
+</para>
+<para>
+The <function>sortsupport</function> method is optional. If it is not
+provided, <command>CREATE INDEX</command> builds the index by inserting
+each tuple to the tree using the <function>penalty</function> and
+<function>picksplit</function> functions, which is much slower.
+</para>
+
+<para>
+The <acronym>SQL</acronym> declaration of the function must look like
+this:
+
+<programlisting>
+CREATE OR REPLACE FUNCTION my_sortsupport(internal)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+</programlisting>
+
+The argument is a pointer to a <structname>SortSupport</structname>
+struct. At a minimum, the function must fill in its comparator field.
+The comparator takes three arguments: two Datums to compare, and
+a pointer to the <structname>SortSupport</structname> struct. The
+Datums are the two indexed values in the format that they are stored
+in the index; that is, in the format returned by the
+<function>compress</function> method. The full API is defined in
+<filename>src/include/utils/sortsupport.h</filename>.
+</para>
+
+<para>
+The matching code in the C module could then follow this skeleton:
+
+<programlisting>
+PG_FUNCTION_INFO_V1(my_sortsupport);
+
+static int
+my_fastcmp(Datum x, Datum y, SortSupport ssup)
+{
+    /* establish order between x and y by computing some sorting value z */
+
+    int z1 = ComputeSpatialCode(x);
+    int z2 = ComputeSpatialCode(y);
+
+    return z1 == z2 ? 0 : z1 > z2 ? 1 : -1;
+}
+
+Datum
+my_sortsupport(PG_FUNCTION_ARGS)
+{
+    SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);
+
+    ssup->comparator = my_fastcmp;
+    PG_RETURN_VOID();
+}
+</programlisting>
+</para>
+</listitem>
+</varlistentry>
 </variablelist>
 
 <para>
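
The calling convention documented in this hunk (a support function fills in the comparator field, and the caller then compares two Datums through that field) can be tried outside a server build with a rough mock. Everything below is an assumption made for illustration: `Datum` and `SortSupportData` are simplified stand-ins for the real definitions in `postgres.h` and `src/include/utils/sortsupport.h`, and `mock_fastcmp`, `mock_sortsupport` and `mock_sort` are hypothetical names, not part of PostgreSQL.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's types (assumption, not the real API). */
typedef uintptr_t Datum;
typedef struct SortSupportData *SortSupport;

struct SortSupportData
{
    /* Three-argument comparator, as described in the documentation text. */
    int (*comparator) (Datum x, Datum y, SortSupport ssup);
};

/* Hypothetical comparator: orders Datums by their raw integer value. */
static int
mock_fastcmp(Datum x, Datum y, SortSupport ssup)
{
    (void) ssup;                /* unused in this sketch */
    return (x > y) - (x < y);
}

/* Plays the role of my_sortsupport: at a minimum, fill in the comparator. */
static void
mock_sortsupport(SortSupport ssup)
{
    ssup->comparator = mock_fastcmp;
}

/* Insertion sort that only ever compares through ssup->comparator. */
static void
mock_sort(Datum *values, size_t n, SortSupport ssup)
{
    for (size_t i = 1; i < n; i++)
    {
        Datum key = values[i];
        size_t j = i;

        while (j > 0 && ssup->comparator(values[j - 1], key, ssup) > 0)
        {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = key;
    }
}
```

In a real operator class the comparator would derive a locality-preserving sort value from each compressed key instead of comparing raw Datums, and the sort itself is performed by the server rather than by extension code.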
