|
207 | 207 |
|
208 | 208 | <para>
|
209 | 209 | As shown in <xref linkend="xindex-btree-support-table"/>, btree defines
|
210 |
| - one required and two optional support functions. The three |
| 210 | + one required and three optional support functions. The four |
211 | 211 | user-defined methods are:
|
212 | 212 | </para>
|
213 | 213 | <variablelist>
|
@@ -456,6 +456,100 @@ returns bool
|
456 | 456 | </para>
|
457 | 457 | </listitem>
|
458 | 458 | </varlistentry>
|
| 459 | + <varlistentry> |
| 460 | + <term><function>equalimage</function></term> |
| 461 | + <listitem> |
| 462 | + <para> |
| 463 | + Optionally, a btree operator family may provide |
| 464 | + <function>equalimage</function> (<quote>equality implies image |
| 465 | + equality</quote>) support functions, registered under support |
| 466 | + function number 4. These functions allow the core code to |
| 467 | + determine when it is safe to apply the btree deduplication |
| 468 | + optimization. Currently, <function>equalimage</function> |
| 469 | + functions are only called when building or rebuilding an index. |
| 470 | + </para> |
| 471 | + <para> |
| 472 | + An <function>equalimage</function> function must have the |
| 473 | + signature |
| 474 | +<synopsis> |
| 475 | +equalimage(<replaceable>opcintype</replaceable> <type>oid</type>) returns bool |
| 476 | +</synopsis> |
| 477 | + The return value is static information about an operator class |
| 478 | + and collation. Returning <literal>true</literal> indicates that |
| 479 | + the <function>order</function> function for the operator class is |
| 480 | + guaranteed to only return <literal>0</literal> (<quote>arguments |
| 481 | + are equal</quote>) when its <replaceable>A</replaceable> and |
| 482 | + <replaceable>B</replaceable> arguments are also interchangeable |
| 483 | + without any loss of semantic information. Not registering an |
| 484 | + <function>equalimage</function> function or returning |
| 485 | + <literal>false</literal> indicates that this condition cannot be |
| 486 | + assumed to hold. |
| 487 | + </para> |
| 488 | + <para> |
| 489 | + The <replaceable>opcintype</replaceable> argument is the |
| 490 | + <literal><structname>pg_type</structname>.oid</literal> of the |
| 491 | + data type that the operator class indexes. This is a convenience |
| 492 | + that allows reuse of the same underlying |
| 493 | + <function>equalimage</function> function across operator classes. |
| 494 | + If <replaceable>opcintype</replaceable> is a collatable data |
| 495 | + type, the appropriate collation OID will be passed to the |
| 496 | + <function>equalimage</function> function, using the standard |
| 497 | + <function>PG_GET_COLLATION()</function> mechanism. |
| 498 | + </para> |
| 499 | + <para> |
| 500 | + As far as the operator class is concerned, returning |
| 501 | + <literal>true</literal> indicates that deduplication is safe (or |
| 502 | + safe for the collation whose OID was passed to its |
| 503 | + <function>equalimage</function> function). However, the core |
| 504 | + code will only deem deduplication safe for an index when |
| 505 | + <emphasis>every</emphasis> indexed column uses an operator class |
| 506 | + that registers an <function>equalimage</function> function, and |
| 507 | + each function actually returns <literal>true</literal> when |
| 508 | + called. |
| 509 | + </para> |
| 510 | + <para> |
| 511 | + Image equality is <emphasis>almost</emphasis> the same condition |
| 512 | + as simple bitwise equality. There is one subtle difference: When |
| 513 | + indexing a varlena data type, the on-disk representation of two |
| 514 | + image equal datums may not be bitwise equal due to inconsistent |
| 515 | + application of <acronym>TOAST</acronym> compression on input. |
| 516 | + Formally, when an operator class's |
| 517 | + <function>equalimage</function> function returns |
| 518 | + <literal>true</literal>, it is safe to assume that the |
| 519 | + <literal>datum_image_eq()</literal> C function will always agree |
| 520 | + with the operator class's <function>order</function> function |
| 521 | + (provided that the same collation OID is passed to both the |
| 522 | + <function>equalimage</function> and <function>order</function> |
| 523 | + functions). |
| 524 | + </para> |
| 525 | + <para> |
| 526 | + The core code is fundamentally unable to deduce anything about |
| 527 | + the <quote>equality implies image equality</quote> status of an |
| 528 | + operator class within a multiple-data-type family based on |
| 529 | + details from other operator classes in the same family. Also, it |
| 530 | + is not sensible for an operator family to register a cross-type |
| 531 | + <function>equalimage</function> function, and attempting to do so |
| 532 | + will result in an error. This is because <quote>equality implies |
| 533 | + image equality</quote> status does not just depend on |
| 534 | + sorting/equality semantics, which are more or less defined at the |
| 535 | + operator family level. In general, the semantics that one |
| 536 | + particular data type implements must be considered separately. |
| 537 | + </para> |
| 538 | + <para> |
| 539 | + The convention followed by the operator classes included with the |
| 540 | + core <productname>PostgreSQL</productname> distribution is to |
| 541 | + register a stock, generic <function>equalimage</function> |
| 542 | + function. Most operator classes register |
| 543 | + <function>btequalimage()</function>, which indicates that |
| 544 | + deduplication is safe unconditionally. Operator classes for |
| 545 | + collatable data types such as <type>text</type> register |
| 546 | + <function>btvarstrequalimage()</function>, which indicates that |
| 547 | + deduplication is safe with deterministic collations. Best |
| 548 | + practice for third-party extensions is to register their own |
| 549 | + custom function to retain control. |
| 550 | + </para> |
| 551 | + </listitem> |
| 552 | + </varlistentry> |
459 | 553 | </variablelist>
|
460 | 554 |
|
461 | 555 | </sect1>
|
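To illustrate the contract described in the new `equalimage` entry, here is a minimal, self-contained C sketch. It is not PostgreSQL code: `order_double`, `image_eq_double`, and `violates_equalimage` are hypothetical names introduced only for this example, and the image check is a plain `memcmp` rather than the real `datum_image_eq()`. Assuming IEEE 754 doubles, it shows a pair of values (`0.0` and `-0.0`) that an ordering function reports as equal even though their byte images differ, which is the kind of case where an `equalimage` function would have to return `false`.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for a btree "order" support function on doubles.
 * Returns <0, 0, or >0. NaN handling is deliberately omitted, so this is
 * not a faithful copy of any PostgreSQL comparator.
 */
int order_double(double a, double b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/*
 * Hypothetical stand-in for an image comparison: true only when the two
 * values have byte-for-byte identical representations.
 */
bool image_eq_double(double a, double b)
{
    return memcmp(&a, &b, sizeof(double)) == 0;
}

/*
 * True when the order function says "equal" but the images differ.
 * An operator class for which this can ever happen must not claim that
 * equality implies image equality.
 */
bool violates_equalimage(double a, double b)
{
    return order_double(a, b) == 0 && !image_eq_double(a, b);
}
```

Calling `violates_equalimage(0.0, -0.0)` flags the pair, while identical or plainly unequal values are not flagged. The sketch covers a single fixed-width type only; it says nothing about collations or varlena types, which the patch text treats separately.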
|