PostgreSQL Source Code git master
brin_bloom.c
Go to the documentation of this file.
1/*
2 * brin_bloom.c
3 * Implementation of Bloom opclass for BRIN
4 *
5 * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
6 * Portions Copyright (c) 1994, Regents of the University of California
7 *
8 *
9 * A BRIN opclass summarizing page range into a bloom filter.
10 *
11 * Bloom filters allow efficient testing whether a given page range contains
12 * a particular value. Therefore, if we summarize each page range into a small
13 * bloom filter, we can easily (and cheaply) test whether it contains values
14 * we get later.
15 *
16 * The index only supports equality operators, similarly to hash indexes.
17 * Bloom indexes are however much smaller, and support only bitmap scans.
18 *
19 * Note: Don't confuse this with bloom indexes, implemented in a contrib
20 * module. That extension implements an entirely new AM, building a bloom
21 * filter on multiple columns in a single row. This opclass works with an
22 * existing AM (BRIN) and builds bloom filter on a column.
23 *
24 *
25 * values vs. hashes
26 * -----------------
27 *
28 * The original column values are not used directly, but are first hashed
29 * using the regular type-specific hash function, producing a uint32 hash.
30 * And this hash value is then added to the summary - i.e. it's hashed
31 * again and added to the bloom filter.
32 *
33 * This allows the code to treat all data types (byval/byref/...) the same
34 * way, with only minimal space requirements, because we're working with
35 * hashes and not the original values. Everything is uint32.
36 *
37 * Of course, this assumes the built-in hash function is reasonably good,
38 * without too many collisions etc. But that does seem to be the case, at
39 * least based on past experience. After all, the same hash functions are
40 * used for hash indexes, hash partitioning and so on.
41 *
42 *
43 * hashing scheme
44 * --------------
45 *
46 * Bloom filters require a number of independent hash functions. There are
47 * different schemes how to construct them - for example we might use
48 * hash_uint32_extended with random seeds, but that seems fairly expensive.
49 * We use a scheme requiring only two functions described in this paper:
50 *
51 * Less Hashing, Same Performance:Building a Better Bloom Filter
52 * Adam Kirsch, Michael Mitzenmacher, Harvard School of Engineering and
53 * Applied Sciences, Cambridge, Massachusetts [DOI 10.1002/rsa.20208]
54 *
55 * The two hash functions h1 and h2 are calculated using hard-coded seeds,
56 * and then combined using (h1 + i * h2) to generate the hash functions.
57 *
58 *
59 * sizing the bloom filter
60 * -----------------------
61 *
62 * Size of a bloom filter depends on the number of distinct values we will
63 * store in it, and the desired false positive rate. The higher the number
64 * of distinct values and/or the lower the false positive rate, the larger
65 * the bloom filter. On the other hand, we want to keep the index as small
66 * as possible - that's one of the basic advantages of BRIN indexes.
67 *
68 * Although the number of distinct elements (in a page range) depends on
69 * the data, we can consider it fixed. This simplifies the trade-off to
70 * just false positive rate vs. size.
71 *
72 * At the page range level, false positive rate is a probability the bloom
73 * filter matches a random value. For the whole index (with sufficiently
74 * many page ranges) it represents the fraction of the index ranges (and
75 * thus fraction of the table to be scanned) matching the random value.
76 *
77 * Furthermore, the size of the bloom filter is subject to implementation
78 * limits - it has to fit onto a single index page (8kB by default). As
79 * the bitmap is inherently random (when "full" about half the bits is set
80 * to 1, randomly), compression can't help very much.
81 *
82 * To reduce the size of a filter (to fit to a page), we have to either
83 * accept higher false positive rate (undesirable), or reduce the number
84 * of distinct items to be stored in the filter. We can't alter the input
85 * data, of course, but we may make the BRIN page ranges smaller - instead
86 * of the default 128 pages (1MB) we may build index with 16-page ranges,
87 * or something like that. This should reduce the number of distinct values
88 * in the page range, making the filter smaller (with fixed false positive
89 * rate). Even for random data sets this should help, as the number of rows
90 * per heap page is limited (to ~290 with very narrow tables, likely ~20
91 * in practice).
92 *
93 * Of course, good sizing decisions depend on having the necessary data,
94 * i.e. number of distinct values in a page range (of a given size) and
95 * table size (to estimate cost change due to change in false positive
96 * rate due to having larger index vs. scanning larger indexes). We may
97 * not have that data - for example when building an index on empty table
98 * it's not really possible. And for some data we only have estimates for
99 * the whole table and we can only estimate per-range values (ndistinct).
100 *
101 * Another challenge is that while the bloom filter is per-column, it's
102 * the whole index tuple that has to fit into a page. And for multi-column
103 * indexes that may include pieces we have no control over (not necessarily
104 * bloom filters, the other columns may use other BRIN opclasses). So it's
105 * not entirely clear how to distribute the space between those columns.
106 *
107 * The current logic, implemented in brin_bloom_get_ndistinct, attempts to
108 * make some basic sizing decisions, based on the size of BRIN ranges, and
109 * the maximum number of rows per range.
110 *
111 *
112 * IDENTIFICATION
113 * src/backend/access/brin/brin_bloom.c
114 */
115#include "postgres.h"
116
117#include <math.h>
118
119#include "access/brin.h"
120#include "access/brin_internal.h"
121#include "access/brin_page.h"
122#include "access/brin_tuple.h"
123#include "access/genam.h"
124#include "access/htup_details.h"
125#include "access/reloptions.h"
126#include "catalog/pg_am.h"
127#include "catalog/pg_type.h"
128#include "common/hashfn.h"
129#include "port/pg_bitutils.h"
130#include "utils/fmgrprotos.h"
131#include "utils/rel.h"
132
133#define BloomEqualStrategyNumber 1
134
135/*
136 * Additional SQL level support functions. We only have one, which is
137 * used to calculate hash of the input value.
138 *
139 * Procedure numbers must not use values reserved for BRIN itself; see
140 * brin_internal.h.
141 */
142#define BLOOM_MAX_PROCNUMS 1 /* maximum support procs we need */
143#define PROCNUM_HASH 11 /* required */
144
145/*
146 * Subtract this from procnum to obtain index in BloomOpaque arrays
147 * (Must be equal to minimum of private procnums).
148 */
149#define PROCNUM_BASE 11
150
151/*
152 * Storage type for BRIN's reloptions.
153 */
154typedef struct BloomOptions
155{
156 int32 vl_len_; /* varlena header (do not touch directly!) */
157 double nDistinctPerRange; /* number of distinct values per range */
158 double falsePositiveRate; /* false positive for bloom filter */
160
161/*
162 * The current min value (16) is somewhat arbitrary, but it's based
163 * on the fact that the filter header is ~20B alone, which is about
164 * the same as the filter bitmap for 16 distinct items with 1% false
165 * positive rate. So by allowing lower values we'd not gain much. In
166 * any case, the min should not be larger than MaxHeapTuplesPerPage
167 * (~290), which is the theoretical maximum for single-page ranges.
168 */
169#define BLOOM_MIN_NDISTINCT_PER_RANGE 16
170
171/*
172 * Used to determine number of distinct items, based on the number of rows
173 * in a page range. The 10% is somewhat similar to what estimate_num_groups
174 * does, so we use the same factor here.
175 */
176#define BLOOM_DEFAULT_NDISTINCT_PER_RANGE -0.1 /* 10% of values */
177
178/*
179 * Allowed range and default value for the false positive range. The exact
180 * values are somewhat arbitrary, but were chosen considering the various
181 * parameters (size of filter vs. page size, etc.).
182 *
183 * The lower the false-positive rate, the more accurate the filter is, but
184 * it also gets larger - at some point this eliminates the main advantage
185 * of BRIN indexes, which is the tiny size. At 0.01% the index is about
186 * 10% of the table (assuming 290 distinct values per 8kB page).
187 *
188 * On the other hand, as the false-positive rate increases, larger part of
189 * the table has to be scanned due to mismatches - at 25% we're probably
190 * close to sequential scan being cheaper.
191 */
192#define BLOOM_MIN_FALSE_POSITIVE_RATE 0.0001 /* 0.01% fp rate */
193#define BLOOM_MAX_FALSE_POSITIVE_RATE 0.25 /* 25% fp rate */
194#define BLOOM_DEFAULT_FALSE_POSITIVE_RATE 0.01 /* 1% fp rate */
195
196#define BloomGetNDistinctPerRange(opts) \
197 ((opts) && (((BloomOptions *) (opts))->nDistinctPerRange != 0) ? \
198 (((BloomOptions *) (opts))->nDistinctPerRange) : \
199 BLOOM_DEFAULT_NDISTINCT_PER_RANGE)
200
201#define BloomGetFalsePositiveRate(opts) \
202 ((opts) && (((BloomOptions *) (opts))->falsePositiveRate != 0.0) ? \
203 (((BloomOptions *) (opts))->falsePositiveRate) : \
204 BLOOM_DEFAULT_FALSE_POSITIVE_RATE)
205
206/*
207 * And estimate of the largest bloom we can fit onto a page. This is not
208 * a perfect guarantee, for a couple of reasons. For example, the row may
209 * be larger because the index has multiple columns.
210 */
211#define BloomMaxFilterSize \
212 MAXALIGN_DOWN(BLCKSZ - \
213 (MAXALIGN(SizeOfPageHeaderData + \
214 sizeof(ItemIdData)) + \
215 MAXALIGN(sizeof(BrinSpecialSpace)) + \
216 SizeOfBrinTuple))
217
218/*
219 * Seeds used to calculate two hash functions h1 and h2, which are then used
220 * to generate k hashes using the (h1 + i * h2) scheme.
221 */
222#define BLOOM_SEED_1 0x71d924af
223#define BLOOM_SEED_2 0xba48b314
224
225/*
226 * Bloom Filter
227 *
228 * Represents a bloom filter, built on hashes of the indexed values. That is,
229 * we compute a uint32 hash of the value, and then store this hash into the
230 * bloom filter (and compute additional hashes on it).
231 *
232 * XXX We could implement "sparse" bloom filters, keeping only the bytes that
233 * are not entirely 0. But while indexes don't support TOAST, the varlena can
234 * still be compressed. So this seems unnecessary, because the compression
235 * should do the same job.
236 *
237 * XXX We can also watch the number of bits set in the bloom filter, and then
238 * stop using it (and not store the bitmap, to save space) when the false
239 * positive rate gets too high. But even if the false positive rate exceeds the
240 * desired value, it still can eliminate some page ranges.
241 */
242typedef struct BloomFilter
243{
244 /* varlena header (do not touch directly!) */
246
247 /* space for various flags (unused for now) */
249
250 /* fields for the HASHED phase */
251 uint8 nhashes; /* number of hash functions */
252 uint32 nbits; /* number of bits in the bitmap (size) */
253 uint32 nbits_set; /* number of bits set to 1 */
254
255 /* data of the bloom filter */
258
259/*
260 * bloom_filter_size
261 * Calculate Bloom filter parameters (nbits, nbytes, nhashes).
262 *
263 * Given expected number of distinct values and desired false positive rate,
264 * calculates the optimal parameters of the Bloom filter.
265 *
266 * The resulting parameters are returned through nbytesp (number of bytes),
267 * nbitsp (number of bits) and nhashesp (number of hash functions). If a
268 * pointer is NULL, the parameter is not returned.
269 */
270static void
271bloom_filter_size(int ndistinct, double false_positive_rate,
272 int *nbytesp, int *nbitsp, int *nhashesp)
273{
274 double k;
275 int nbits,
276 nbytes;
277
278 /* sizing bloom filter: -(n * ln(p)) / (ln(2))^2 */
279 nbits = ceil(-(ndistinct * log(false_positive_rate)) / pow(log(2.0), 2));
280
281 /* round m to whole bytes */
282 nbytes = ((nbits + 7) / 8);
283 nbits = nbytes * 8;
284
285 /*
286 * round(log(2.0) * m / ndistinct), but assume round() may not be
287 * available on Windows
288 */
289 k = log(2.0) * nbits / ndistinct;
290 k = (k - floor(k) >= 0.5) ? ceil(k) : floor(k);
291
292 if (nbytesp)
293 *nbytesp = nbytes;
294
295 if (nbitsp)
296 *nbitsp = nbits;
297
298 if (nhashesp)
299 *nhashesp = (int) k;
300}
301
302/*
303 * bloom_init
304 * Initialize the Bloom Filter, allocate all the memory.
305 *
306 * The filter is initialized with optimal size for ndistinct expected values
307 * and the requested false positive rate. The filter is stored as varlena.
308 */
309static BloomFilter *
310bloom_init(int ndistinct, double false_positive_rate)
311{
312 Size len;
313 BloomFilter *filter;
314
315 int nbits; /* size of filter / number of bits */
316 int nbytes; /* size of filter / number of bytes */
317 int nhashes; /* number of hash functions */
318
319 Assert(ndistinct > 0);
320 Assert(false_positive_rate > 0 && false_positive_rate < 1);
321
322 /* calculate bloom filter size / parameters */
323 bloom_filter_size(ndistinct, false_positive_rate,
324 &nbytes, &nbits, &nhashes);
325
326 /*
327 * Reject filters that are obviously too large to store on a page.
328 *
329 * Initially the bloom filter is just zeroes and so very compressible, but
330 * as we add values it gets more and more random, and so less and less
331 * compressible. So initially everything fits on the page, but we might
332 * get surprising failures later - we want to prevent that, so we reject
333 * bloom filter that are obviously too large.
334 *
335 * XXX It's not uncommon to oversize the bloom filter a bit, to defend
336 * against unexpected data anomalies (parts of table with more distinct
337 * values per range etc.). But we still need to make sure even the
338 * oversized filter fits on page, if such need arises.
339 *
340 * XXX This check is not perfect, because the index may have multiple
341 * filters that are small individually, but too large when combined.
342 */
343 if (nbytes > BloomMaxFilterSize)
344 elog(ERROR, "the bloom filter is too large (%d > %zu)", nbytes,
346
347 /*
348 * We allocate the whole filter. Most of it is going to be 0 bits, so the
349 * varlena is easy to compress.
350 */
351 len = offsetof(BloomFilter, data) + nbytes;
352
353 filter = (BloomFilter *) palloc0(len);
354
355 filter->flags = 0;
356 filter->nhashes = nhashes;
357 filter->nbits = nbits;
358
359 SET_VARSIZE(filter, len);
360
361 return filter;
362}
363
364
365/*
366 * bloom_add_value
367 * Add value to the bloom filter.
368 */
369static BloomFilter *
370bloom_add_value(BloomFilter *filter, uint32 value, bool *updated)
371{
372 int i;
373 uint64 h1,
374 h2;
375
376 /* compute the hashes, used for the bloom filter */
379
380 /* compute the requested number of hashes */
381 for (i = 0; i < filter->nhashes; i++)
382 {
383 /* h1 + h2 + f(i) */
384 uint32 h = (h1 + i * h2) % filter->nbits;
385 uint32 byte = (h / 8);
386 uint32 bit = (h % 8);
387
388 /* if the bit is not set, set it and remember we did that */
389 if (!(filter->data[byte] & (0x01 << bit)))
390 {
391 filter->data[byte] |= (0x01 << bit);
392 filter->nbits_set++;
393 if (updated)
394 *updated = true;
395 }
396 }
397
398 return filter;
399}
400
401
402/*
403 * bloom_contains_value
404 * Check if the bloom filter contains a particular value.
405 */
406static bool
408{
409 int i;
410 uint64 h1,
411 h2;
412
413 /* calculate the two hashes */
416
417 /* compute the requested number of hashes */
418 for (i = 0; i < filter->nhashes; i++)
419 {
420 /* h1 + h2 + f(i) */
421 uint32 h = (h1 + i * h2) % filter->nbits;
422 uint32 byte = (h / 8);
423 uint32 bit = (h % 8);
424
425 /* if the bit is not set, the value is not there */
426 if (!(filter->data[byte] & (0x01 << bit)))
427 return false;
428 }
429
430 /* all hashes found in bloom filter */
431 return true;
432}
433
434typedef struct BloomOpaque
435{
436 /*
437 * XXX At this point we only need a single proc (to compute the hash), but
438 * let's keep the array just like inclusion and minmax opclasses, for
439 * consistency. We may need additional procs in the future.
440 */
443
444static FmgrInfo *bloom_get_procinfo(BrinDesc *bdesc, uint16 attno,
445 uint16 procnum);
446
447
448Datum
450{
451 BrinOpcInfo *result;
452
453 /*
454 * opaque->strategy_procinfos is initialized lazily; here it is set to
455 * all-uninitialized by palloc0 which sets fn_oid to InvalidOid.
456 *
457 * bloom indexes only store the filter as a single BYTEA column
458 */
459
460 result = palloc0(MAXALIGN(SizeofBrinOpcInfo(1)) +
461 sizeof(BloomOpaque));
462 result->oi_nstored = 1;
463 result->oi_regular_nulls = true;
464 result->oi_opaque = (BloomOpaque *)
465 MAXALIGN((char *) result + SizeofBrinOpcInfo(1));
466 result->oi_typcache[0] = lookup_type_cache(PG_BRIN_BLOOM_SUMMARYOID, 0);
467
468 PG_RETURN_POINTER(result);
469}
470
471/*
472 * brin_bloom_get_ndistinct
473 * Determine the ndistinct value used to size bloom filter.
474 *
475 * Adjust the ndistinct value based on the pagesPerRange value. First,
476 * if it's negative, it's assumed to be relative to maximum number of
477 * tuples in the range (assuming each page gets MaxHeapTuplesPerPage
478 * tuples, which is likely a significant over-estimate). We also clamp
479 * the value, not to over-size the bloom filter unnecessarily.
480 *
481 * XXX We can only do this when the pagesPerRange value was supplied.
482 * If it wasn't, it has to be a read-only access to the index, in which
483 * case we don't really care. But perhaps we should fall-back to the
484 * default pagesPerRange value?
485 *
486 * XXX We might also fetch info about ndistinct estimate for the column,
487 * and compute the expected number of distinct values in a range. But
488 * that may be tricky due to data being sorted in various ways, so it
489 * seems better to rely on the upper estimate.
490 *
491 * XXX We might also calculate a better estimate of rows per BRIN range,
492 * instead of using MaxHeapTuplesPerPage (which probably produces values
493 * much higher than reality).
494 */
495static int
497{
498 double ndistinct;
499 double maxtuples;
500 BlockNumber pagesPerRange;
501
502 pagesPerRange = BrinGetPagesPerRange(bdesc->bd_index);
503 ndistinct = BloomGetNDistinctPerRange(opts);
504
505 Assert(BlockNumberIsValid(pagesPerRange));
506
507 maxtuples = MaxHeapTuplesPerPage * pagesPerRange;
508
509 /*
510 * Similarly to n_distinct, negative values are relative - in this case to
511 * maximum number of tuples in the page range (maxtuples).
512 */
513 if (ndistinct < 0)
514 ndistinct = (-ndistinct) * maxtuples;
515
516 /*
517 * Positive values are to be used directly, but we still apply a couple of
518 * safeties to avoid using unreasonably small bloom filters.
519 */
520 ndistinct = Max(ndistinct, BLOOM_MIN_NDISTINCT_PER_RANGE);
521
522 /*
523 * And don't use more than the maximum possible number of tuples, in the
524 * range, which would be entirely wasteful.
525 */
526 ndistinct = Min(ndistinct, maxtuples);
527
528 return (int) ndistinct;
529}
530
531/*
532 * Examine the given index tuple (which contains partial status of a certain
533 * page range) by comparing it to the given value that comes from another heap
534 * tuple. If the new value is outside the bloom filter specified by the
535 * existing tuple values, update the index tuple and return true. Otherwise,
536 * return false and do not modify in this case.
537 */
538Datum
540{
541 BrinDesc *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
542 BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
546 Oid colloid = PG_GET_COLLATION();
547 FmgrInfo *hashFn;
548 uint32 hashValue;
549 bool updated = false;
550 AttrNumber attno;
551 BloomFilter *filter;
552
553 Assert(!isnull);
554
555 attno = column->bv_attno;
556
557 /*
558 * If this is the first non-null value, we need to initialize the bloom
559 * filter. Otherwise just extract the existing bloom filter from
560 * BrinValues.
561 */
562 if (column->bv_allnulls)
563 {
566 column->bv_values[0] = PointerGetDatum(filter);
567 column->bv_allnulls = false;
568 updated = true;
569 }
570 else
571 filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
572
573 /*
574 * Compute the hash of the new value, using the supplied hash function,
575 * and then add the hash value to the bloom filter.
576 */
577 hashFn = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
578
579 hashValue = DatumGetUInt32(FunctionCall1Coll(hashFn, colloid, newval));
580
581 filter = bloom_add_value(filter, hashValue, &updated);
582
583 column->bv_values[0] = PointerGetDatum(filter);
584
585 PG_RETURN_BOOL(updated);
586}
587
588/*
589 * Given an index tuple corresponding to a certain page range and a scan key,
590 * return whether the scan key is consistent with the index tuple's bloom
591 * filter. Return true if so, false otherwise.
592 */
593Datum
595{
596 BrinDesc *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
597 BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
598 ScanKey *keys = (ScanKey *) PG_GETARG_POINTER(2);
599 int nkeys = PG_GETARG_INT32(3);
600 Oid colloid = PG_GET_COLLATION();
601 AttrNumber attno;
602 Datum value;
603 bool matches;
604 FmgrInfo *finfo;
605 uint32 hashValue;
606 BloomFilter *filter;
607 int keyno;
608
609 filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
610
611 Assert(filter);
612
613 /*
614 * Assume all scan keys match. We'll be searching for a scan key
615 * eliminating the page range (we can stop on the first such key).
616 */
617 matches = true;
618
619 for (keyno = 0; keyno < nkeys; keyno++)
620 {
621 ScanKey key = keys[keyno];
622
623 /* NULL keys are handled and filtered-out in bringetbitmap */
624 Assert(!(key->sk_flags & SK_ISNULL));
625
626 attno = key->sk_attno;
627 value = key->sk_argument;
628
629 switch (key->sk_strategy)
630 {
632
633 /*
634 * We want to return the current page range if the bloom
635 * filter seems to contain the value.
636 */
637 finfo = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
638
639 hashValue = DatumGetUInt32(FunctionCall1Coll(finfo, colloid, value));
640 matches &= bloom_contains_value(filter, hashValue);
641
642 break;
643 default:
644 /* shouldn't happen */
645 elog(ERROR, "invalid strategy number %d", key->sk_strategy);
646 matches = false;
647 break;
648 }
649
650 if (!matches)
651 break;
652 }
653
654 PG_RETURN_BOOL(matches);
655}
656
657/*
658 * Given two BrinValues, update the first of them as a union of the summary
659 * values contained in both. The second one is untouched.
660 *
661 * XXX We assume the bloom filters have the same parameters for now. In the
662 * future we should have 'can union' function, to decide if we can combine
663 * two particular bloom filters.
664 */
665Datum
667{
668 int i;
669 int nbytes;
672 BloomFilter *filter_a;
673 BloomFilter *filter_b;
674
675 Assert(col_a->bv_attno == col_b->bv_attno);
676 Assert(!col_a->bv_allnulls && !col_b->bv_allnulls);
677
678 filter_a = (BloomFilter *) PG_DETOAST_DATUM(col_a->bv_values[0]);
679 filter_b = (BloomFilter *) PG_DETOAST_DATUM(col_b->bv_values[0]);
680
681 /* make sure the filters use the same parameters */
682 Assert(filter_a && filter_b);
683 Assert(filter_a->nbits == filter_b->nbits);
684 Assert(filter_a->nhashes == filter_b->nhashes);
685 Assert((filter_a->nbits > 0) && (filter_a->nbits % 8 == 0));
686
687 nbytes = (filter_a->nbits) / 8;
688
689 /* simply OR the bitmaps */
690 for (i = 0; i < nbytes; i++)
691 filter_a->data[i] |= filter_b->data[i];
692
693 /* update the number of bits set in the filter */
694 filter_a->nbits_set = pg_popcount((const char *) filter_a->data, nbytes);
695
696 /* if we decompressed filter_a, update the summary */
697 if (PointerGetDatum(filter_a) != col_a->bv_values[0])
698 {
699 pfree(DatumGetPointer(col_a->bv_values[0]));
700 col_a->bv_values[0] = PointerGetDatum(filter_a);
701 }
702
703 /* also free filter_b, if it was decompressed */
704 if (PointerGetDatum(filter_b) != col_b->bv_values[0])
705 pfree(filter_b);
706
708}
709
710/*
711 * Cache and return inclusion opclass support procedure
712 *
713 * Return the procedure corresponding to the given function support number
714 * or null if it does not exist.
715 */
716static FmgrInfo *
718{
719 BloomOpaque *opaque;
720 uint16 basenum = procnum - PROCNUM_BASE;
721
722 /*
723 * We cache these in the opaque struct, to avoid repetitive syscache
724 * lookups.
725 */
726 opaque = (BloomOpaque *) bdesc->bd_info[attno - 1]->oi_opaque;
727
728 if (opaque->extra_procinfos[basenum].fn_oid == InvalidOid)
729 {
731 procnum)))
732 fmgr_info_copy(&opaque->extra_procinfos[basenum],
733 index_getprocinfo(bdesc->bd_index, attno, procnum),
734 bdesc->bd_context);
735 else
737 errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
738 errmsg_internal("invalid opclass definition"),
739 errdetail_internal("The operator class is missing support function %d for column %d.",
740 procnum, attno));
741 }
742
743 return &opaque->extra_procinfos[basenum];
744}
745
746Datum
748{
750
751 init_local_reloptions(relopts, sizeof(BloomOptions));
752
753 add_local_real_reloption(relopts, "n_distinct_per_range",
754 "number of distinct items expected in a BRIN page range",
756 -1.0, INT_MAX, offsetof(BloomOptions, nDistinctPerRange));
757
758 add_local_real_reloption(relopts, "false_positive_rate",
759 "desired false-positive rate for the bloom filters",
763 offsetof(BloomOptions, falsePositiveRate));
764
766}
767
768/*
769 * brin_bloom_summary_in
770 * - input routine for type brin_bloom_summary.
771 *
772 * brin_bloom_summary is only used internally to represent summaries
773 * in BRIN bloom indexes, so it has no operations of its own, and we
774 * disallow input too.
775 */
776Datum
778{
779 /*
780 * brin_bloom_summary stores the data in binary form and parsing text
781 * input is not needed, so disallow this.
782 */
784 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
785 errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
786
787 PG_RETURN_VOID(); /* keep compiler quiet */
788}
789
790
791/*
792 * brin_bloom_summary_out
793 * - output routine for type brin_bloom_summary.
794 *
795 * BRIN bloom summaries are serialized into a bytea value, but we want
796 * to output something nicer humans can understand.
797 */
798Datum
800{
801 BloomFilter *filter;
803
804 /* detoast the data to get value with a full 4B header */
806
809
810 appendStringInfo(&str, "mode: hashed nhashes: %u nbits: %u nbits_set: %u",
811 filter->nhashes, filter->nbits, filter->nbits_set);
812
814
816}
817
818/*
819 * brin_bloom_summary_recv
820 * - binary input routine for type brin_bloom_summary.
821 */
822Datum
824{
826 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
827 errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
828
829 PG_RETURN_VOID(); /* keep compiler quiet */
830}
831
832/*
833 * brin_bloom_summary_send
834 * - binary output routine for type brin_bloom_summary.
835 *
836 * BRIN bloom summaries are serialized in a bytea value (although the
837 * type is named differently), so let's just send that.
838 */
839Datum
841{
842 return byteasend(fcinfo);
843}
int16 AttrNumber
Definition: attnum.h:21
uint32 BlockNumber
Definition: block.h:31
static bool BlockNumberIsValid(BlockNumber blockNumber)
Definition: block.h:71
#define BrinGetPagesPerRange(relation)
Definition: brin.h:41
static int brin_bloom_get_ndistinct(BrinDesc *bdesc, BloomOptions *opts)
Definition: brin_bloom.c:496
#define PROCNUM_HASH
Definition: brin_bloom.c:143
#define BLOOM_SEED_1
Definition: brin_bloom.c:222
#define BLOOM_DEFAULT_NDISTINCT_PER_RANGE
Definition: brin_bloom.c:176
Datum brin_bloom_consistent(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:594
static void bloom_filter_size(int ndistinct, double false_positive_rate, int *nbytesp, int *nbitsp, int *nhashesp)
Definition: brin_bloom.c:271
static BloomFilter * bloom_init(int ndistinct, double false_positive_rate)
Definition: brin_bloom.c:310
#define BloomGetNDistinctPerRange(opts)
Definition: brin_bloom.c:196
struct BloomFilter BloomFilter
Datum brin_bloom_options(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:747
struct BloomOptions BloomOptions
#define BLOOM_MAX_FALSE_POSITIVE_RATE
Definition: brin_bloom.c:193
#define BLOOM_DEFAULT_FALSE_POSITIVE_RATE
Definition: brin_bloom.c:194
static bool bloom_contains_value(BloomFilter *filter, uint32 value)
Definition: brin_bloom.c:407
#define BLOOM_MAX_PROCNUMS
Definition: brin_bloom.c:142
#define BloomGetFalsePositiveRate(opts)
Definition: brin_bloom.c:201
#define BloomMaxFilterSize
Definition: brin_bloom.c:211
#define BLOOM_MIN_NDISTINCT_PER_RANGE
Definition: brin_bloom.c:169
#define BloomEqualStrategyNumber
Definition: brin_bloom.c:133
struct BloomOpaque BloomOpaque
#define BLOOM_MIN_FALSE_POSITIVE_RATE
Definition: brin_bloom.c:192
#define BLOOM_SEED_2
Definition: brin_bloom.c:223
#define PROCNUM_BASE
Definition: brin_bloom.c:149
Datum brin_bloom_union(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:666
Datum brin_bloom_summary_send(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:840
Datum brin_bloom_summary_out(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:799
static FmgrInfo * bloom_get_procinfo(BrinDesc *bdesc, uint16 attno, uint16 procnum)
Definition: brin_bloom.c:717
Datum brin_bloom_add_value(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:539
Datum brin_bloom_summary_in(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:777
static BloomFilter * bloom_add_value(BloomFilter *filter, uint32 value, bool *updated)
Definition: brin_bloom.c:370
Datum brin_bloom_summary_recv(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:823
Datum brin_bloom_opcinfo(PG_FUNCTION_ARGS)
Definition: brin_bloom.c:449
#define SizeofBrinOpcInfo(ncols)
Definition: brin_internal.h:41
Datum byteasend(PG_FUNCTION_ARGS)
Definition: bytea.c:363
#define RegProcedureIsValid(p)
Definition: c.h:776
#define Min(x, y)
Definition: c.h:1003
#define MAXALIGN(LEN)
Definition: c.h:810
uint8_t uint8
Definition: c.h:536
#define PG_USED_FOR_ASSERTS_ONLY
Definition: c.h:223
#define Max(x, y)
Definition: c.h:997
#define FLEXIBLE_ARRAY_MEMBER
Definition: c.h:470
int32_t int32
Definition: c.h:534
uint64_t uint64
Definition: c.h:539
uint16_t uint16
Definition: c.h:537
uint32_t uint32
Definition: c.h:538
size_t Size
Definition: c.h:610
int errmsg_internal(const char *fmt,...)
Definition: elog.c:1170
int errdetail_internal(const char *fmt,...)
Definition: elog.c:1243
int errcode(int sqlerrcode)
Definition: elog.c:863
int errmsg(const char *fmt,...)
Definition: elog.c:1080
#define ERROR
Definition: elog.h:39
#define elog(elevel,...)
Definition: elog.h:226
#define ereport(elevel,...)
Definition: elog.h:150
Datum FunctionCall1Coll(FmgrInfo *flinfo, Oid collation, Datum arg1)
Definition: fmgr.c:1130
void fmgr_info_copy(FmgrInfo *dstinfo, FmgrInfo *srcinfo, MemoryContext destcxt)
Definition: fmgr.c:581
#define PG_RETURN_VOID()
Definition: fmgr.h:349
#define PG_GETARG_POINTER(n)
Definition: fmgr.h:276
#define PG_RETURN_CSTRING(x)
Definition: fmgr.h:362
#define PG_GETARG_DATUM(n)
Definition: fmgr.h:268
#define PG_GET_OPCLASS_OPTIONS()
Definition: fmgr.h:342
#define PG_DETOAST_DATUM(datum)
Definition: fmgr.h:240
#define PG_GETARG_INT32(n)
Definition: fmgr.h:269
#define PG_GETARG_BOOL(n)
Definition: fmgr.h:274
#define PG_RETURN_POINTER(x)
Definition: fmgr.h:361
#define PG_GET_COLLATION()
Definition: fmgr.h:198
#define PG_FUNCTION_ARGS
Definition: fmgr.h:193
#define PG_RETURN_BOOL(x)
Definition: fmgr.h:359
#define newval
uint64 hash_bytes_uint32_extended(uint32 k, uint64 seed)
Definition: hashfn.c:631
Assert(PointerIsAligned(start, uint64))
const char * str
#define MaxHeapTuplesPerPage
Definition: htup_details.h:624
FmgrInfo * index_getprocinfo(Relation irel, AttrNumber attnum, uint16 procnum)
Definition: indexam.c:917
RegProcedure index_getprocid(Relation irel, AttrNumber attnum, uint16 procnum)
Definition: indexam.c:883
static struct @169 value
int i
Definition: isn.c:77
if(TABLE==NULL||TABLE_index==NULL)
Definition: isn.c:81
void pfree(void *pointer)
Definition: mcxt.c:1594
void * palloc0(Size size)
Definition: mcxt.c:1395
static AmcheckOptions opts
Definition: pg_amcheck.c:112
static uint64 pg_popcount(const char *buf, int bytes)
Definition: pg_bitutils.h:363
const void size_t len
const void * data
static uint32 DatumGetUInt32(Datum X)
Definition: postgres.h:232
static Datum PointerGetDatum(const void *X)
Definition: postgres.h:332
uint64_t Datum
Definition: postgres.h:70
static Pointer DatumGetPointer(Datum X)
Definition: postgres.h:322
#define InvalidOid
Definition: postgres_ext.h:37
unsigned int Oid
Definition: postgres_ext.h:32
void init_local_reloptions(local_relopts *relopts, Size relopt_struct_size)
Definition: reloptions.c:754
void add_local_real_reloption(local_relopts *relopts, const char *name, const char *desc, double default_val, double min_val, double max_val, int offset)
Definition: reloptions.c:992
#define SK_ISNULL
Definition: skey.h:115
void appendStringInfo(StringInfo str, const char *fmt,...)
Definition: stringinfo.c:145
void appendStringInfoChar(StringInfo str, char ch)
Definition: stringinfo.c:242
void initStringInfo(StringInfo str)
Definition: stringinfo.c:97
uint8 nhashes
Definition: brin_bloom.c:251
char data[FLEXIBLE_ARRAY_MEMBER]
Definition: brin_bloom.c:256
uint32 nbits_set
Definition: brin_bloom.c:253
uint32 nbits
Definition: brin_bloom.c:252
uint16 flags
Definition: brin_bloom.c:248
int32 vl_len_
Definition: brin_bloom.c:245
FmgrInfo extra_procinfos[BLOOM_MAX_PROCNUMS]
Definition: brin_bloom.c:441
double falsePositiveRate
Definition: brin_bloom.c:158
int32 vl_len_
Definition: bloom.h:103
double nDistinctPerRange
Definition: brin_bloom.c:157
BrinOpcInfo * bd_info[FLEXIBLE_ARRAY_MEMBER]
Definition: brin_internal.h:62
Relation bd_index
Definition: brin_internal.h:50
MemoryContext bd_context
Definition: brin_internal.h:47
TypeCacheEntry * oi_typcache[FLEXIBLE_ARRAY_MEMBER]
Definition: brin_internal.h:37
uint16 oi_nstored
Definition: brin_internal.h:28
bool oi_regular_nulls
Definition: brin_internal.h:31
void * oi_opaque
Definition: brin_internal.h:34
Datum * bv_values
Definition: brin_tuple.h:34
AttrNumber bv_attno
Definition: brin_tuple.h:31
bool bv_allnulls
Definition: brin_tuple.h:33
Definition: fmgr.h:57
Oid fn_oid
Definition: fmgr.h:59
TypeCacheEntry * lookup_type_cache(Oid type_id, int flags)
Definition: typcache.c:386
static void SET_VARSIZE(void *PTR, Size len)
Definition: varatt.h:432
Datum bit(PG_FUNCTION_ARGS)
Definition: varbit.c:391