How To Get List of Elements From Multiranges
Some time ago, Pg devs added multiranges: a datatype that can be used to store multiple ranges in a single column.
The thing is that there was no simple way to get the list of ranges from within such a multirange. There was no operator, no way to split it.
A month ago Alexander Korotkov committed a patch that added unnest() over multiranges, but it had some problems, and was reverted.
It will eventually make its way into the sources, I assume, but in the meantime a friend of mine asked how to get the list of elements from a multirange. So I looked into it.
$ \dTS *multirange
List of data types
Schema │ Name │ Description
────────────┼─────────────────────────┼───────────────────────────────────────────────────────────────────────
pg_catalog │ anycompatiblemultirange │ pseudo-type representing a multirange over a polymorphic common type
pg_catalog │ anymultirange │ pseudo-type representing a polymorphic base type that is a multirange
pg_catalog │ datemultirange │ multirange of dates
pg_catalog │ int4multirange │ multirange of integers
pg_catalog │ int8multirange │ multirange of bigints
pg_catalog │ nummultirange │ multirange of numerics
pg_catalog │ tsmultirange │ multirange of timestamps without time zone
pg_catalog │ tstzmultirange │ multirange of timestamps with time zone
(8 rows)
The first two are irrelevant. So, realistically, we have to figure out multiranges for:
date
integers
numerics
timestamps with or without timezone
The thing is that none of these can contain characters like [, ], (, or ), which are used to denote range bounds. So, while it's ugly, and will be obsolete the second we get unnest(), we can split the text form of a multirange using regexps. Each range in it consists of:
1. [ or ( character
2. Any number of characters, with the exception of ] and )
3. ] or ) character
That's simple to describe, but the regexp for it will be, well, unpleasant. Since character classes in regexps are written within brackets, we will have to do some (a lot of) escaping. So the parts that I listed above become:
1. [\[(]
2. [^\])]+
3. [\])]
$ with mr as (
select '[0,100]'::int4range::int4multirange - '[30,40)'::int4range::int4multirange as v
)
select x
from mr, regexp_matches(mr.v::text, '[\[(][^\])]+[\])]', 'g') x;
x
──────────────
{"[0,30)"}
{"[40,101)"}
(2 rows)
Nice-ish. Now I just need to extract the first element from each array that regexp_matches returned, and cast it back to int4range:
$ with mr as (
select '[0,100]'::int4range::int4multirange - '[30,40)'::int4range::int4multirange as v
)
select x[1]::int4range
from mr, regexp_matches(mr.v::text, '[\[(][^\])]+[\])]', 'g') x;
x
──────────
[0,30)
[40,101)
(2 rows)
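The same regexp should work for the other multirange types too, since only the element values change and not the bracket characters. A minimal sketch for dates, assuming a hand-written datemultirange literal that is not part of the original example:

```sql
-- Same regexp as above, applied to a datemultirange and cast back to daterange.
-- The literal below is made up for illustration.
with mr as (
    select '{[2021-01-01,2021-02-01), [2021-03-01,2021-04-01)}'::datemultirange as v
)
select x[1]::daterange
from mr, regexp_matches(mr.v::text, '[\[(][^\])]+[\])]', 'g') x;
```

This should return one row per range stored in the multirange.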
Well, this isn't really beautiful. We could wrap it in a function, but it would have to return texts, and not ranges, and I simply don't know if it's possible to do it in a sensible way.
And this is just temporary, until unnest() is re-submitted. So, I guess it's OK for now. We can at least hide the ugly regexp in a function.
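A minimal sketch of such a wrapper, assuming a made-up function name (multirange_to_text_ranges) and returning text, so that the caller has to cast each row back to the right range type:

```sql
-- Hypothetical helper: the name multirange_to_text_ranges is an assumption
-- made for this sketch, not something that ships with PostgreSQL.
-- It returns every range of the multirange as text.
create function multirange_to_text_ranges(anymultirange)
returns setof text
language sql
stable
as $$
    select x[1]
    from regexp_matches($1::text, '[\[(][^\])]+[\])]', 'g') as x;
$$;

-- Usage: cast each returned text back to the matching range type.
select r::int4range
from multirange_to_text_ranges(
    '[0,100]'::int4range::int4multirange - '[30,40)'::int4range::int4multirange
) as r;
```

The function is marked stable rather than immutable because the text form of timestamp ranges depends on session settings such as the time zone.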
2021-07-18 at 20:34
The unnest() function was OK. The trouble was caused by the cast from multirange to the array of range, which went in the same commit.
I’ve re-pushed unnest() part to pg 14. Hopefully everything will be OK.
https://www.postgresql.org/message-id/E1m5BGf-0001HR-Of%40gemulon.postgresql.org
2. depesz says:
2021-07-18 at 21:33
@Alexander:
Thanks for the update, great news.