Re: jsonpath
От | Nikita Glukhov |
---|---|
Тема | Re: jsonpath |
Дата | |
Msg-id | [email protected] обсуждение исходный текст |
Ответ на | Re: jsonpath ("Jonathan S. Katz" <[email protected]>) |
Ответы |
Re: jsonpath
|
Список | pgsql-hackers |
On 17.03.2019 21:29, Jonathan S. Katz wrote:
On 3/17/19 1:14 PM, Alexander Korotkov wrote:On Sun, Mar 17, 2019 at 8:06 PM Jonathan S. Katz <[email protected]> wrote:On 3/17/19 1:02 PM, Alexander Korotkov wrote:Thank you for the explanation. Is it jsonb_ops or jsonb_path_ops?I just used "USING gin(col)" so jsonb_ops.I see. So, jsonb_ops extracts from this query only existence of .length key. And I can bet it exists in all (or almost all) the documents. Thus, optimizer thinks index might be useful, while it's useless. There is not much can be done while we don't have statistics for jsonb (and access to it from GIN extract_query). So, for now we can just refuse from extracting only keys from jsonpath in jsonb_ops. But I think it would be better to just document this issue. In future we should improve that with statistics.That seems to make sense, especially given how I've typically stored JSON documents in PostgreSQL. It sounds like this particular problem would be solved appropriately with JSONB statistics.
GIN jsonb_ops extracts from query data @? '$.length ? (@ < 150)' the same GIN entries as from queries data @? '$.length' data ? 'length' If you don't want to extract entries from unsupported expressions, you can try to use another jsonpath operator @@. Queries will also look like a bit simpler: data @@ '$.length < 150' data @@ '$.content like_regex "^Start"' data @@ '$.content like_regex "risk" flag "i"' All this queries emit no GIN entries. But note that data @@ '$ ? (@.content == "foo").length < 150' emits the same entries as data @@ '$.content == "foo"' We already have a POC implementation of jsonb statistics that was written 2 years ago. I rebased it onto the current master yesterday. If it is interesting, you can find it on my GitHub [1]. But note, that there is no support for jsonpath operators yet, only boolean EXISTS ?, ?|, ?&, and CONTAINS @> operators are supported. Also there is no docs, and it works slowly (a more effective storage method for statistics of individual JSON paths is needed). Also there is ability to calculate derived statistics of expressions like js -> 'x' -> 0 -> 'y' js #> '{x,0,y}' using jsonb statistics for columns "js". So the selectivity of expressions js -> 'x' -> 0 -> 'y' = '123' js #> '{x,0,y}' >= '123' also can be estimated (but these expressions can't be used by index on "js"). This topic deserves a separate discussion. I hope, we will start the corresponding thread for PG13 after we find a more effective way of jsonb statistics storing. [1] https://github.com/glukhovn/postgres/tree/jsonb_stats
-- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
В списке pgsql-hackers по дате отправления: