revamp row-security tracking

Lists: pgsql-hackers
From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: revamp row-security tracking
Date: 2024-11-21 18:00:37
Message-ID: Zz91RagtQg2s9497@nathan

In light of CVE-2024-10976, which was fixed by commit cd7ab57, I'd like to
propose a bigger change to this area of the code that aims to future-proof
it a bit. Instead of requiring hackers to carefully cart around whether a
query references a table with RLS enabled, I think we should instead
accumulate such information globally and require higher-level routines like
fireRIRrules() and inline_set_returning_function() to inspect it as needed.

The attached patch accomplishes this by establishing a global queue of
row-security "nest levels" that the aforementioned higher-level callers can
use. Essentially, they will first call PushRowSecurityNestLevel(), then
any calls to get_row_security_policies() that encounter a table with
row-security enabled will mark the current nest level (and its parents).
Finally, PopRowSecurityNestLevel() is used to retrieve whether row-security might
apply, and the query can be marked correctly. I've also attempted to
handle resetting this queue after an ERROR or failed SRF inlining, but I'm
not yet positive that I've got all the details right.
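
To make the push/mark/pop flow concrete, here is a rough standalone sketch. It is not code from the attached patch: the names push_level(), mark_row_security(), pop_level() and reset_levels() and the fixed-size array are all made up for illustration, and the real data structure may differ.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NEST_LEVELS 64		/* hypothetical bound, for this sketch only */

/* One flag per active nest level: was row security seen at this level? */
static bool rls_seen[MAX_NEST_LEVELS];
static size_t rls_depth = 0;

/* Enter a new nest level; returns false if the sketch's bound is exceeded. */
static bool
push_level(void)
{
	if (rls_depth >= MAX_NEST_LEVELS)
		return false;
	rls_seen[rls_depth++] = false;
	return true;
}

/* Called wherever a table with row security is found: mark the current
 * level and all of its parents. */
static void
mark_row_security(void)
{
	for (size_t i = 0; i < rls_depth; i++)
		rls_seen[i] = true;
}

/* Leave the current level and report whether row security was seen in it. */
static bool
pop_level(void)
{
	if (rls_depth == 0)
		return false;
	return rls_seen[--rls_depth];
}

/* Cleanup for error paths: forget every active level. */
static void
reset_levels(void)
{
	rls_depth = 0;
}
```

In this sketch a caller would bracket its rewriting work with push_level() and pop_level() and copy the result of pop_level() onto the query it is about to return. reset_levels() stands in for the cleanup needed after an ERROR.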

With this patch applied, we should be able to revert some of the
row-security tracking code, especially the stuff added by commit cd7ab57.

Thoughts?

--
nathan

Attachment Content-Type Size
v1-0001-revamp-row-security-tracking.patch text/plain 16.4 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2024-11-29 10:01:41
Message-ID: CAEZATCUDuzvQ8THEHNRRkg7Nia7OzZBA5PzwCm_c70M82pCnPg@mail.gmail.com
Lists: pgsql-hackers

On Thu, 21 Nov 2024 at 18:00, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> In light of CVE-2024-10976, which was fixed by commit cd7ab57, I'd like to
> propose a bigger change to this area of the code that aims to future-proof
> it a bit. Instead of requiring hackers to carefully cart around whether a
> query references a table with RLS enabled, I think we should instead
> accumulate such information globally and require higher-level routines like
> fireRIRrules() and inline_set_returning_function() to inspect it as needed.
>
> The attached patch accomplishes this by establishing a global queue of
> row-security "nest levels" that the aforementioned higher-level callers can
> use.

I'm not convinced that this is an improvement.

The code in check_sql_fn_retval() is building a Query struct from
scratch, so it seems perfectly natural for it to be responsible for
setting all the required fields, based on the information it has
available. With this patch, check_sql_fn_retval() is returning a
potentially incorrectly marked Query at the end of the querytree list,
which the caller is responsible for fixing up, which doesn't seem
ideal.

I'm also not a fan of using global variables in this way, or the
resulting need to hook into the transaction management system to tidy
up. The end result is that the places where the flag is set are moved
further away from where RLS policies are applied, which IMO makes the
code much harder to follow.

There is exactly one place where RLS policies are applied, and it
seems much more natural for it to have responsibility for setting this
flag. I think that a slightly neater way for it to handle that would
be to modify fireRIRrules(), adding an extra parameter "bool
*hasRowSecurity" that it would set to true if RLS is enabled for the
query it is rewriting. Doing that forces all callers to think about
whether or not that affects some outer query. For example,
ApplyRetrieveRule() would then do:

    rule_action = fireRIRrules(rule_action, activeRIRs,
                               &parsetree->hasRowSecurity);

rather than having a separate second step to update the flag on
"parsetree", and similarly for fireRIRrules()'s recursive calls to
itself. If, in the future, it becomes necessary to invoke
fireRIRrules() on more parts of a Query, it's then much more likely
that the new code won't forget to update the parent query's flag.
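
As a rough standalone illustration of that calling convention (a toy model only, not the real rewriter: ToyQuery, touches_rls_table and toy_rewrite() are invented names), the out-parameter makes every caller, including the recursive one, supply a place for the flag to land:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Query; every field here is hypothetical. */
typedef struct ToyQuery
{
	bool		touches_rls_table;	/* stand-in for "a relation has RLS" */
	bool		hasRowSecurity;
	struct ToyQuery *subquery;		/* at most one nested query, or NULL */
} ToyQuery;

/*
 * Rewrite "query" and set *hasRowSecurity to true if row security applies
 * anywhere inside it.  The flag is only ever set, never cleared, so a
 * caller can pass the address of its parent's flag directly.
 */
static ToyQuery *
toy_rewrite(ToyQuery *query, bool *hasRowSecurity)
{
	if (query->touches_rls_table)
		query->hasRowSecurity = true;

	/* The recursive call reports straight into this query's own flag. */
	if (query->subquery != NULL)
		query->subquery = toy_rewrite(query->subquery,
									  &query->hasRowSecurity);

	if (query->hasRowSecurity)
		*hasRowSecurity = true;

	return query;
}
```

Because the parameter is mandatory in this sketch, a new call site cannot compile without deciding where the result goes.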

Regards,
Dean


From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2024-12-02 16:34:21
Message-ID: Z03hjW3U9yERkDdG@nathan
Lists: pgsql-hackers

On Fri, Nov 29, 2024 at 10:01:41AM +0000, Dean Rasheed wrote:
> On Thu, 21 Nov 2024 at 18:00, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> The attached patch accomplishes this by establishing a global queue of
>> row-security "nest levels" that the aforementioned higher-level callers can
>> use.
>
> I'm not convinced that this is an improvement.

Thanks for reviewing.

> The code in check_sql_fn_retval() is building a Query struct from
> scratch, so it seems perfectly natural for it to be responsible for
> setting all the required fields, based on the information it has
> available. With this patch, check_sql_fn_retval() is returning a
> potentially incorrectly marked Query at the end of the querytree list,
> which the caller is responsible for fixing up, which doesn't seem
> ideal.

While it is indeed natural for the code that builds a Query to be
responsible for setting it correctly, unfortunately there's no backstop if
someone forgets to do so (as was the case in the recent CVE). I don't
think my v1 patch would necessarily prevent all such problems, but I do
think it would help prevent some.

> There is exactly one place where RLS policies are applied, and it
> seems much more natural for it to have responsibility for setting this
> flag. I think that a slightly neater way for it to handle that would
> be to modify fireRIRrules(), adding an extra parameter "bool
> *hasRowSecurity" that it would set to true if RLS is enabled for the
> query it is rewriting. Doing that forces all callers to think about
> whether or not that affects some outer query. For example,
> ApplyRetrieveRule() would then do:
>
>     rule_action = fireRIRrules(rule_action, activeRIRs,
>                                &parsetree->hasRowSecurity);
>
> rather than having a separate second step to update the flag on
> "parsetree", and similarly for fireRIRrules()'s recursive calls to
> itself. If, in the future, it becomes necessary to invoke
> fireRIRrules() on more parts of a Query, it's then much more likely
> that the new code won't forget to update the parent query's flag.

I've attempted this in the attached v2 patch. I do think this is an
improvement over the status quo, but I worry that it doesn't go far enough.

--
nathan

Attachment Content-Type Size
v2-0001-revamp-row-security-tracking.patch text/plain 5.8 KB

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2025-02-17 17:53:00
Message-ID: Z7N3fIDbIGIzGByR@nathan
Lists: pgsql-hackers

Given there doesn't seem to be a huge amount of interest in this, I plan to
mark it as Withdrawn soon.

--
nathan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2025-02-17 18:08:29
Message-ID: [email protected]
Lists: pgsql-hackers

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
> Given there doesn't seem to be a huge amount of interest in this, I plan to
> mark it as Withdrawn soon.

I think you're being too impatient. It's still an interesting
topic, it just needs more thought to get to something committable.

I find this has-row-security marking problem to be comparable
to the has-sublinks marking problem. We've had tons of
bugs-of-omission with that too, and the present code feels
ugly and not any less prone to omissions than it ever was.

I wonder whether considering both problems together would yield any
insights, following Polya's dictum that "the more general problem may
be easier to solve".

One straightforward idea is to just not do the marking at all,
but rather require places that want to know these properties
to do a fresh search of the query tree when they want to know
it. That obviously has performance questions to answer, but
it's easier to give answers to performance questions than
"is this correct" questions.
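
A toy sketch of what "search when asked" could look like, with one walker serving both properties. Nothing here is PostgreSQL code: TreeQuery, its fields, and the function names are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy query tree node; all fields are hypothetical. */
typedef struct TreeQuery
{
	bool		touches_rls_table;	/* stand-in for "a relation has RLS" */
	bool		contains_sublink;	/* stand-in for "has a SubLink" */
	struct TreeQuery **children;	/* nested queries, or NULL */
	size_t		nchildren;
} TreeQuery;

typedef bool (*tree_predicate) (const TreeQuery *q);

/* Return true if "pred" holds for this node or any node beneath it. */
static bool
query_tree_any(const TreeQuery *q, tree_predicate pred)
{
	if (q == NULL)
		return false;
	if (pred(q))
		return true;
	for (size_t i = 0; i < q->nchildren; i++)
	{
		if (query_tree_any(q->children[i], pred))
			return true;
	}
	return false;
}

static bool
node_has_rls(const TreeQuery *q)
{
	return q->touches_rls_table;
}

static bool
node_has_sublink(const TreeQuery *q)
{
	return q->contains_sublink;
}
```

No stored flag can go stale in this version. The cost is one tree traversal per question, which is the performance question mentioned above.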

regards, tom lane


From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2025-02-17 21:42:44
Message-ID: Z7OtVDMEs_EKpu6d@nathan
Lists: pgsql-hackers

On Mon, Feb 17, 2025 at 01:08:29PM -0500, Tom Lane wrote:
> I think you're being too impatient. It's still an interesting
> topic, it just needs more thought to get to something committable.

Maybe I am. Thanks for chiming in.

> I find this has-row-security marking problem to be comparable
> to the has-sublinks marking problem. We've had tons of
> bugs-of-omission with that too, and the present code feels
> ugly and not any less prone to omissions than it ever was.
>
> I wonder whether considering both problems together would yield any
> insights, following Polya's dictum that "the more general problem may
> be easier to solve".
>
> One straightforward idea is to just not do the marking at all,
> but rather require places that want to know these properties
> to do a fresh search of the query tree when they want to know
> it. That obviously has performance questions to answer, but
> it's easier to give answers to performance questions than
> "is this correct" questions.

That could be worth a try. The reason I started with the global queue idea
was that we seem to reliably discover relations with RLS enabled, we just
tend to miss propagating that information to the top-level query. We could
invent a separate query tree walker for discovering RLS, etc. to keep it
centralized. However, besides the performance questions, it would be
another separate piece of code to keep updated. Perhaps another variation
on this idea is to create a query walker that just looks for hasRowSecurity
flags throughout the tree (and to use that to mark the plan cache entry
appropriately).

--
nathan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: revamp row-security tracking
Date: 2025-02-17 21:54:56
Message-ID: [email protected]
Lists: pgsql-hackers

Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
> Perhaps another variation
> on this idea is to create a query walker that just looks for hasRowSecurity
> flags throughout the tree (and to use that to mark the plan cache entry
> appropriately).

That seems like a pretty plausible compromise position. So we'd
redefine Query.hasRowSecurity as summarizing the situation for only
the Query's own rtable entries, not recursively for sub-Queries.
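
In toy form (a sketch under assumed names, not actual PostgreSQL structures: FlagQuery, rte_has_rls, mark_local() and any_row_security() are invented here), that split would look something like this: each query's flag covers only its own range-table entries, and a separate walker combines the flags when a whole-tree answer is needed, for example when marking a plan cache entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy query node; all fields are hypothetical. */
typedef struct FlagQuery
{
	const bool *rte_has_rls;	/* one entry per range-table entry */
	size_t		nrtes;
	bool		hasRowSecurity; /* summarizes this query's own entries only */
	struct FlagQuery **subqueries;	/* nested queries, or NULL */
	size_t		nsubqueries;
} FlagQuery;

/* Set the local flag from this query's own entries; no recursion. */
static void
mark_local(FlagQuery *q)
{
	q->hasRowSecurity = false;
	for (size_t i = 0; i < q->nrtes; i++)
	{
		if (q->rte_has_rls[i])
			q->hasRowSecurity = true;
	}
}

/* Walk the tree and report whether any query's local flag is set. */
static bool
any_row_security(const FlagQuery *q)
{
	if (q->hasRowSecurity)
		return true;
	for (size_t i = 0; i < q->nsubqueries; i++)
	{
		if (any_row_security(q->subqueries[i]))
			return true;
	}
	return false;
}
```

With this arrangement nobody has to propagate the flag upward by hand, which is where the omissions have tended to happen.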

regards, tom lane