[ub] added many missing entries to UB annex #7864

Merged
merged 9 commits into from
May 14, 2025

Conversation

notadragon
Contributor

No description provided.

@timuraudio timuraudio left a comment

The following instance of explicit language UB still seems to be missing from this PR:

[basic.compound]/4: If a pointer value P is used in an evaluation E and P is not valid in the context of E, then the behavior is undefined if E is an indirection ([expr.unary.op]) or an invocation of a deallocation function ([basic.stc.dynamic.deallocation]).

In addition, the PR missed some of the cases where a single ubref should be split into two because there are actually two related but distinct instances of explicit language UB:

intro.object.implicit.create
expr.dynamic.cast.lifetime
expr.delete.mismatch
class.cdtor.convert.or.form.pointer
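
For illustration, the [basic.compound]/4 case quoted above can be sketched as follows. This is a minimal, hypothetical example (the function name `read_before_delete` is not from the PR); the two undefined operations are left as comments so that the code itself stays well-defined.

```cpp
#include <cassert>

// Hypothetical helper: reads through a pointer while it is still valid,
// then shows (in comments only) the two uses that become undefined.
int read_before_delete() {
    int* p = new int(7);
    int value = *p;   // fine: p is a valid pointer value here
    assert(value == 7);
    delete p;         // the storage is released; p is now an invalid pointer value
    // *p;            // indirection through an invalid pointer value: undefined
    // delete p;      // passing it to a deallocation function: undefined
    p = nullptr;      // giving p a new value is fine
    return value;
}
```

Calling `read_before_delete()` returns the value that was read while the pointer was still valid.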

@timuraudio
timuraudio commented May 5, 2025

In addition, I am aware of the following issues:

expr.ref.not.similar – the stable identifier should be changed to something that reflects it's actually about member access

expr.add.polymorphic – it's a weird choice of stable identifier as it's not actually about polymorphism – I would change it to expr.add.not.similar, to be consistent with expr.ref.not.similar

expr.ass.overlap – change to expr.assign.overlap

@notadragon
Contributor Author

Many of the cases where you suggest splitting up UB into multiple entries don't really seem to be needed --- I think it might help for us to pin down what exactly we consider worth having distinct entries for.

For now, I think we don't need to get more granular than absolutely needed, as long as we cover everything. We can always refine the categorization later if we need to. My priority at the moment is to be sure we have captured everything before we try to over-explain with more fine-grained categories.

@timuraudio

You are right that covering everything is more important than granularity and we can do this later, but my counter-argument is that doing it now is quick and easy and doing it later will be more expensive and complicated – there aren't any downsides to doing it now as far as I'm aware.

Note also that if every explicit mention of the word "undefined" in the Standard is immediately followed by an \ubdef, there is no ambiguity about us having covered them all and about which part each \ubdef actually refers to. You can achieve that with the following edits:

1. expr.delete.mismatch should be split into expr.delete.mismatch and expr.delete.array.mismatch. We already have a case where an instance of UB was split into single-object and array versions: expr.delete.dynamic.type.differ and expr.delete.dynamic.array.dynamic.type.differ. We should be consistent.

1a. As a related but separate issue, the existing stable identifier "expr.delete.dynamic.array.dynamic.type.differ" seems weirdly repetitive; it should just be "expr.delete.array.dynamic.type.differ".

2. class.cdtor.convert.or.form.pointer should be split into class.cdtor.convert.pointer and class.cdtor.form.pointer. Those are different cases that warrant different examples and different error messages. The former is about converting a pointer to the object itself, whereas the latter is about accessing its members. There are subtle differences between the two cases that would benefit from being pointed out.

3. expr.dynamic.cast.lifetime should be split into expr.dynamic.cast.lifetime.pointer and expr.dynamic.cast.lifetime.glvalue. The main reason is again consistency – we already do this for lifetime.outside.pointer and lifetime.outside.glvalue.

4. intro.object.implicit.create should be split into intro.object.implicit.create.object for "If no such set of objects would give the program defined behavior, ..." and intro.object.implicit.create.ptr for "If no such pointer value would give the program defined behavior, ...". The reason is that it is actually quite hard to figure out exactly when this is UB and when it isn't, and to come up with examples. The examples for these two cases are subtly different, and having them listed separately will greatly help with understanding.
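
As a rough illustration of the single-object versus array distinction behind the proposed expr.delete.mismatch split, here is a minimal sketch. The `Tracker` type, the `live_count` counter, and the function name are hypothetical and not taken from the PR, and which proposed identifier would cover which mismatch is an assumption. The mismatched calls are left as comments so the code stays well-defined.

```cpp
#include <cassert>

// Hypothetical counter of live Tracker objects, used only to observe destruction.
int live_count = 0;

struct Tracker {
    Tracker() { ++live_count; }
    ~Tracker() { --live_count; }
};

// Returns the number of Tracker objects still alive after correctly
// paired allocation and deallocation; 0 means every destructor ran.
int paired_new_delete() {
    Tracker* one = new Tracker;       // single-object new
    Tracker* many = new Tracker[3];   // array new
    assert(live_count == 4);

    delete one;      // single-object delete matches single-object new
    delete[] many;   // array delete matches array new

    // The two mismatches, left as comments because executing them is undefined:
    //   delete many;    // array released with single-object delete
    //   delete[] one;   // single object released with array delete
    return live_count;
}
```

The two commented-out lines are distinct failure modes, which is the kind of difference separate entries and separate examples could make visible.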

@hsutter hsutter added the ub-ifndr UB and IFNDR Annex label May 13, 2025
@jensmaurer
Member

Ok, let's merge this into the main pull request and fix any remaining issues there.

@jensmaurer jensmaurer merged commit 88a6384 into cplusplus:ub-ifndr May 14, 2025
0 of 2 checks passed

@shafik shafik left a comment

Thanks for all the work that went into this PR. Apologies for the late review; I caught at least a few issues, but they should be quick to fix.

\begin{codeblock}
void f() {
double d = FLT_MAX * 16;
d *= 16;

Is this necessary for this example?

I don't think so: https://godbolt.org/z/P1aqaeY79

But clang misses this case :-(

It looks like clang used to flag this, based on my old article here, so this behavior might have been a purposeful change.

This may not be UB in practice, based on CWG2723.

\begin{codeblock}
extern int &ir1;
int i2 = ir1; // undefined behavior, \tcode{ir1} not yet initialized
int ir1 = 17;

This is not right; I'm not sure what you intended here: https://godbolt.org/z/djaTE6o6Y

Contributor Author

Will fix in another PR.

{
X& px = &g();
px->~X();
int*p = px->i; // undefined behavior

A few errors here, I think you meant: https://godbolt.org/z/TvxPn18xY

Contributor Author

Will fix in another PR.

\pnum
\begin{example}
\begin{codeblock}
void g()

I did not think about this before, but perhaps examples like this are preferable since they are more verifiable: readers can see immediate feedback from the compiler: https://godbolt.org/z/zf47x1cE3

Contributor Author

I find wedging things into constexpr to be a bit distracting --- it seems like it would hide the issue the example is trying to address. Also, given how many edge cases are not caught by compilers and how often people say "my compiler lets me do this during constexpr evaluation, so it couldn't possibly be UB", I am skeptical about having the standard itself rely on such presumptions.
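
For readers unfamiliar with the technique under discussion, here is a minimal sketch of how constant evaluation can surface core-language UB. The `add` and `add_at_runtime` helpers are hypothetical and not taken from the PR or from the linked examples. The overflowing call is left as a comment because uncommenting it makes the program ill-formed.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper, not taken from the PR.
constexpr int add(int a, int b) { return a + b; }

// Well-defined, so the call is usable in a constant expression.
static_assert(add(1, 2) == 3, "evaluated at compile time");

// Signed overflow is undefined, so this initializer is not a constant
// expression and the declaration is rejected if uncommented:
//   constexpr int bad = add(INT_MAX, 1);

// The same function called at run time needs no diagnostic from the compiler.
// The guards keep this sketch itself free of overflow.
int add_at_runtime(int a, int b) {
    assert(!(b > 0 && a > INT_MAX - b));
    assert(!(b < 0 && a < INT_MIN - b));
    return add(a, b);
}
```

This only covers core-language UB that a compiler actually evaluates in a constant expression, which is the limitation raised in the comment above.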

@notadragon notadragon deleted the ub-ifndr branch May 23, 2025 17:06
Labels
ub-ifndr UB and IFNDR Annex
6 participants