This RFC consists of two main parts
- the addition of nesting information to the diagnostics (engine) and consumers
- the introduction of a new diagnostics output format to take advantage of that information.
For the current state of the implementation, see [Clang] [Diagnostics] Add a new text diagnostics format that supports nested diagnostics by Sirraide · Pull Request #151234 · llvm/llvm-project · GitHub.
Both parts are described in greater detail below, but to summarise, I think it would be beneficial to be able to arrange diagnostics hierarchically beyond the simple ‘notes are attached to warnings/errors’ model that we have at the moment. However, altering the way in which we currently render diagnostics to the terminal would probably be too much of a breaking change for a number of people, so I propose adding a new format instead.
Thanks also to @AaronBallman for feedback on this RFC.
1. Nesting
In the current implementation, Diagnostic
, StoredDiagnostic
, and DiagnosticStorage
now have a NestingLevel
member, which is an (optional) unsigned
value. The diagnostics engine keeps track of a ‘global nesting level’, which can be incremented via an RAII object; when a diagnostic is emitted (i.e. passed to the DiagnosticConsumer
), the nesting level of the diagnostic is set to the current global nesting level (or that level +1 if it is a note). This means that, by default, the nesting level is 1
for note and 0
for any other type of diagnostic.
StoredDiagnostic
and DiagnosticStorage
store an UnsignedOrNone
, which is unset if the nesting level hasn’t been computed yet; if set, the nesting level will not be recomputed when the diagnostic is emitted. This is to allow diagnostic consumers to store diagnostics and emit them later on without altering their nesting level.
How to Make Use of Nesting
On the consumer side, in and of itself, the nesting level does nothing; it is up to diagnostic consumers to make use of it.
On the producer side, the intended (and indeed only) way to nest diagnostics is to use the aforementioned RAII object (DiagnosticsEngine::NestingLevelRAII
); I’ve introduced nesting to a few of the DiagnoseTypeTraitDetails
diagnostics since they were previously pointed out to me as being good candidates for this. An example of this is the following:
static void DiagnoseNonTriviallyCopyableReason(Sema &SemaRef,
SourceLocation Loc, QualType T) {
SemaRef.Diag(Loc, diag::note_unsatisfied_trait)
<< T << diag::TraitName::TriviallyCopyable;
+ DiagnosticsEngine::NestingLevelRAII IncrementNestingLevel{SemaRef.Diags};
if (T->isReferenceType())
SemaRef.Diag(Loc, diag::note_unsatisfied_trait_reason)
<< diag::TraitNotSatisfiedReason::Ref;
const CXXRecordDecl *D = T->getAsCXXRecordDecl();
if (!D || D->isInvalidDecl())
return;
if (D->hasDefinition())
DiagnoseNonTriviallyCopyableReason(SemaRef, Loc, D);
+ IncrementNestingLevel.Increment();
SemaRef.Diag(D->getLocation(), diag::note_defined_here) << D;
}
(The Increment()
function allows ‘reusing’ an RAII object rather than having to create a new one every time you want to increment the nesting level. It’s possible to make do w/o it by just adding another scope—which is what I did initially—but I ended up including it because after migrating a mere 4 or so functions, I was already using it in 2 places, so chances are there will be a quite a few more cases where it will come in handy…)
Basically, if you know you’re about to emit diagnostics which you’d like to be more deeply nested, you simply increment the global nesting level, and then any code after that doesn’t have to know or care about what nesting level it’s at. This is also (currently) the only way to alter the nesting level of a diagnostic. The intention of this is to make it possible to introduce nesting without having to fundamentally alter the way (or order) in which diagnostics are emitted—though of course, there will most likely be places where we will want to perform some amount of rearranging or grouping to take full advantage of nesting.
I briefly experimented with making it possible to write something along the lines of Diag(...) << diag::AddNestingLevel(1)
, which would add an increment to the nesting level of a single diagnostic, but I found the RAII object to be both simpler to implement and easier to reason about. One thing we definitely don’t want to start doing is setting the nesting level to an absolute value (e.g. Diag(...) << diag::SetNestingLevel(5)
or whatever), since that would break in the presence of recursive nesting.
2. The New ‘Fancy’ Diagnostics Format
Pass -fdiagnostics-format=fancy
to enable it. If someone can think of a better name for this, please let me know (note: I didn’t want to call this the ‘nested’ diagnostic format or anything like that because it does more than just add nesting).
While we could simply update the existing text diagnostic printer to e.g. indent diagnostics by some amount for each nesting level, I don’t think this is a good idea: the current format is fairly well-established, and I wouldn’t be surprised if doing even something as simple as adding leading indentation would mess with systems that expect <file>:<line>:<col>: error:
at the start of a line.
Moreover, I would argue that our current default format has a few shortcomings, though this might be fairly subjective (a while back I did look into a few papers about writing ergonomic diagnostics in the course of this, but I don’t consider this to be an exact science…); e.g. consider this diagnostic:
In file included from test.cc:17:
/usr/lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/variant:103:21: error: static assertion failed due to requirement '2UL < sizeof...(_Types)'
103 | static_assert(_Np < sizeof...(_Types));
| ^~~~~~~~~~~~~~~~~~~~~~~
I’d argue that the first thing you’d probably want to see when looking at a compiler error is the actual error message, but in our current format, you first have to make it past the include stack and the file path; in this case, the path is so long that the error message flows off the screen (I’ve been working on improving that part too, but it’s turned out to be more complicated than expected; see #148745 for that).
Example
So, what I suggest we do instead is add a completely new format which can both make use of nesting and hopefully be a bit more ergonomic. In the present version of this format, the error above would end up being formatted as
error: static assertion failed due to requirement '2UL < sizeof...(_Types)'
|
| - included from test.cc:17:10:
| - at /usr/lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/variant:103:21:
|
| 103 | static_assert(_Np < sizeof...(_Types));
| | ^~~~~~~~~~~~~~~~~~~~~~~
which I would argue is a bit clearer structurally speaking. Of course, this is just what I’ve managed to arrive at; if anyone has any suggestions as to how to improve on this (or if you find it completely unreadable or don’t like it at all), please let me know.
Connecting Lines
The line in the first column might seem a bit odd in isolation, but it’s there to join to any nested diagnostics following it, e.g.
error: static assertion failed due to requirement '__is_trivially_copyable(Y)'
|
| - at test.cc:14:15:
|
| 14 | static_assert(__is_trivially_copyable(Y));
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~
|
|-- note: 'Y' is not trivially copyable
|
| - at test.cc:14:15:
|
|
|------ note: because it has a user provided copy constructor
|
| - at test.cc:14:15:
|
| 6 | struct Y { Y(const Y&); };
| | ~~~~~~~~~~~
| 7 |
| 8 | void g() {
| 9 | FOO
| 10 | }
| 11 |
| 12 | static_assert(__is_assignable(int&, void));
| 13 | static_assert(__is_empty(X&));
| 14 | static_assert(__is_trivially_copyable(Y));
| | ^
|
|---------- note: 'Y' defined here
|
| - at test.cc:6:8:
|
| 6 | struct Y { Y(const Y&); };
| | ^
I will say, I personally prefer this without the lines connecting to the notes (I think just the nesting+spacing is enough), but @AaronBallman liked them so I’ve included them here.
Alternative Formatting Ideas and Buffering
Certain more sophisticated formatting operations would require us to buffer diagnostics. Consider for instance a group of nested diagnostics that we might want to print in the following hypothetical format:
A
|
|-- B
| |
| |-- C
|
|-- D
|
|-- E
Printing this is impossible without buffering: when printing C
, we need to also print the |
characters in the first column that make up the line connecting to D
; conversely, when printing E
, we don’t print any |
s in the first column, because there is nothing else to connect to. The issue, clearly, is that we have no way of knowing whether there will be some future diagnostic that we need to draw a connecting line to… until we either see that diagnostic or another top-level diagnostic.
As a compromise, the current formatter resorts to printing a tree structure more similar to the following (the indentation is the same; it’s just the connecting lines that are different):
A
|
|-- B
|
|------ C
|
|-- D
|
|------ E
I personally don’t think this looks too good, but I don’t think we can do much better if we want to both draw connecting lines and avoid buffering. At the same time, my plan at least for this format is for us to progressively improve it, which might eventually entail buffering diagnostics. I’ve been experimenting with this and I think it’s not too hard to do, but I also feel like something like that is best left to a follow-up pr.
And just for comparison, if we were to get rid of the lines entirely, the error above would look like this:
error: static assertion failed due to requirement '__is_trivially_copyable(Y)'
- at test.cc:14:15:
14 | static_assert(__is_trivially_copyable(Y));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
note: 'Y' is not trivially copyable
- at test.cc:14:15:
note: because it has a user provided copy constructor
- at test.cc:14:15:
6 | struct Y { Y(const Y&); };
| ~~~~~~~~~~~
7 |
8 | void g() {
9 | FOO
10 | }
11 |
12 | static_assert(__is_assignable(int&, void));
13 | static_assert(__is_empty(X&));
14 | static_assert(__is_trivially_copyable(Y));
| ^
note: 'Y' defined here
- at test.cc:6:8:
6 | struct Y { Y(const Y&); };
| ^
Potential Applications
In addition to experimenting with buffering diagnostics, I can think of several places in Clang where we would be able to make use of nesting: diagnostics around overload resolution, template instantiations, and recursive calls in the constant evaluator are things that I think would become more comprehensible if we actually made the underlying hierarchical structure more obvious instead of just emitting a flood of errors or notes all next to each other. Of course, that will probably require some more sophisticated refactoring than just incrementing the nesting level in a few places.
If anyone can think of other diagnostics that could make use of nesting, please let me know so I can add them to my list of things to look into.