Problem
Clang produces mangled names that cannot be demangled with the LLVM demangler APIs. This causes bad UX in various debugging tools (e.g., lldb, pprof). There is a long-standing history of issues filed because the llvm-cxxfilt tool is unable to demangle names produced by Clang (e.g., 43773, 73521, and more).
The root cause is that the codegen module in Clang and the LLVM demangling APIs are developed separately, and as a result they can diverge.
Background
Clang produces mangled names during code generation inside CodeGenModule, e.g., here.
Not all of the produced mangled names can be demangled by LLVM's own llvm::demangle API.
However, it's reasonable to expect that LLVM demangler can handle Clang-generated names in all cases.
For example, a large-scale analysis of OSS projects found the following numbers of Clang-generated names that the LLVM demangler cannot demangle:
- llvm/llvm-project - 843 unique instances of this discrepancy (of those, 339 in llvm, 336 in mlir, 85 in clang and 20 in lldb).
- absl libraries - 371 instances.
- protobuf libraries - 129 instances.
- gRPC libraries - 27 instances.
- tensorflow - 709 instances.
- filament - 122 instances.
- HEIR - 89 instances.
- Dawn - 193 instances.
Proposed solution
Introduce a diagnostic (remark) that would be emitted when a name cannot be demangled.
The usage would look like:
if (llvm::isMangledName(MangledName) && llvm::demangle(MangledName) == MangledName)
  Diags.Report(ND->getLocation(), diag::remark_name_cannot_be_demangled) << MangledName;
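The check relies on the convention that a failed demangling returns the input unchanged. The following is a minimal standalone sketch of that round-trip test, not the proposed Clang change itself. It assumes an Itanium C++ ABI toolchain and uses abi::__cxa_demangle from <cxxabi.h> as a stand-in for llvm::demangle so that it builds without LLVM headers. The helper names looksItaniumMangled, demangleOrSame and cannotBeDemangled are hypothetical.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical stand-in for the "is this a mangled name" check:
// only tests for the Itanium "_Z" prefix.
bool looksItaniumMangled(const std::string &Name) {
  return Name.rfind("_Z", 0) == 0;
}

// Stand-in for llvm::demangle (assumption: the C++ ABI library demangler is
// used instead). Returns the input unchanged when demangling fails, which is
// the convention the check in the proposal relies on.
std::string demangleOrSame(const std::string &Name) {
  int Status = 0;
  char *Buf = abi::__cxa_demangle(Name.c_str(), nullptr, nullptr, &Status);
  if (Status != 0 || Buf == nullptr) {
    std::free(Buf); // free(nullptr) is a no-op
    return Name;
  }
  std::string Result(Buf);
  std::free(Buf); // __cxa_demangle allocates with malloc
  return Result;
}

// True when the name looks mangled but the demangler gives it back unchanged,
// i.e. the situation in which the remark would be emitted.
bool cannotBeDemangled(const std::string &Name) {
  return looksItaniumMangled(Name) && demangleOrSame(Name) == Name;
}
```

Under these assumptions, cannotBeDemangled("_Z3foov") is false because the name demangles to foo(), while a truncated name such as "_Z3fo" is flagged. A non-mangled name such as "main" is never flagged, since the prefix check short-circuits.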
The advantages of introducing this as a remark vs. a warning:
- Remark doesn't break builds in -Werror mode.
- The information is not user-actionable, so using a warning is contentious.
The diagnostic would have its own diagnostic group (e.g., DiagGroup<"demangler">), so that it can be easily enabled and disabled.
Since the diagnostic is not immediately user-actionable and quite noisy:
- It will be off by default in the initial stage (we enable it with -R to test and fix failures).
- The diagnostic will have information on how to report it in the diagnostic message.
Roadmap
We expect the diagnostic to be very noisy initially. Therefore, we propose a gradual rollout:
1. Introduce the diagnostic and set it to off by default.
2. Enable the diagnostic selectively for LLVM and internal projects to find the current demangling failures.
3. Fix the problems found in step 2.
4. Enable the diagnostic by default for RC builds.
5. Collect and fix the problems reported by early RC adopters.
6. Enable the diagnostic by default, since by this point it should no longer be noisy.
7. (Assuming the above steps leave no long tails.) Remove the diagnostic group, so that the diagnostic cannot be disabled anymore. This step requires caution, since we don't want users to be heavily impacted in case of demangler problems.
Alternatives considered (misc)
- Adding a warning vs. a remark. The drawbacks of adding a warning are the opposite of the advantages of adding a remark mentioned above.
- Adding an assertion instead of a diagnostic. This has been rejected, since mixing build configuration (assertion) with runtime configuration (compile flag to enable or disable it) is quite unusual.
- Before generating the diagnostic, check that mangled names demangle to some expected shape (e.g., equal to the name that is mangled). This is not feasible, since mangled names don't necessarily demangle to valid C++ (e.g., _ZZN1S1fEiiEd0_NKUlvE_clEv demangles to S::f(int,int)::'lambda'()::operator()() const).
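To illustrate why the demangled string cannot be compared against a source-level name, here is a small sketch that demangles the lambda example above. It assumes an Itanium C++ ABI toolchain and uses abi::__cxa_demangle as a stand-in for llvm::demangle. The exact spelling of the lambda differs between demanglers, so the output may not match the string quoted above. The helper name demangleWithCxxAbi is hypothetical.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical helper: demangle with the C++ ABI library demangler
// (assumption: used here instead of llvm::demangle). Returns the input
// unchanged on failure.
std::string demangleWithCxxAbi(const std::string &Name) {
  int Status = 0;
  char *Buf = abi::__cxa_demangle(Name.c_str(), nullptr, nullptr, &Status);
  if (Status != 0 || Buf == nullptr) {
    std::free(Buf);
    return Name;
  }
  std::string Result(Buf);
  std::free(Buf);
  return Result;
}

// True if the text contains characters that a demangler uses to spell
// unnamed entities (quotes or braces) and that cannot occur in a C++
// qualified name.
bool hasNonCxxSpelling(const std::string &Demangled) {
  return Demangled.find_first_of("'{") != std::string::npos;
}
```

Calling demangleWithCxxAbi("_ZZN1S1fEiiEd0_NKUlvE_clEv") yields a string that names the closure type with demangler-specific notation, for which hasNonCxxSpelling returns true. A check that requires the demangled text to equal a valid C++ name would therefore reject names that demangle correctly.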