[RISCV][lld] Guarding lld relaxation for RISCV #130265
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld
Author: Sharjeel Khan (Sharjeel-Khan)
Full diff: https://github.com/llvm/llvm-project/pull/130265.diff
1 Files Affected:
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 4d8989a21b501..c8a9a7093719e 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -901,6 +901,9 @@ bool RISCV::relaxOnce(int pass) const {
llvm::TimeTraceScope timeScope("RISC-V relaxOnce");
if (ctx.arg.relocatable)
return false;
+
+ if (pass == 29)
+ return false;
if (pass == 0)
initSymbolAnchors(ctx);
✅ With the latest revision this PR passed the C/C++ code formatter.
Based on llvm#123248 (comment), the relaxation algorithm assumes that relaxing a call will shift the later function forward by the same number of bytes we removed. Because some sections between the call and its call target are 32-byte aligned, the call and call target sections might not stay the same distance apart. We guard the relaxation so that it stops the relaxation loop and takes the last state.
It feels like a part of the address assignment loop where every pass is correct to emit (differing only in performance) should intrinsically never produce a failure to converge. That feels like something that should be part of the contract between the loop and the various targets. More specifically, it seems rough to have the target deciding which pass number to quit on; this would go out of sync if we lower or raise the pass limit in the address assignment loop. It seems cleaner to have it communicate something like: "from here on in, every pass is optional". (In this case, every pass might be optional?)
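One way to picture the contract suggested above is a loop that owns the pass limit, while each pass only reports whether its change was required or optional. This is a minimal sketch under stated assumptions: `PassResult`, `LoopOutcome`, and `assignAddresses` are hypothetical names for illustration and are not lld's actual interface. It also assumes, as the comment does, that the layout after any optional pass is correct to emit.

```cpp
#include <functional>

// Hypothetical names for illustration only; this is not lld's interface.
enum class PassResult {
  Converged,       // nothing changed, the layout is final
  ChangedRequired, // something changed that must be re-run to be correct
  ChangedOptional  // something changed, but the current layout is emittable
};

struct LoopOutcome {
  int passes;     // number of passes actually run
  bool converged; // true if a pass reported no change
  bool failed;    // true only if we stopped while a required change was pending
};

// The loop owns the pass limit. A target never needs to know the limit; it
// only classifies each pass it runs.
LoopOutcome assignAddresses(const std::function<PassResult(int)> &runPass,
                            int maxPasses) {
  PassResult last = PassResult::Converged;
  for (int pass = 0; pass < maxPasses; ++pass) {
    last = runPass(pass);
    if (last == PassResult::Converged)
      return {pass + 1, true, false};
  }
  // Hitting the limit is an error only when the last change was required.
  return {maxPasses, false, last == PassResult::ChangedRequired};
}
```

With this shape, an oscillating optimization that always reports `ChangedOptional` simply stops at the limit without an error, and changing `maxPasses` does not require touching any target.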
The algorithm has a flaw that causes it to oscillate in some cases. We're looping between 2 or 3 different states. We need to account for the maximum alignment of all sections between the call instruction section and the call target section when relaxing calls and jumps. When one section shrinks, the start addresses of the sections after it may move by less than the number of bytes removed due to alignment requirements. This non-linear movement confuses the algorithm. The maximum alignment is not readily available, and looping over the sections to compute it on the fly may be expensive. I've been told that the binutils linker hit the same issue in the past. I know they check the maximum alignment and I was told there is a maximum number of passes.
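The non-linear movement described above can be checked with a little address arithmetic. The sketch below uses a made-up layout (the addresses, sizes, and the helper names `alignTo` and `targetStart` are assumptions for illustration, not lld code): a section containing the call is followed directly by a 32-byte-aligned section holding the call target.

```cpp
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
constexpr uint64_t alignTo(uint64_t addr, uint64_t align) {
  return (addr + align - 1) & ~(align - 1);
}

// Hypothetical layout: the caller section starts at callerStart and is
// callerSize bytes long; the target section follows and is 32-byte aligned.
// Returns the start address of the target section.
constexpr uint64_t targetStart(uint64_t callerStart, uint64_t callerSize) {
  return alignTo(callerStart + callerSize, 32);
}

// Removing 4 bytes (0x48 -> 0x44) does not move the target at all:
// both 0x1048 and 0x1044 round up to 0x1060.
static_assert(targetStart(0x1000, 0x48) == 0x1060, "before relaxation");
static_assert(targetStart(0x1000, 0x44) == 0x1060, "after relaxation");

// Removing another 4 bytes (0x44 -> 0x40) moves the target by 32 bytes:
// 0x1040 is already aligned, so the padding disappears.
static_assert(targetStart(0x1000, 0x40) == 0x1040, "padding removed");
```

In this toy layout the target never moves by the 4 bytes that were removed; it moves by either 0 or 32. An algorithm that assumes a one-to-one shift can therefore see a distance that is in range on one pass and out of range on the next.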
I think we don't want RISC-V relaxation to ever fail, so we should probably remove the convergence failure check from the caller for RISC-V relaxation and let RISC-V choose when to give up.
Ping to see if anything else needs to be done to get this approved.
Have you done any additional testing? I only tested it on the one case from the PR.
So yeah, this is clearly a hack to work around a more fundamental design flaw in the implementation. I don't like it and would prefer it not be there, but if this is the only viable medium-term solution then I guess I won't object to it if we really need something. But really @MaskRay is the one to talk to about this.
I think @palmer-dabbelt or @kito-cheng told me that bfd also has a max iteration limit where they give up regardless of convergence.
Sorry for the delayed response. I am struggling to keep up with LLVM. Lately I’ve been swamped with a surge of linker-related work. (I thought lld was pretty stable, but lately there’s been GNU_PROPERTY from RISC-V and AArch64, a bunch of new AArch64 options, linker relaxation stuff for LoongArch, --icf=all, --why-live...) I don't think this is acceptable. I'll check how to handle alignment in a better way.
No worries, thanks for being involved in all the lld features that you mentioned.
@Sharjeel-Khan did this get fixed some other way?
No. @MaskRay did not find this acceptable and said he would look for a better way to handle the alignment, so I closed this PR. I am keeping the issue open until it gets fixed.