
[RISCV][lld] Guarding lld relaxation for RISCV #130265


Closed
wants to merge 1 commit into from

Conversation

Contributor

@Sharjeel-Khan Sharjeel-Khan commented Mar 7, 2025

Based on #123248 (comment), the relaxation algorithm assumes that relaxing a call shifts every later function forward by exactly the number of bytes removed. Because some of the sections between a call and its call target are 32-byte aligned, the call section and the call target section might no longer be the same distance apart. We guard-band the relaxation so that it stops the relaxation loop and takes the last state.
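To make the alignment effect described above concrete, here is a minimal sketch of a sequential layout with per-section alignment. The `Sec` struct and the `alignTo` and `layout` helpers are hypothetical names chosen for illustration; they are not lld's data structures, and the model ignores everything except size and alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a section: size in bytes and required alignment
// (a power of two). This is an illustration, not lld's representation.
struct Sec {
  uint64_t size;
  uint64_t align;
};

// Round addr up to the next multiple of align.
uint64_t alignTo(uint64_t addr, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "power of two expected");
  return (addr + align - 1) & ~(align - 1);
}

// Assign start addresses sequentially from base, honoring each alignment.
std::vector<uint64_t> layout(const std::vector<Sec> &secs, uint64_t base) {
  std::vector<uint64_t> starts;
  uint64_t addr = base;
  for (const Sec &s : secs) {
    addr = alignTo(addr, s.align);
    starts.push_back(addr);
    addr += s.size;
  }
  return starts;
}
```

With sections `{40, 4}, {16, 32}, {8, 4}` laid out from address 0, the third section starts at 80. Shrinking the first section by 4 bytes leaves the third section at 80, because the padding in front of the 32-byte-aligned section absorbs the change. Shrinking it by 8 bytes moves the third section to 48, a jump of 32 bytes. In neither case does the later section move by the number of bytes removed.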

@llvmbot
Member

llvmbot commented Mar 7, 2025

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Sharjeel Khan (Sharjeel-Khan)

Changes

Based on #123248 (comment), the relaxation algorithm assumes that relaxing a call shifts every later function forward by exactly the number of bytes removed. Because some of the sections between a call and its call target are 32-byte aligned, the call section and the call target section might no longer be the same distance apart. We guard-band the relaxation so that it stops the relaxation loop and takes the last state.


Full diff: https://github.com/llvm/llvm-project/pull/130265.diff

1 Files Affected:

  • (modified) lld/ELF/Arch/RISCV.cpp (+3)
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index 4d8989a21b501..c8a9a7093719e 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -901,6 +901,9 @@ bool RISCV::relaxOnce(int pass) const {
   llvm::TimeTraceScope timeScope("RISC-V relaxOnce");
   if (ctx.arg.relocatable)
     return false;
+  
+  if (pass == 29)
+    return false;
 
   if (pass == 0)
     initSymbolAnchors(ctx);


github-actions bot commented Mar 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Sharjeel-Khan Sharjeel-Khan changed the title from "[RISCV][lld] Gurading lld relaxation for RISCV" to "[RISCV][lld] Guarding lld relaxation for RISCV" Mar 7, 2025
@mysterymath
Contributor

mysterymath commented Mar 7, 2025

It feels like a part of the address assignment loop where every pass is correct to emit (differing only in performance), should intrinsically never produce a failure to converge. That feels like something that should be part of the contract between the loop and the various targets.

More specifically, it seems rough to have the target deciding which pass number to quit on; this would go out of sync if we lower or raise the pass limit in the address assignment loop. It seems cleaner to have it communicate something like: "from here on in, every pass is optional". (In this case, every pass might be optional?)
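A minimal sketch of the kind of contract suggested above, in which the target reports whether a further pass is required or merely optional and the driver owns the pass limit. Everything here is hypothetical: `PassResult`, `LoopOutcome`, and `runAddressAssignment` are illustrative names, not lld's API, and the sketch assumes that an "optional" state really is valid to emit.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result a target could return from one relaxation pass.
enum class PassResult {
  Converged,       // nothing changed; the layout is final
  ChangedRequired, // layout changed and another pass is mandatory
  ChangedOptional, // layout changed, but the current state is valid to emit
};

struct LoopOutcome {
  int passesRun; // number of passes executed
  bool ok;       // false only if the budget ran out while work was required
};

// Hypothetical driver loop: the pass limit lives here, not in the target.
LoopOutcome runAddressAssignment(const std::function<PassResult(int)> &relaxOnce,
                                 int maxPasses) {
  PassResult last = PassResult::Converged;
  for (int pass = 0; pass < maxPasses; ++pass) {
    last = relaxOnce(pass);
    if (last == PassResult::Converged)
      return {pass + 1, true};
  }
  // Budget exhausted: an error only if the target still required more work.
  return {maxPasses, last != PassResult::ChangedRequired};
}
```

Under this sketch, a target that oscillates but always reports `ChangedOptional` ends with `ok == true` once the budget is spent, and changing `maxPasses` in the driver needs no matching change in the target.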

@topperc
Collaborator

topperc commented Mar 7, 2025

> It feels like a part of the address assignment loop where every pass is correct to emit (differing only in performance), should intrinsically never produce a failure to converge. That feels like something that should be part of the contract between the loop and the various targets.

The algorithm has a flaw that causes it to oscillate in some cases: it loops between 2 or 3 different states. When relaxing calls and jumps, we need to account for the maximum alignment of all sections between the call instruction's section and the call target's section. When one section shrinks, the start addresses of the sections after it may move by less than the number of bytes removed because of alignment requirements. This non-linear movement confuses the algorithm. The maximum alignment is not readily available, and looping over the sections to compute it on the fly may be expensive.

I've been told that the binutils linker hit the same issue in the past. I know it checks the maximum alignment, and I was told it also has a maximum number of passes.
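One way to read "account for the maximum alignment" is as a safety margin on the range check that decides whether a call may be relaxed to a single `jal`. The sketch below is an assumption-laden illustration of that idea, not lld's or binutils' actual code: `fitsJal` and `canRelaxCallToJal` are hypothetical names, and `maxAlignBetween` is assumed to be already known, which is exactly the part described above as not readily available.

```cpp
#include <cassert>
#include <cstdint>

// JAL encodes a signed 21-bit offset with bit 0 implied zero, so the
// reachable displacements are [-2^20, 2^20 - 2]. Only the range is checked.
bool fitsJal(int64_t disp) {
  const int64_t lo = -(int64_t(1) << 20);
  const int64_t hi = (int64_t(1) << 20) - 2;
  return disp >= lo && disp <= hi;
}

// Hypothetical conservative check: widen the displacement by the largest
// alignment of any section between the call and its target, so that later
// padding changes cannot push the target out of range.
bool canRelaxCallToJal(int64_t disp, uint64_t maxAlignBetween) {
  assert(maxAlignBetween != 0 && "alignment must be at least 1");
  const int64_t margin = static_cast<int64_t>(maxAlignBetween);
  const int64_t worst = disp < 0 ? disp - margin : disp + margin;
  return fitsJal(worst);
}
```

The trade-off in this sketch is that calls within `maxAlignBetween` bytes of the `jal` limit stay unrelaxed even when they would have fit, in exchange for a decision that does not flip when alignment padding grows.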

> More specifically, it seems rough to have the target deciding which pass number to quit on; this would go out of sync if we lower or raise the pass limit in the address assignment loop. It seems cleaner to have it communicate something like: "from here on in, every pass is optional". (In this case, every pass might be optional?)

I think we don't want RISC-V relaxation to ever fail, so we should probably remove the convergence failure check from the caller for RISC-V relaxation and let RISC-V choose when to give up.

@Sharjeel-Khan
Contributor Author

Ping to see if anything else needs to be done to get approved

@topperc topperc requested review from jrtc27 and lenary March 19, 2025 21:25
@topperc
Collaborator

topperc commented Mar 19, 2025

> Ping to see if anything else needs to be done to get approved

Have you done any additional testing? I only tested it on the one case from the PR.

@jrtc27
Collaborator

jrtc27 commented Mar 19, 2025

So yeah, this is clearly a hack to work around a more fundamental design flaw in the implementation. I don't like it and would prefer it not be there, but if this is the only viable medium-term solution then I guess I won't object to it if we really need something. But really @MaskRay is the one to talk to about this.

@topperc
Collaborator

topperc commented Mar 19, 2025

> So yeah, this is clearly a hack to work around a more fundamental design flaw in the implementation. I don't like it and would prefer it not be there, but if this is the only viable medium-term solution then I guess I won't object to it if we really need something. But really @MaskRay is the one to talk to about this.

I think @palmer-dabbelt or @kito-cheng told me that bfd also has a max iteration limit where they give up regardless of convergence.

@pirama-arumuga-nainar
Collaborator

> But really @MaskRay is the one to talk to about this.

@MaskRay any thoughts on this workaround?

@MaskRay
Member

MaskRay commented Apr 9, 2025

> > But really @MaskRay is the one to talk to about this.
>
> @MaskRay any thoughts on this workaround?

Sorry for the delayed response. I am struggling to keep up with LLVM. Lately I’ve been swamped with a surge of linker-related work. (I thought lld was pretty stable, but lately there’s been GNU_PROPERTY from RISC-V and AArch64, a bunch of new AArch64 options, linker relaxation stuff for LoongArch, --icf=all, --why-live...)

I don't think this is acceptable. I'll check how to handle alignment in a better way.

@pirama-arumuga-nainar
Collaborator

> Sorry for the delayed response. I am struggling to keep up with LLVM.

No worries, thanks for being involved in all the lld features that you mentioned.

@topperc
Collaborator

topperc commented May 23, 2025

@Sharjeel-Khan did this get fixed some other way?

@Sharjeel-Khan
Contributor Author

No. @MaskRay did not find this acceptable and said he would look for a better way to handle the alignment, so I closed this PR. I am keeping the issue open until it gets fixed.

@Sharjeel-Khan Sharjeel-Khan deleted the lld-relax branch August 8, 2025 05:56

7 participants