Hi folks!
I was playing with LTO and llc and found something that may be of interest to the community.
Consider this simple code:
int foo();
void bar() {
foo();
}
clang --target=riscv32 -march=rv32imafdc -mabi=ilp32d riscv-relax.c -c -o riscv-relax.o -save-temps=obj -O3
Relocations:
00000000 <bar>:
0: 17 03 00 00 auipc t1, 0
00000000: R_RISCV_CALL_PLT foo
00000000: R_RISCV_RELAX *ABS*
4: 67 00 03 00 jr t1 <bar>
Rebuilding from the generated bitcode:
llc riscv-relax.bc --filetype=obj -o relax.o
Relocations:
00000000 <bar>:
0: 41 11 <unknown>
2: 06 c6 <unknown>
4: 97 00 00 00 auipc ra, 0
00000004: R_RISCV_CALL_PLT foo
8: e7 80 00 00 jalr ra <bar+0x4>
c: b2 40 <unknown>
e: 41 01 <unknown>
10: 82 80 <unknown>
The code itself is different because we are not running opt before llc, and that is fine. The point is that the R_RISCV_RELAX relocation is gone! This means that when we compile from IR (llc or LTO; the plain Clang workflow produces the expected result), we cannot benefit from linker relaxation.
After some investigation, I found the cause: a misuse of the subtarget class. In the normal sequence (Clang), we can assume that the whole module has the same feature flags, so the target knows them in advance and they stay more or less constant for everything related to this problem. When we load a .ll/.bc file, every function can have a quite different set of feature flags.
What we can trust during the whole compilation lifetime (all code): Target Triple.
What we cannot trust using a single object during the whole compilation lifetime (all code): Feature flags, because they can be different across functions, so they can represent different subtargets.
In some parts of the RISC-V backend, we trust the feature flags of a single initial subtarget (empty for a .bc file), which may not be valid for all functions of this CU. This is especially problematic for the pure IR workflow.
Example:
bool RISCVAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
                                            const MCFixup &Fixup,
                                            const MCValue &Target) {
  if (Fixup.getKind() >= FirstLiteralRelocationKind)
    return true;
  switch (Fixup.getTargetKind()) {
  default:
    break;
  case FK_Data_1:
  case FK_Data_2:
  case FK_Data_4:
  case FK_Data_8:
    if (Target.isAbsolute())
      return false;
    break;
  case RISCV::fixup_riscv_got_hi20:
  case RISCV::fixup_riscv_tls_got_hi20:
  case RISCV::fixup_riscv_tls_gd_hi20:
    return true;
  }
  return STI.hasFeature(RISCV::FeatureRelax) || ForceRelocs; // (*)
}
(*) RISCVAsmBackend is instantiated before the full target initialization, with a single STI. In fact, this subtarget is a generic one, created before any real RISC-V subtarget. The difference is that in the Clang workflow it already contains the feature flags. When running LTO or llc, the feature flags are empty, even those that are on by default, such as relaxation.
When the STI matters, it must always be passed as a parameter that matches the parent function, for example:
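To make the failure mode concrete, here is a minimal toy model. It is not LLVM code: ToySubtarget, ToyAsmBackend, forceRelocCached, forceRelocFor and the feature name "relax" are all hypothetical stand-ins chosen for illustration. It only shows how a decision taken from a subtarget cached at construction time can disagree with the flags of the function actually being emitted.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for MCSubtargetInfo: only a set of feature names.
struct ToySubtarget {
  std::set<std::string> Features;
  bool hasFeature(const std::string &F) const { return Features.count(F) != 0; }
};

// Hypothetical stand-in for the asm backend. It copies one subtarget at
// construction time and consults it for every function it later processes.
class ToyAsmBackend {
  ToySubtarget CachedSTI;

public:
  explicit ToyAsmBackend(ToySubtarget STI) : CachedSTI(std::move(STI)) {}

  // Same shape as the line marked (*): the decision uses the cached subtarget.
  bool forceRelocCached() const { return CachedSTI.hasFeature("relax"); }

  // Contrast: the caller hands in the subtarget of the current function.
  bool forceRelocFor(const ToySubtarget &FnSTI) const {
    return FnSTI.hasFeature("relax");
  }
};

// Usage: a backend built from an empty (generic) subtarget, as in the
// llc/LTO case described above.
inline void demoStaleSubtarget() {
  ToySubtarget Generic;             // no feature flags known yet
  ToySubtarget BarSTI;              // flags attached to function bar
  BarSTI.Features.insert("relax");

  ToyAsmBackend Backend(Generic);
  assert(!Backend.forceRelocCached());    // relaxation relocation is lost
  assert(Backend.forceRelocFor(BarSTI));  // per-function answer differs
}
```

In this sketch the cached query and the per-function query disagree for the same function, which is the situation described for the .bc workflow.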
void RISCVAsmBackend::relaxInstruction(MCInst &Inst,
                                       const MCSubtargetInfo &STI) const {
  /**/
}
In the code above, the code generator provides the STI of the current function, which is correct. We can fix the problem by always generating relaxation relocations, since at this stage we cannot access the real subtarget attached to the current function. The only side effect I can see is slower linking when no-relax is applied, because of branches.
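The trade-off of that proposal can be sketched with another small toy model. Again, nothing here is LLVM API: ToyFunction, RelaxPolicy, Outcome and applyPolicy are hypothetical names, and the model assumes each function simply either wants relaxation relocations or does not. It counts how many functions lose relocations they asked for, and how many receive relocations they did not ask for, under each policy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-function record. In LLVM this information would live in
// the function's own subtarget, which the backend cannot see at this stage.
struct ToyFunction {
  std::string Name;
  bool WantsRelax; // what the function's own feature flags ask for
};

// Hypothetical policy switch for the toy backend.
enum class RelaxPolicy { TrustCachedSubtarget, AlwaysEmit };

struct Outcome {
  int Missed = 0;   // wanted relaxation relocations, got none
  int Unwanted = 0; // got relaxation relocations without asking
};

// Neither policy looks at F.WantsRelax to decide; it is used only to score
// the result, mirroring the fact that the real flags are out of reach.
inline Outcome applyPolicy(const std::vector<ToyFunction> &Fns,
                           RelaxPolicy Policy, bool CachedHasRelax) {
  Outcome O;
  for (const ToyFunction &F : Fns) {
    const bool Emits = (Policy == RelaxPolicy::AlwaysEmit) || CachedHasRelax;
    if (F.WantsRelax && !Emits)
      ++O.Missed;
    if (!F.WantsRelax && Emits)
      ++O.Unwanted;
  }
  return O;
}

// Usage: one function with relaxation, one without, and an empty cached
// subtarget as in the llc/LTO case.
inline void demoPolicies() {
  const std::vector<ToyFunction> Fns = {{"bar", true}, {"baz", false}};
  const Outcome Cached =
      applyPolicy(Fns, RelaxPolicy::TrustCachedSubtarget, false);
  const Outcome Always = applyPolicy(Fns, RelaxPolicy::AlwaysEmit, false);
  assert(Cached.Missed == 1 && Cached.Unwanted == 0);
  assert(Always.Missed == 0 && Always.Unwanted == 1);
}
```

Under these assumptions, always emitting never drops a relocation that a function wanted, and the cost shows up only as extra relocations for functions built without relaxation, which the linker then has to process.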
What do you think?
Best regards!