[RFC][BPF] Support Jump Table #133856

Open
wants to merge 5 commits into
base: main

@yonghong-song yonghong-song commented Apr 1, 2025

NOTE: We probably need cpu v5 or other flags to enable this feature. We can add it later when necessary.

- Generate all jump tables in a single section named .jumptables.
- Represent each jump table as a symbol:
  - value points to an offset within .jumptables;
  - size encodes jump table size in bytes.
- Indirect jump is a gotox instruction:
  - dst register is an index within the table;
  - accompanied by a R_BPF_64_64 relocation pointing to a jump table
    symbol.

clang -S:

  .LJTI0_0:
          .reloc 0, FK_SecRel_8, .BPF.JT.0.0
          gotox r1
          goto LBB0_2
  LBB0_4:
          ...

          .section        .jumptables,"",@progbits
  .L0_0_set_4 = ((LBB0_4-.LBPF.JX.0.0)>>3)-1
  .L0_0_set_2 = ((LBB0_2-.LBPF.JX.0.0)>>3)-1
  ...
  .BPF.JT.0.0:
          .long   .L0_0_set_4
          .long   .L0_0_set_2
          ...

llvm-readelf -r --sections --symbols:

  Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    ...
    [ 4] .jumptables       PROGBITS        0000000000000000 000118 000100 00      0   0  1
    ...

  Relocation section '.rel.text' at offset 0x2a8 contains 2 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000010  0000000300000001 R_BPF_64_64            0000000000000000 .BPF.JT.0.0
  ...

  Symbol table '.symtab' contains 6 entries:
     Num:    Value          Size Type    Bind   Vis       Ndx Name
       ...
       2: 0000000000000000   112 FUNC    GLOBAL DEFAULT     2 foo
       3: 0000000000000000   128 NOTYPE  GLOBAL DEFAULT     4 .BPF.JT.0.0
       ...

llvm-objdump -Sdr:

  0000000000000000 <foo>:
       ...
       2:       gotox r1
                0000000000000010:  R_BPF_64_64  .BPF.JT.0.0

An option -bpf-min-jump-table-entries is implemented to control the minimum
number of entries required to use a jump table on BPF. The default value is 4,
but it can be changed with the following clang option:

  clang ... -mllvm -bpf-min-jump-table-entries=6

With this setting, a switch statement needs at least 6 cases before a
jump table is used.
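
The `.jumptables` entries in the `clang -S` listing above are assembler expressions of the form `((target - gotox) >> 3) - 1`. A minimal sketch of that arithmetic follows, assuming both labels are byte offsets in the same code section and that each entry is a 4-byte `.long` as in the listing. The helper names `jt_entry` and `jt_num_entries` are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the assembler expression ((target - gotox) >> 3) - 1 from the
 * listing. Both arguments are byte offsets within one code section, and
 * every BPF instruction slot is 8 bytes wide. */
static int32_t jt_entry(uint64_t target_off, uint64_t gotox_off)
{
	assert(target_off % 8 == 0 && gotox_off % 8 == 0);
	return (int32_t)(((int64_t)target_off - (int64_t)gotox_off) / 8) - 1;
}

/* Each entry is emitted with .long (4 bytes), so the symbol size in
 * bytes divided by 4 gives the number of entries (assumption based on
 * the listing, not on the final ABI). */
static uint32_t jt_num_entries(uint64_t sym_size_bytes)
{
	return (uint32_t)(sym_size_bytes / 4);
}
```

With the symbol size of 128 bytes shown by llvm-readelf, `jt_num_entries(128)` gives 32 entries under that assumption.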

@yonghong-song
Contributor Author

@aspsk As we discussed at LSFMMBPF, here is the implementation of llvm jump table support. Please take a look and try it with the libbpf/kernel implementations. Let me know if you hit any issues.

@4ast
Member

4ast commented Apr 1, 2025

> I could explore to use pc relative

Don't bother. x86 is doing it to save a byte in encoding. This technique doesn't apply to bpf isa.


let isIndirectBranch = 1 in {
def JX : JMP_IND<BPF_JA, "gotox", [(brind i64:$dst)]>;
}

nice to see how it should be done, I just had hardcoded it in my test branch: aspsk@98773c6

@aspsk aspsk left a comment

Thanks @yonghong-song! I will test this, match with the verification part, and post my results in this PR

@@ -65,10 +65,11 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,

setOperationAction(ISD::BR_CC, MVT::i64, Custom);
setOperationAction(ISD::BR_JT, MVT::Other, Expand);
setOperationAction(ISD::BRIND, MVT::Other, Expand);

So, this does remove restriction to not produce indirect jumps?

Is there a way to control if we want to generate indirect jumps "in general" vs., say, "only for large switches"? (Or even only for a particular switch?)

Contributor Author

> So, this does remove restriction to not produce indirect jumps?

Yes, we do not want to expand 'brind'; instead we pattern-match on 'brind'.

> Is there a way to control if we want to generate indirect jumps "in general" vs., say, "only for large switches"? (Or even only for a particular switch?)

Good point. Let me do some experiments with a flag for this. I am not sure whether I can do 'only for a particular switch', but I will investigate. Hopefully I can find a solution for that.

Contributor Author

I added an option to control how many cases a switch statement needs before a jump table is used. The default is 4 cases, but you can change it with an additional clang option. For example, to require at least 6 cases:

clang ... -mllvm -bpf-min-jump-table-entries=6

I checked other targets and there is no control for a specific switch, so I think we do not need it for now.

Awesome, thanks!

@aspsk

aspsk commented Apr 1, 2025

> With the above information, the size for each sub-rodata can be found easily

@yonghong-song could you please elaborate on this? How exactly should those be classified per table?

@yonghong-song
Contributor Author

> With the above information, the size for each sub-rodata can be found easily
>
> @yonghong-song could you please elaborate on this? How exactly should those be classified per table?

Below is an example for test_tc_tunnel.bpf.o built with -mllvm -bpf-min-jump-table-entries=1, so that the code contains more jump tables. First you can check the '.rel.rodata' section. The following is the output of llvm-readelf -r test_tc_tunnel.bpf.o:

Relocation section '.rel.text' at offset 0xdd40 contains 9 entries:                                                            
    Offset             Info             Type               Symbol's Value  Symbol's Name                                       
0000000000000038  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000120  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000170  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000268  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000330  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
00000000000003a8  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
00000000000003e0  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000420  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
0000000000000460  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
                                                                                                                               
Relocation section '.reldecap' at offset 0xddd0 contains 3 entries:                                                            
    Offset             Info             Type               Symbol's Value  Symbol's Name                                       
0000000000000048  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
00000000000000f0  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             
00000000000001a0  000000020000000a R_BPF_64_32            0000000000000000 .text 

The .rodata relocations above are what you really care about. As you can see, all .rodata relocations occur in the decap and .text sections.

Relocation section '.rel.rodata' at offset 0xde00 contains 43 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000000  0000001500000002 R_BPF_64_ABS64         0000000000000000 decap
0000000000000008  0000001500000002 R_BPF_64_ABS64         0000000000000000 decap
0000000000000010  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
0000000000000018  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
0000000000000020  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
0000000000000028  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
...
0000000000000150  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text

You then need to go through sections 'decap' and '.text' for their .rodata relocations.
For example, for

0000000000000038  0000001700000001 R_BPF_64_64            0000000000000000 .rodata                                             

It corresponds to insn 7 (0x38/8 = 7).

       7:       18 03 00 00 80 00 00 00 00 00 00 00 00 00 00 00 r3 = 0x80 ll
                0000000000000038:  R_BPF_64_64  .rodata

Here 'r3 = 0x80' means the relocation target starts at offset 0x80 in the .rodata section.

You need to scan ALL such relocations in the .text and decap sections and sort them by start offset. After that, you can calculate the size of each relocated region.
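
The sort-then-measure step can be sketched as follows. This is only an illustration: it assumes the size of each region is the gap up to the next distinct start offset (or to the end of .rodata for the last one), which is one reading of the description above. The names `rodata_region`, `compute_region_sizes`, and `insn_index` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct rodata_region {
	uint64_t start; /* offset into .rodata, taken from the ld_imm64 immediate */
	uint64_t size;  /* filled in by compute_region_sizes() */
};

/* A relocation at byte offset off patches instruction off / 8,
 * e.g. 0x38 / 8 = 7 as in the example above. */
static uint64_t insn_index(uint64_t reloc_off)
{
	return reloc_off / 8;
}

static int cmp_start(const void *a, const void *b)
{
	const struct rodata_region *ra = a, *rb = b;

	return (ra->start > rb->start) - (ra->start < rb->start);
}

/* Sorts regions by start offset, drops duplicate starts, and sets each
 * size to the gap up to the next start (or to sec_size for the last
 * region). Returns the number of unique regions. */
static size_t compute_region_sizes(struct rodata_region *r, size_t n,
				   uint64_t sec_size)
{
	size_t i, uniq = 0;

	qsort(r, n, sizeof(*r), cmp_start);
	for (i = 0; i < n; i++) {
		if (uniq && r[uniq - 1].start == r[i].start)
			continue;
		r[uniq++] = r[i];
	}
	for (i = 0; i < uniq; i++) {
		uint64_t end = (i + 1 < uniq) ? r[i + 1].start : sec_size;

		assert(end >= r[i].start);
		r[i].size = end - r[i].start;
	}
	return uniq;
}
```

Note that regions found this way may also cover non-jump-table data such as strings, which is why the later check against gotox is still needed.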

After you have calculated each region size (for the .rodata section), you need to check whether a particular relocation is for gotox or for something else. To do that, you need to scan backwards. For example,

      35:       67 03 00 00 03 00 00 00 r3 <<= 0x3
      36:       18 02 00 00 40 01 00 00 00 00 00 00 00 00 00 00 r2 = 0x140 ll
                0000000000000120:  R_BPF_64_64  .rodata
      38:       0f 32 00 00 00 00 00 00 r2 += r3
      39:       79 22 00 00 00 00 00 00 r2 = *(u64 *)(r2 + 0x0)
      40:       0d 02 00 00 00 00 00 00 gotox r2

You find a gotox insn with target r2, then you go back and find 'r2 = *(u64 *)(r2 + 0x0)', then 'r2 += r3', and then 'r2 = 0x140 ll'. This code pattern is generated by llvm and should generally hold for jump table implementations. You can then be certain that the table for this particular gotox is at offset 0x140 of the .rodata section. The size of the table has already been calculated by the previous mechanism, that is, by scanning all .rodata relocations in the .text and decap sections.
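
A minimal sketch of this backward scan is shown below. It assumes the four instructions are strictly adjacent, as in the dump above, which real code may not guarantee. The opcode bytes (0x18, 0x0f, 0x79, 0x0d) are read off the hex dump; the struct layout and the name `find_jt_offset` are hypothetical and not libbpf API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 8-byte BPF instruction slot. regs holds dst in the low nibble and
 * src in the high nibble, matching the raw bytes in the objdump output. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

#define OP_LD_IMM64 0x18 /* rX = imm ll (occupies two slots) */
#define OP_ADD64_X  0x0f /* rX += rY */
#define OP_LDX_DW   0x79 /* rX = *(u64 *)(rY + off) */
#define OP_GOTOX    0x0d /* gotox rX, as encoded in the dump above */

static int dst(const struct insn *i) { return i->regs & 0xf; }
static int src(const struct insn *i) { return i->regs >> 4; }

/* Walks backwards from the gotox at index idx and returns the .rodata
 * offset loaded by the matching ld_imm64, or -1 if the pattern does
 * not match. */
static int64_t find_jt_offset(const struct insn *p, int idx)
{
	int reg, i = idx;

	assert(p != NULL);
	if (idx < 4 || p[i].code != OP_GOTOX)
		return -1;
	reg = dst(&p[i]);

	i--; /* rX = *(u64 *)(rY + 0) */
	if (p[i].code != OP_LDX_DW || dst(&p[i]) != reg || p[i].off != 0)
		return -1;
	reg = src(&p[i]);

	i--; /* rY += rZ */
	if (p[i].code != OP_ADD64_X || dst(&p[i]) != reg)
		return -1;

	i -= 2; /* rY = <offset> ll, skipping the second ld_imm64 slot */
	if (p[i].code != OP_LD_IMM64 || dst(&p[i]) != reg)
		return -1;
	return (uint32_t)p[i].imm;
}
```

Fed with instructions 35 to 40 from the dump, the function returns 0x140 for the gotox at the end of the sequence.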

@aspsk

aspsk commented Apr 2, 2025

> You find a gotox insn with target r2, then you go back and find 'r2 = *(u64 *)(r2 + 0x0)', then 'r2 += r3', and then 'r2 = 0x140 ll'. This code pattern is generated by llvm and should generally hold for jump table implementations. You can then be certain that the table for this particular gotox is at offset 0x140 of the .rodata section. The size of the table has already been calculated by the previous mechanism, that is, by scanning all .rodata relocations in the .text and decap sections.

I am looking into how to automate this properly (I have a really hacky PoC test working with this version of llvm and my custom test). It looks simpler with explicit jump tables (when I take an address of a label and store in an array), because then I can just push values to a custom section.

Will post updates here.

@yonghong-song
Contributor Author

I found an llvm option, -emit-jump-table-sizes-section, which can dump the jump table section/offset/size directly. This avoids the gotox -> jump table address analysis.
The following are some details from commit 3:

    For example,
      [ 6] .rodata           PROGBITS        0000000000000000 000740 0000d6 00   A  0   0  8
      [ 7] .rel.rodata       REL             0000000000000000 003860 000080 10   I 39   6  8
      [ 8] .llvm_jump_table_sizes LLVM_JT_SIZES 0000000000000000 000816 000010 00      0   0  1
      [ 9] .rel.llvm_jump_table_sizes REL    0000000000000000 0038e0 000010 10   I 39   8  8
      ...
      [14] .llvm_jump_table_sizes LLVM_JT_SIZES 0000000000000000 000958 000010 00      0   0  1
      [15] .rel.llvm_jump_table_sizes REL    0000000000000000 003970 000010 10   I 39  14  8
    
    With llvm-readelf dump section 8 and 14:
      $ llvm-readelf -x 8 user_ringbuf_success.bpf.o
      Hex dump of section '.llvm_jump_table_sizes':
      0x00000000 00000000 00000000 04000000 00000000 ................
      $ llvm-readelf -x 14 user_ringbuf_success.bpf.o
      Hex dump of section '.llvm_jump_table_sizes':
      0x00000000 20000000 00000000 04000000 00000000  ...............
    You can see that there are two jump tables:
      jump table 1: offset 0, size 4 (4 labels)
      jump table 2: offset 0x20, size 4 (4 labels)
    
    Check sections 9 and 15, we can find the corresponding section:
      Relocation section '.rel.llvm_jump_table_sizes' at offset 0x38e0 contains 1 entries:
          Offset             Info             Type               Symbol's Value  Symbol's Name
      0000000000000000  0000000a00000002 R_BPF_64_ABS64         0000000000000000 .rodata
      Relocation section '.rel.llvm_jump_table_sizes' at offset 0x3970 contains 1 entries:
          Offset             Info             Type               Symbol's Value  Symbol's Name
      0000000000000000  0000000a00000002 R_BPF_64_ABS64         0000000000000000 .rodata
    and confirmed that the relocation is against '.rodata'.

    Dump .rodata section:
      0x00000000 a8000000 00000000 10010000 00000000 ................
      0x00000010 b8000000 00000000 c8000000 00000000 ................
      0x00000020 28040000 00000000 00050000 00000000 (...............
      0x00000030 70040000 00000000 b8040000 00000000 p...............
      0x00000040 44726169 6e207265 7475726e 65643a20 Drain returned:
    
    So we can get two jump tables:
      .rodata offset 0, # of labels 4:
      0x00000000 a8000000 00000000 10010000 00000000 ................
      0x00000010 b8000000 00000000 c8000000 00000000 ................
      .rodata offset 0x20, # of labels 4:
      0x00000020 28040000 00000000 00050000 00000000 (...............
      0x00000030 70040000 00000000 b8040000 00000000 p...............

This way, you just need to scan related code section. As long as it
matches one of jump tables (.rodata relocation, offset also matching),
you do not need to care about gotox at all.
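
The hex dumps above suggest that each `.llvm_jump_table_sizes` record is 16 bytes: a little-endian 64-bit offset (the in-place addend of the R_BPF_64_ABS64 relocation) followed by a little-endian 64-bit label count. A minimal decoding sketch under that assumption follows; `jt_size_rec` and `parse_jt_sizes` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct jt_size_rec {
	uint64_t offset;  /* offset of the table in the section named by the relocation */
	uint64_t entries; /* number of labels in the table */
};

/* Reads a little-endian 64-bit value from p. */
static uint64_t read_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Decodes 16-byte records from a .llvm_jump_table_sizes section.
 * Returns the number of records, or -1 if len is not a multiple of 16
 * or the output array is too small. */
static int parse_jt_sizes(const uint8_t *data, size_t len,
			  struct jt_size_rec *out, size_t max)
{
	size_t n = len / 16, i;

	assert(data != NULL && out != NULL);
	if (len % 16 != 0 || n > max)
		return -1;
	for (i = 0; i < n; i++) {
		out[i].offset = read_le64(data + 16 * i);
		out[i].entries = read_le64(data + 16 * i + 8);
	}
	return (int)n;
}
```

For the section 14 dump this yields offset 0x20 and 4 entries; with 8-byte entries in .rodata that corresponds to a 32-byte table.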

@yonghong-song
Contributor Author

There is one test failure, shown below:

.---command stderr------------
# | C:\ws\src\llvm\test\CodeGen\X86\jump-table-size-section.ll:86:15: error: NOFLAG-NOT: excluded string found in input
# | ; NOFLAG-NOT: .section .llvm_jump_table_sizes
# |               ^
# | <stdin>:41:2: note: found here
# |  .section .llvm_jump_table_sizes,"G",@llvm_jt_sizes,foo1,comdat
# |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | C:\ws\src\llvm\test\CodeGen\X86\jump-table-size-section.ll:154:15: error: NOFLAG-NOT: excluded string found in input
# | ; NOFLAG-NOT: .section .llvm_jump_table_sizes
# |               ^
# | <stdin>:150:2: note: found here
# |  .section .llvm_jump_table_sizes,"",@llvm_jt_sizes
# |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 
# | Input file: <stdin>
# | Check file: C:\ws\src\llvm\test\CodeGen\X86\jump-table-size-section.ll

The cause is most likely that I enable -emit-jump-table-sizes-section unconditionally. I will fix it in the next revision if this approach is okay for @aspsk.

@aspsk

aspsk commented Apr 3, 2025

Thanks @yonghong-song, that size/offset section is really useful! This looks sufficient for me to continue with a PoC.

> you do not need to care about gotox at all.

Unfortunately, I do, this is required for verification. For indirect jumps to work, two things should be verified:

    rX <- jump_table_x[i] # here jump_table_x[i] should be converted to M[i]
    ...
    gotox rX              # here on load imm=fd(M)

The gotox should reference an instance of an instruction set map M directly. This is required to 1) verify that we take each branch in visit_insn 2) verify the instruction when we symbolically run the verifier. In the latter case gotox rX is allowed iff imm=M and rX was loaded from the same map M.

So, in order to construct a verifiable program, libbpf should:

  • find all jump tables, for each jump table X:
    • create an instruction set map M(X)
    • find all loads from table X, replace by a map value load from map M(X)
  • find all gotox, backtrack the register load from map M(X), set imm=M(X) in the gotox

(Haven't checked yet for real, but this looks to be enough for "custom", e.g., user-defined, jump tables to work. Just declare it as static const, initialize with label addresses, and it will be present in .rodata. Maybe the only change is that the corresponding size section should be pushed manually in this case?)

@yonghong-song
Contributor Author

> Thanks @yonghong-song, that size/offset section is really useful! This looks sufficient for me to continue with a PoC.
>
> > you do not need to care about gotox at all.
>
> Unfortunately, I do, this is required for verification. For indirect jumps to work, two things should be verified:

You are right. Verification does need to connect the jump table map and the gotox insn.
When I said 'you do not need to care about gotox at all', I meant the libbpf part, where you need to go through various ELF sections to connect them together.

>     rX <- jump_table_x[i] # here jump_table_x[i] should be converted to M[i]
>     ...
>     gotox rX              # here on load imm=fd(M)
>
> The gotox should reference an instance of an instruction set map M directly. This is required to 1) verify that we take each branch in visit_insn 2) verify the instruction when we symbolically run the verifier. In the latter case gotox rX is allowed iff imm=M and rX was loaded from the same map M.
>
> So, in order to construct a verifiable program, libbpf should:
>
> * find all jump tables, for each jump table `X`:
>   * create an instruction set map `M(X)`
>   * find all loads from table `X`, replace by a map value load from map `M(X)`
> * find all `gotox`, backtrack the register load from map `M(X)`, set `imm=M(X)` in the `gotox`

Backtracking certainly works. But maybe there is an alternative that avoids backtracking.
For example,

        r1 = .LJTI6_0 ll
            <== we will know this is from a map (representing a jump table)
            <== so mark r1 with some info like (jump_table_addr, offset 0)
        r1 += r6
            <== r1 still like jump_table_addr, need to ensure [r6, r6 + 8] is 
                    within jump table range
        r1 = *(u64 *)(r1 + 0)
            <== r1 will be jump_table target, but still keep jump_table reference in r1.
        gotox r1
            <== goto jump_table_target (still has jump_table reference,
                    from verifier perspective, all jump table targets
                    in jump table should be verified).

> (Haven't checked yet for real, but this looks to be enough for "custom", e.g., user-defined, jump tables to work. Just declare it as static const, initialize with label addresses, and it will be present in .rodata. Maybe the only change is that the corresponding size section should be pushed manually in this case?)

Your user-defined jump table may work. But it would be great if we could simply support the common switch statements as they are today, for the sake of code cleanliness and developer productivity.

@aspsk

aspsk commented Apr 3, 2025

> Backtracking certainly works. But maybe there is an alternative that avoids backtracking. For example,
>
>         r1 = .LJTI6_0 ll
>             <== we will know this is from a map (representing a jump table)
>             <== so mark r1 with some info like (jump_table_addr, offset 0)
>         r1 += r6
>             <== r1 still like jump_table_addr, need to ensure [r6, r6 + 8] is
>                     within jump table range
>         r1 = *(u64 *)(r1 + 0)
>             <== r1 will be jump_table target, but still keep jump_table reference in r1.
>         gotox r1
>             <== goto jump_table_target (still has jump_table reference,
>                     from verifier perspective, all jump table targets
>                     in jump table should be verified).

Right, this is exactly what I meant by "backtrack". Looks like for switch this will look similar in most cases. The "need to ensure [r6, r6 + 8] is within jump table range" check might be "best effort" on the libbpf side; only the verifier needs to check this for sure. I think that from the libbpf side "r1 was loaded from map, then an offset was added" is enough to conclude that r1 is still a valid pointer into the same map. (The devil is in the details, of course; let's see for sure when I have code.)
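
The "loaded from map, then an offset was added" rule can be illustrated with a small forward scan that tags registers. This is a rough sketch only: it walks the instructions linearly and ignores control flow and range checks, which the discussion leaves to the verifier. The opcode bytes are taken from the objdump output earlier in the thread; `reloc_table` (a per-instruction jump table id, -1 when the instruction has no jump table relocation) and `gotox_table` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct insn {
	uint8_t code;
	uint8_t regs; /* dst in the low nibble, src in the high nibble */
	int16_t off;
	int32_t imm;
};

#define OP_LD_IMM64 0x18 /* rX = imm ll (occupies two slots) */
#define OP_ADD64_X  0x0f /* rX += rY */
#define OP_LDX_DW   0x79 /* rX = *(u64 *)(rY + off) */
#define OP_GOTOX    0x0d /* gotox rX */

enum reg_kind { REG_OTHER, REG_JT_PTR, REG_JT_TARGET };

struct reg_state {
	enum reg_kind kind;
	int table; /* hypothetical jump table id, -1 when unknown */
};

/* Returns the table id that feeds the gotox at gotox_idx, or -1. */
static int gotox_table(const struct insn *p, const int *reloc_table,
		       int n, int gotox_idx)
{
	const struct reg_state other = { REG_OTHER, -1 };
	struct reg_state r[16];
	int i;

	assert(gotox_idx >= 0 && gotox_idx < n);
	for (i = 0; i < 16; i++)
		r[i] = other;

	for (i = 0; i < n; i++) {
		int d = p[i].regs & 0xf, s = p[i].regs >> 4;

		switch (p[i].code) {
		case OP_LD_IMM64:
			r[d] = other;
			if (reloc_table[i] >= 0) {
				r[d].kind = REG_JT_PTR;
				r[d].table = reloc_table[i];
			}
			i++; /* skip the second slot of ld_imm64 */
			break;
		case OP_ADD64_X:
			/* table pointer plus offset still points into the table */
			if (r[d].kind != REG_JT_PTR)
				r[d] = other;
			break;
		case OP_LDX_DW:
			if (r[s].kind == REG_JT_PTR) {
				r[d].table = r[s].table;
				r[d].kind = REG_JT_TARGET;
			} else {
				r[d] = other;
			}
			break;
		case OP_GOTOX:
			if (i == gotox_idx)
				return r[d].kind == REG_JT_TARGET ? r[d].table : -1;
			break;
		default:
			r[d] = other; /* conservatively forget the destination */
			break;
		}
	}
	return -1;
}
```

The design choice here is to lose tracking whenever an unknown instruction touches the register, so a wrong guess results in "no table found" rather than a wrong table.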

@yonghong-song
Contributor Author

> > Backtracking certainly works. But maybe there is an alternative that avoids backtracking. For example,
> >
> >         r1 = .LJTI6_0 ll
> >             <== we will know this is from a map (representing a jump table)
> >             <== so mark r1 with some info like (jump_table_addr, offset 0)
> >         r1 += r6
> >             <== r1 still like jump_table_addr, need to ensure [r6, r6 + 8] is
> >                     within jump table range
> >         r1 = *(u64 *)(r1 + 0)
> >             <== r1 will be jump_table target, but still keep jump_table reference in r1.
> >         gotox r1
> >             <== goto jump_table_target (still has jump_table reference,
> >                     from verifier perspective, all jump table targets
> >                     in jump table should be verified).
>
> Right, this is exactly what I meant by "backtrack". Looks like for switch this will look similar in most cases. The "need to ensure [r6, r6 + 8] is within jump table range" check might be "best effort" on the libbpf side; only the verifier needs to check this for sure. I think that from the libbpf side "r1 was loaded from map, then an offset was added" is enough to conclude that r1 is still a valid pointer into the same map. (The devil is in the details, of course; let's see for sure when I have code.)

Yes, libbpf does not need to do verifier work. The range analysis should be done in verifier.

@aspsk

aspsk commented Apr 14, 2025

Hi @yonghong-song!

I was trying different switch variants; simple ones work like magic, so we're definitely going in the right direction.

One simple case fails for me though. Namely, in the example below LLVM generates an unreachable instruction. Could you take a look please? An example source program is

SEC("syscall") int foo(struct simple_ctx *ctx)
{
        switch (ctx->x) {
        case 0:
                ret_user = 2;
                break;
        case 11:
                ret_user = 3;
                break;
        case 27:
                ret_user = 4;
                break;
        case 31:
                ret_user = 5;
                break;
        default:
                ret_user = 19;
                break;
        }

        return 0;
}

Then the object file looks like

0000000000000700 <foo>:
;       switch (ctx->x) {
     224:       79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0)
     225:       25 01 0f 00 1f 00 00 00 if r1 > 0x1f goto +0xf <foo+0x88>
     226:       67 01 00 00 03 00 00 00 r1 <<= 0x3
     227:       18 02 00 00 a8 00 00 00 00 00 00 00 00 00 00 00 r2 = 0xa8 ll
                0000000000000718:  R_BPF_64_64  .rodata
     229:       0f 12 00 00 00 00 00 00 r2 += r1
     230:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0x0)
     231:       0d 01 00 00 00 00 00 00 gotox r1
     232:       05 00 08 00 00 00 00 00 goto +0x8 <foo+0x88>
     233:       b7 01 00 00 02 00 00 00 r1 = 0x2
;       switch (ctx->x) {
     234:       05 00 07 00 00 00 00 00 goto +0x7 <foo+0x90>
     235:       b7 01 00 00 04 00 00 00 r1 = 0x4
;               break;
     236:       05 00 05 00 00 00 00 00 goto +0x5 <foo+0x90>
     237:       b7 01 00 00 03 00 00 00 r1 = 0x3
;               break;
     238:       05 00 03 00 00 00 00 00 goto +0x3 <foo+0x90>
     239:       b7 01 00 00 05 00 00 00 r1 = 0x5
;               break;
     240:       05 00 01 00 00 00 00 00 goto +0x1 <foo+0x90>
     241:       b7 01 00 00 13 00 00 00 r1 = 0x13
     242:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
                0000000000000790:  R_BPF_64_64  ret_user
     244:       7b 12 00 00 00 00 00 00 *(u64 *)(r2 + 0x0) = r1
;       return 0;
     245:       b4 00 00 00 00 00 00 00 w0 = 0x0
     246:       95 00 00 00 00 00 00 00 exit

Now, the jump table is

242, 241, 241, 241, 241, 241, 241, 241,
241, 241, 241, 237, 241, 241, 241, 241,
241, 241, 241, 241, 241, 241, 241, 241,
241, 241, 241, 235, 241, 241, 241, 239

And the check

225:       25 01 0f 00 1f 00 00 00 if r1 > 0x1f goto +0xf <foo+0x88>

makes sure that r1 is always loaded from the jump table.

And this makes the instruction

232:       05 00 08 00 00 00 00 00 goto +0x8 <foo+0x88>

unreachable.
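
To see why nothing can fall through past the gotox, the switch can be modeled in plain C as a bounds check followed by a dense table lookup. This is only a model of the control flow, not of the generated code: the table stores the values assigned to ret_user in the source above instead of instruction addresses, and `build_table` and `dispatch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_LEN 32 /* indices 0..0x1f, matching the "if r1 > 0x1f" guard */

/* Dense table: every slot holds the default value except the four
 * slots that correspond to explicit case labels. */
static void build_table(int table[TABLE_LEN])
{
	for (size_t i = 0; i < TABLE_LEN; i++)
		table[i] = 19; /* default */
	table[0] = 2;
	table[11] = 3;
	table[27] = 4;
	table[31] = 5;
}

static int dispatch(unsigned long x)
{
	int table[TABLE_LEN];

	build_table(table);
	if (x > 0x1f)
		return 19; /* the guard branches straight to the default block */
	assert(x < TABLE_LEN);
	return table[x]; /* stands in for "gotox r1": always transfers control */
}
```

Every in-range index has a table slot and every out-of-range index is caught by the guard, so no execution path reaches an instruction placed directly after the indirect jump.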

@4ast
Member

4ast commented Apr 14, 2025

> And this makes the instruction
> 232: 05 00 08 00 00 00 00 00 goto +0x8 <foo+0x88>
> unreachable.

I suspect it won't be easy to avoid this on llvm side. Probably better to teach verifier to ignore those.

@aspsk

aspsk commented Apr 15, 2025

> > And this makes the instruction
> > 232: 05 00 08 00 00 00 00 00 goto +0x8 <foo+0x88>
> > unreachable.
>
> I suspect it won't be easy to avoid this on llvm side. Probably better to teach verifier to ignore those.

Ok, thanks, will do this for now

@aspsk

aspsk commented May 4, 2025

Update. I have a patch for kernel + libbpf which uses this LLVM and which passes all my new selftests + all (but one) standard bpf selftests which are compiled to use gotox (exceptions, cgroup_tcp_skb, cls_redirect, bpf_tcp_ca, bpf_iter_setsockopt, tc_change_tail, net_timestamping, tcpbpf_user, user_ringbuf, tcp_custom_syncookie, tcp_hdr_options).

So far only one selftest fails (tunnel). Once I fix it (and clean up the libbpf part of the series a bit), I will prepare and send an RFC.


github-actions bot commented May 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@yonghong-song
Contributor Author

Thanks for the update. When trying your above example

SEC("syscall") int foo(struct simple_ctx *ctx)
...

I found a problem and just added another commit to fix it. The issue is caused by the llvm machine-sink pass. The fix is implemented similarly to X86 (X86InstrInfo::getJumpTableIndex()). See the top commit (commit 4) for more details.

@aspsk

aspsk commented May 8, 2025

> I found a problem and just added another commit to fix the problem

Thanks @yonghong-song! I will test your latest changes over this weekend.

(The tunnel test I've fixed already, so all selftests pass now. Now need to replace hacks with normal code, and will send the patch.)

Member

@inclyc inclyc left a comment

Do we need to modify the ASMParser also?

static bool isValidIdAtStart(StringRef Name) {
return StringSwitch<bool>(Name.lower())
.Case("if", true)
.Case("call", true)
.Case("callx", true)
.Case("goto", true)

@yonghong-song
Contributor Author

> Do we need to modify the ASMParser also?
>
> static bool isValidIdAtStart(StringRef Name) {
> return StringSwitch<bool>(Name.lower())
> .Case("if", true)
> .Case("call", true)
> .Case("callx", true)
> .Case("goto", true)

Right, need to add gotox as well. Will fix. Thanks!

@aspsk

aspsk commented Jun 15, 2025

Here's the kernel side which works with this LLVM: https://lore.kernel.org/bpf/[email protected]/

The following selftests contain indirect jumps (and pass):

* cgroup_tcp_skb, cls_redirect, bpf_tcp_ca,
* bpf_iter_setsockopt, tc_change_tail, net_timestamping,
* user_ringbuf, tcp_hdr_options, tunnel, exceptions,
* tcpbpf_user, tcp_custom_syncookie

A new selftest bpf_goto_x specifically tries to test different scenarios.

@yonghong-song
Contributor Author

Thanks @aspsk I will also take a look at the kernel patch. Also, the current patch has some conflicts with latest 'main' branch. I will rebase and repost the new llvm patch after doing some testing.

@yonghong-song
Contributor Author

Rebased on top of the current main branch. No functional change compared to the previous version (from more than a month ago).

yonghong-song pushed a commit to yonghong-song/llvm-project that referenced this pull request Jul 7, 2025
Currently llvm has an option EmitJumpTableSizesSection which enables
unique jmptable size sections.

This patch adds an option EmitUniqueJumpTableSection which enables
unique jmptable sections.

This patch turns EmitUniqueJumpTableSection on by default for BPF
programs. Without this, the jmptable will be in '.rodata' sections,
which may include a lot of other data, e.g. const strings.

With EmitUniqueJumpTableSection, llvm will generate a unique jump table
section per function based on llvm internal conventions, and it will
support ELF, XCOFF and COFF.

The following is an example with bpf selftest user_ringbuf_success.bpf.c
(also in description in llvm#133856):

  $ llvm-readelf -S user_ringbuf_success.bpf.o
  ...
  [ 6] .rodata.read_protocol_msg PROGBITS 0000000000000000 000740 000020 00   A  0   0  8
  [ 7] .rel.rodata.read_protocol_msg REL 0000000000000000 0038e8 000040 10   I 42   6  8
  [ 8] .llvm_jump_table_sizes LLVM_JT_SIZES 0000000000000000 000760 000010 00      0   0  1
  [ 9] .rel.llvm_jump_table_sizes REL    0000000000000000 003928 000010 10   I 42   8  8
  ...
  [14] .rodata.publish_next_kern_msg PROGBITS 0000000000000000 0008a0 000020 00   A  0   0  8
  [15] .rel.rodata.publish_next_kern_msg REL 0000000000000000 0039b8 000040 10   I 42  14  8
  [16] .llvm_jump_table_sizes LLVM_JT_SIZES 0000000000000000 0008c0 000010 00      0   0  1
  [17] .rel.llvm_jump_table_sizes REL    0000000000000000 0039f8 000010 10   I 42  16  8
  ...

  $ llvm-readelf -r user_ringbuf_success.bpf.o
  ...
  Relocation section '.rel.rodata.read_protocol_msg' at offset 0x38e8 contains 4 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000008  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000010  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000018  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text

  Relocation section '.rel.llvm_jump_table_sizes' at offset 0x3928 contains 1 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000000a00000002 R_BPF_64_ABS64         0000000000000000 .rodata.read_protocol_msg
  ...
  Relocation section '.rel.rodata.publish_next_kern_msg' at offset 0x39b8 contains 4 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000008  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000010  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000018  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text

  Relocation section '.rel.llvm_jump_table_sizes' at offset 0x39f8 contains 1 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000001200000002 R_BPF_64_ABS64         0000000000000000 .rodata.publish_next_kern_msg
  ...

  $ llvm-readelf -x '.rodata.read_protocol_msg' user_ringbuf_success.bpf.o
  Hex dump of section '.rodata.read_protocol_msg':
  0x00000000 a8000000 00000000 10010000 00000000 ................
  0x00000010 b8000000 00000000 c8000000 00000000 ................
  $ llvm-readelf -x '.rodata.publish_next_kern_msg' user_ringbuf_success.bpf.o
  Hex dump of section '.rodata.publish_next_kern_msg':
  0x00000000 28040000 00000000 00050000 00000000 (...............
  0x00000010 70040000 00000000 b8040000 00000000 p...............

  $ llvm-objdump -Sr user_ringbuf_success.bpf.o
  ...
  0000000000000000 <read_protocol_msg>:
  ...
  ;       switch (msg->msg_op) {
      13:       61 03 00 00 00 00 00 00 w3 = *(u32 *)(r0 + 0x0)
      14:       26 03 1c 00 03 00 00 00 if w3 > 0x3 goto +0x1c <read_protocol_msg+0x158>
      15:       67 03 00 00 03 00 00 00 r3 <<= 0x3
      16:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
                0000000000000080:  R_BPF_64_64  .rodata.read_protocol_msg
      18:       0f 31 00 00 00 00 00 00 r1 += r3
      19:       79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0)
      20:       0d 01 00 00 00 00 00 00 gotox r1
  ...
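
To make the dumps above easier to read, here is a minimal C sketch that decodes the '.rodata.read_protocol_msg' bytes into instruction indices. It assumes, based only on the hex dump and the R_BPF_64_ABS64 relocations against .text, that each entry is a little-endian 64-bit byte offset into .text. The helper names (jt_entry_le64, text_off_to_insn) are hypothetical and not part of LLVM or libbpf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the idx-th little-endian 64-bit jump table entry from raw section bytes. */
uint64_t jt_entry_le64(const uint8_t *sec, size_t idx)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | sec[idx * 8 + (size_t)i];
    return v;
}

/* BPF instruction slots are 8 bytes wide, so a byte offset into .text
 * divided by 8 is the slot index that llvm-objdump prints. */
uint64_t text_off_to_insn(uint64_t byte_off)
{
    assert(byte_off % 8 == 0);
    return byte_off >> 3;
}

/* The two rows of the '.rodata.read_protocol_msg' hex dump above. */
const uint8_t read_protocol_msg_jt[32] = {
    0xa8, 0x00, 0, 0, 0, 0, 0, 0,  0x10, 0x01, 0, 0, 0, 0, 0, 0,
    0xb8, 0x00, 0, 0, 0, 0, 0, 0,  0xc8, 0x00, 0, 0, 0, 0, 0, 0,
};
```

Under these assumptions the four entries are byte offsets 0xa8, 0x110, 0xb8 and 0xc8, i.e. instruction slots 21, 34, 23 and 25 of read_protocol_msg. Slot 21 directly follows the gotox at slot 20.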

What if a single function has two switch statements? The following is an example:

  $ cat test.c
  struct simple_ctx {
        int x;
        int y;
        int z;
  };
  int ret_user, ret_user2;
  void bar(void);
  int foo(struct simple_ctx *ctx, struct simple_ctx *ctx2)
  {
        switch (ctx->x) {
        case 1:
                ret_user = 8;
                break;
        case 6:
                ret_user = 3;
                break;
        case 2:
                ret_user = 4;
                break;
        case 31:
                ret_user = 5;
                break;
        default:
                ret_user = 19;
                break;
        }

        bar();
        switch (ctx2->x) {
        case 0:
                ret_user2 = 8;
                break;
        case 7:
                ret_user2 = 3;
                break;
        case 9:
                ret_user2 = 4;
                break;
        case 31:
                ret_user2 = 5;
                break;
        default:
                ret_user2 = 29;
                break;
        }

        return 0;
  }
  $ clang --target=bpf -O2 -c test.c
  $ llvm-readelf -S test.o
  ...
  [ 4] .rodata.foo       PROGBITS        0000000000000000 0001b8 0001f8 00   A  0   0  8
  [ 5] .rel.rodata.foo   REL             0000000000000000 0004e0 0003f0 10   I 10   4  8
  [ 6] .llvm_jump_table_sizes LLVM_JT_SIZES 0000000000000000 0003b0 000020 00      0   0  1
  [ 7] .rel.llvm_jump_table_sizes REL    0000000000000000 0008d0 000020 10   I 10   6  8
  ...

Note that the same '.llvm_jump_table_sizes' section has information for both
switch tables, since they are in the same function.

  $ llvm-readelf -x '.llvm_jump_table_sizes' test.o
  Hex dump of section '.llvm_jump_table_sizes':
  0x00000000 00000000 00000000 1f000000 00000000 ................
  0x00000010 f8000000 00000000 20000000 00000000 ........ .......

From the above, the two switch tables have 0x1f + 0x20 = 0x3f entries in total:

  $ llvm-readelf -x '.rodata.foo' test.o
  Hex dump of section '.rodata.foo':
  0x00000000 58000000 00000000 78000000 00000000 X.......x.......
  0x00000010 98000000 00000000 98000000 00000000 ................
  0x00000020 98000000 00000000 68000000 00000000 ........h.......
  0x00000030 98000000 00000000 98000000 00000000 ................
  0x00000040 98000000 00000000 98000000 00000000 ................
  0x00000050 98000000 00000000 98000000 00000000 ................
  0x00000060 98000000 00000000 98000000 00000000 ................
  0x00000070 98000000 00000000 98000000 00000000 ................
  0x00000080 98000000 00000000 98000000 00000000 ................
  0x00000090 98000000 00000000 98000000 00000000 ................
  0x000000a0 98000000 00000000 98000000 00000000 ................
  0x000000b0 98000000 00000000 98000000 00000000 ................
  0x000000c0 98000000 00000000 98000000 00000000 ................
  0x000000d0 98000000 00000000 98000000 00000000 ................
  0x000000e0 98000000 00000000 98000000 00000000 ................
  0x000000f0 88000000 00000000 08010000 00000000 ................
  0x00000100 48010000 00000000 48010000 00000000 H.......H.......
  0x00000110 48010000 00000000 48010000 00000000 H.......H.......
  0x00000120 48010000 00000000 48010000 00000000 H.......H.......
  0x00000130 28010000 00000000 48010000 00000000 (.......H.......
  0x00000140 18010000 00000000 48010000 00000000 ........H.......
  0x00000150 48010000 00000000 48010000 00000000 H.......H.......
  0x00000160 48010000 00000000 48010000 00000000 H.......H.......
  0x00000170 48010000 00000000 48010000 00000000 H.......H.......
  0x00000180 48010000 00000000 48010000 00000000 H.......H.......
  0x00000190 48010000 00000000 48010000 00000000 H.......H.......
  0x000001a0 48010000 00000000 48010000 00000000 H.......H.......
  0x000001b0 48010000 00000000 48010000 00000000 H.......H.......
  0x000001c0 48010000 00000000 48010000 00000000 H.......H.......
  0x000001d0 48010000 00000000 48010000 00000000 H.......H.......
  0x000001e0 48010000 00000000 48010000 00000000 H.......H.......
  0x000001f0 38010000 00000000
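
The layout of the dump above can be cross-checked with a small C sketch. It assumes, from the dump alone, that each table is dense: one 8-byte slot per value between the smallest and largest case label, with unlisted values pointing at the default block. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of slots in a dense jump table covering case values min..max. */
size_t jt_num_slots(long min_case, long max_case)
{
    assert(max_case >= min_case);
    return (size_t)(max_case - min_case) + 1;
}

/* Byte offset, inside the section, of the slot for `value` in a table that
 * starts at `table_off` and uses 8-byte slots. */
size_t jt_slot_off(size_t table_off, long min_case, long value)
{
    assert(value >= min_case);
    return table_off + (size_t)(value - min_case) * 8;
}
```

For the first switch (cases 1, 2, 6, 31) this gives 0x1f slots, with case 31 at offset 0xf0. For the second switch (cases 0, 7, 9, 31) it gives 0x20 slots starting at 0xf8, with case 7 at offset 0x130 and case 31 at offset 0x1f0. Both match the dump.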

Related relocations:
  $ llvm-readelf -r test.o
  ...
  Relocation section '.rel.rodata.foo' at offset 0x4e0 contains 63 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  0000000000000008  0000000200000002 R_BPF_64_ABS64         0000000000000000 .text
  ...

  Relocation section '.rel.llvm_jump_table_sizes' at offset 0x8d0 contains 2 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000000  0000000300000002 R_BPF_64_ABS64         0000000000000000 .rodata.foo
  0000000000000010  0000000300000002 R_BPF_64_ABS64         0000000000000000 .rodata.foo
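
A consumer could read '.llvm_jump_table_sizes' roughly as sketched below. This assumes that each record is a pair of little-endian 64-bit words (table address, entry count), as the dump above suggests, and that the REL relocation keeps its addend in place, so the first word already holds the offset into .rodata.foo. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One record as it appears in the dump: two little-endian 64-bit words. */
struct jt_size_rec {
    uint64_t table_off;   /* in-place addend: offset into .rodata.foo */
    uint64_t num_entries; /* number of 8-byte slots in that table */
};

uint64_t rd_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Parse up to max_out records; returns the number of records parsed. */
size_t parse_jt_sizes(const uint8_t *sec, size_t sec_len,
                      struct jt_size_rec *out, size_t max_out)
{
    size_t n = 0;
    assert(sec_len % 16 == 0);
    for (size_t off = 0; off < sec_len && n < max_out; off += 16, n++) {
        out[n].table_off = rd_le64(sec + off);
        out[n].num_entries = rd_le64(sec + off + 8);
    }
    return n;
}

/* Bytes of the '.llvm_jump_table_sizes' dump for test.o shown above. */
const uint8_t test_o_jt_sizes[32] = {
    0x00, 0, 0, 0, 0, 0, 0, 0,  0x1f, 0, 0, 0, 0, 0, 0, 0,
    0xf8, 0, 0, 0, 0, 0, 0, 0,  0x20, 0, 0, 0, 0, 0, 0, 0,
};
```

Parsing the dump this way yields two records, (0x0, 0x1f) and (0xf8, 0x20). The second table starts exactly where the first one ends (0x1f * 8 = 0xf8), and the two together cover the 0x1f8 bytes of .rodata.foo.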
@yonghong-song yonghong-song force-pushed the br-jt branch 6 times, most recently from be893e9 to c5b53c2 Compare July 10, 2025 16:41
@yonghong-song
Contributor Author

Just uploaded a version on top of the latest llvm-project, merged with additional changes from @eddyz87. Note that there is one test failure:

LLVM :: Transforms/InferAddressSpaces/SPIRV/generic-cast-explicit.ll

which also fails with the latest llvm-project.

I will update later once this test failure is gone from upstream.

Yonghong Song and others added 2 commits July 11, 2025 08:14
NOTE: We probably need cpu v5 or other flags to enable this feature.
We can add it later when necessary.

- Generate all jump tables in a single section named .jumptables.
- Represent each jump table as a symbol:
  - value points to an offset within .jumptables;
  - size encodes jump table size in bytes.
- Indirect jump is a gotox instruction:
  - dst register is an index within the table;
  - accompanied by a R_BPF_64_64 relocation pointing to a jump table
    symbol.

clang -S:

  .LJTI0_0:
          .reloc 0, FK_SecRel_8, .BPF.JT.0.0
          gotox r1
          goto LBB0_2
  LBB0_4:
          ...

          .section        .jumptables,"",@progbits
  .L0_0_set_4 = ((LBB0_4-.LBPF.JX.0.0)>>3)-1
  .L0_0_set_2 = ((LBB0_2-.LBPF.JX.0.0)>>3)-1
  ...
  .BPF.JT.0.0:
          .long   .L0_0_set_4
          .long   .L0_0_set_2
          ...

llvm-readelf -r --sections --symbols:

  Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    ...
    [ 4] .jumptables       PROGBITS        0000000000000000 000118 000100 00      0   0  1
    ...

  Relocation section '.rel.text' at offset 0x2a8 contains 2 entries:
      Offset             Info             Type               Symbol's Value  Symbol's Name
  0000000000000010  0000000300000001 R_BPF_64_64            0000000000000000 .BPF.JT.0.0
  ...

  Symbol table '.symtab' contains 6 entries:
     Num:    Value          Size Type    Bind   Vis       Ndx Name
       ...
       2: 0000000000000000   112 FUNC    GLOBAL DEFAULT     2 foo
       3: 0000000000000000   128 NOTYPE  GLOBAL DEFAULT     4 .BPF.JT.0.0
       ...

llvm-objdump -Sdr:

  0000000000000000 <foo>:
       ...
       2:	gotox r1
		0000000000000010:  R_BPF_64_64	.BPF.JT.0.0

An option -bpf-min-jump-table-entries is implemented to control the minimum
number of entries required to use a jump table on BPF. The default value is 4,
but it can be changed with the following clang option:
  clang ... -mllvm -bpf-min-jump-table-entries=6
where the number of jump table cases needs to be >= 6 in order to
use a jump table.

Update BPFInstrInfo::analyzeBranch() to comply with the
TargetInstrInfo::analyzeBranch() requirements for the JX instruction:
if a branch instruction can't be categorized as a conditional with
true/false branches, return true.

Because of this bug, the MachineBlockPlacement transformation inserted an
additional unreachable jump after JX, e.g.:

  bb.1.entry:
    ...
    JX killed $r1, %jump-table.0
    JMP %bb.2

Additionally, the isNotDuplicable annotation is necessary to prevent
machine-level transformations from creating several copies of a JX instruction.
Such copies would refer to the same jump table and would make it
impossible to calculate jump offsets inside the table.
Files triggering such duplication are present in the kernel selftests.
eddyz87 added 3 commits July 11, 2025 08:32
- one testing a general structure of the generated code;
- another testing that several jump tables within the same function
  are generated independently.

Coincidentally this fixes two test failures:
- LLVM :: CodeGen/BPF/CORE/offset-reloc-fieldinfo-2-bpfeb.ll
- LLVM :: CodeGen/BPF/CORE/offset-reloc-fieldinfo-2.ll

These tests invoke llc with -mcpu=v1 and have a switch statement in the
IR. Both tests failed with an assertion in
SelectionDAGLegalize::LegalizeOp():

  for (const SDValue &Op : Node->op_values())
    assert((TLI.getTypeAction(*DAG.getContext(), Op.getValueType()) ==
              TargetLowering::TypeLegal ||
            Op.getOpcode() == ISD::TargetConstant ||
            Op.getOpcode() == ISD::Register) &&
            "Unexpected illegal type!");

At the moment of the failure:

  Op.getOpcode() == BPFISD::BPF_BR_JT

The error happened because one of the BPFBrJt parameters has i32 type:

  def SDT_BPFBrJt         : SDTypeProfile<0, 2, [SDTCisVT<0, i32>,    // jump table
                                                 SDTCisVT<1, i64>]>;  // index
  def BPFBrJt         : SDNode<"BPFISD::BPF_BR_JT", SDT_BPFBrJt,
                               [SDNPHasChain]>;

The requirement to emit jump table entries as offsets measured in
instructions, e.g. as follows:

  .L0_0_set_7 = ((LBB0_7-.LBPF.JX.0.0)>>3)-1

makes it impossible to use the generic AsmPrinter::emitJumpTableInfo()
function. The merge request used this generic function before,
and it generated incorrect offsets.
This generic function required two overrides:
- AsmPrinter::GetJTISymbol()
- TargetLowering::getPICJumpTableRelocBaseExpr()

Now all jump table emission logic is located in
BPFAsmPrinter::emitJumpTableInfo(), which does not require the above
overrides. Hence, remove the overrides and move the corresponding code to
BPFAsmPrinter to keep it in one place.
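
As an illustration of the entry formula above, here is a minimal C sketch of the same arithmetic. It assumes that .LBPF.JX.0.0 labels the gotox instruction and that the trailing -1 reflects the usual BPF convention of jump offsets being relative to the following instruction. The function name is hypothetical and this is not the BPFAsmPrinter code.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors ((LBB - .LBPF.JX) >> 3) - 1: the distance from the gotox to the
 * target, in 8-byte instruction slots, relative to the slot after the gotox.
 * Division is used instead of >> so that backward (negative) distances are
 * well defined in portable C; both offsets are multiples of 8. */
int32_t jt_entry_insn_off(uint64_t target_byte_off, uint64_t jx_byte_off)
{
    int64_t delta = (int64_t)target_byte_off - (int64_t)jx_byte_off;
    assert(delta % 8 == 0);
    return (int32_t)(delta / 8) - 1;
}
```

For example, with the gotox at byte offset 0x10, a target at 0x30 gets entry 3, the instruction right after the gotox (0x18) gets entry 0, and a backward target at 0x0 gets entry -3.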
@yonghong-song
Contributor Author

Added additional changes from @eddyz87, which include some BPF backend changes and tests.

@aspsk

aspsk commented Jul 11, 2025

FYI, all my tests from the bpf_goto_x.c selftest now pass with this branch. Will post v1 early next week.

aspsk added a commit to aspsk/bpf-next that referenced this pull request Jul 12, 2025
Make libbpf parse and pass proper offsets for the "new" gotox instructions
(generated by llvm/llvm-project#133856).

This is a quick hack, so there are leftovers from the old patch. (And the
blindness, which was presumably fixed, breaks again in the bpf_goto_x tests.)

Signed-off-by: Anton Protopopov <[email protected]>