Message ID | cover.1625970383.git.Tony.Ambardar@gmail.com |
---|---|
Series | MIPS: eBPF: refactor code, add MIPS32 JIT |
Hi Tony,

I am glad that there are more people interested in having a JIT for
MIPS32. We seem to have been working in parallel on the same thing
though.

I sent a summary on the state of the MIPS32 JIT on the linux-mips list
a couple of months ago, asking for feedback on the best way to complete
it. When I received no response, I started to work on a MIPS32 JIT
implementation myself. I'll be glad to share what I have got so we can
work together on this.

When I dug deeper into the 64-bit JIT code, I realised that a lot of
fundamental things such as 32-bit register mappings were completely
missing. Most of the 32-bit operations were unimplemented. The code is
also quite complex already, so adding full 32-bit hardware support into
the mix did not seem like a good idea. I am sure there is some common
code that can be factored out and re-used, but I do think the 64-bit
and 32-bit JITs would be better off as two different implementations.

My 32-bit implementation is now complete and I am currently testing it.
Test suite output below. What remains to be tested is tail calls.

test_bpf: Summary: 676 PASSED, 0 FAILED, [664/664 JIT'ed]

Tested with kernel 5.14 on MIPS32r2 big-endian and little-endian under
QEMU. Also tested with kernel 5.4 on MIPS 24KEc (MT7628) physical
hardware. (I have added a lot of new tests in the eBPF test suite
during the JIT development, which explains the higher count)

The implementation supports both 32-bit and 64-bit eBPF instructions,
including all atomic operations. 64-bit atomics and div/mod are
implemented as function calls to atomic64 functions, while 32-bit
variants are implemented natively by the JIT.

Register mapping
=================
My 32-bit implementation maps all 64-bit eBPF registers to native 32-bit
MIPS registers. In addition, there are four temporary 32-bit registers
available, which is precisely what is needed for doing the more complex
ALU64 operations. This means that the JIT does not use any stack scratch
space for registers.
It should be a good thing from a performance perspective.

The register mapping is as follows.

    R0:    v0,v1 (return)
    R1-R2: a0-a3 (args passed in registers)
    R3-R5: t0-t5 (args passed on stack)
    R6-R9: s0-s7 (callee-saved)
    R10:   r0,fp (frame pointer)
    AX:    gp,at (constant blinding)
    Temp:  t6-t9

To squeeze out enough MIPS registers for the eBPF mapping I had to make
a few unusual choices. First, I use the at (assembler temporary)
register, which should be fine because the JIT is the assembler. I also
use the gp (global pointer) register. It is callee-saved, so I save it
on the stack and restore it in the epilogue. The eBPF frame pointer R10
is mapped to fp, also callee-saved, and r0. The latter is always zero,
but on a 32-bit architecture it will also be used to "store" zeros, so
it should be perfectly fine for the 32-bit JIT. According to the ISA
documentation r0 is valid both as a source and a destination operand.

The complete register mapping simplifies the code since we get rid of
all the swapping to/from the stack scratch space.

I have been focusing on the code the last couple of weeks so I didn't
see your email until now. I am sure that this comes as much of a
surprise to you as it did to me. Anyway, can send a patch with my JIT
implementation tomorrow.

Cheers,
Johan

On Mon, Jul 12, 2021 at 2:35 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
>
> Greetings!
>
> This patch series adds an eBPF JIT for MIPS32. The approach taken first
> updates existing code to support MIPS64/MIPS32 systems, then refactors
> source into a common core and dependent MIPS64 JIT, and finally adds a
> MIPS32 eBPF JIT implementation using the common framework.
>
> Compared to writing a standalone MIPS32 JIT, this approach has benefits
> for long-term maintainability, but has taken much longer than expected.
> This RFC posting is intended to share progress, gather feedback, and
> raise some questions with BPF and MIPS experts (which I'll cover later).
>
>
> Code Overview
> =============
>
> The initial code updates and refactoring exposed a number of problems in
> the existing MIPS64 JIT, which the first several patches fix. Patch #11
> updates common code to support MIPS64/MIPS32 operation. Patch #12
> separates the common core from the MIPS64 JIT code. Patch #13 adds a
> needed MIPS32 uasm opcode, while patch #14 adds the MIPS32 eBPF JIT.
>
> On MIPS32, 64-bit BPF registers are mapped to 32-bit register pairs, and
> all 64-bit operations are built on 32-bit subregister ops. The MIPS32
> tailcall counter is stored on the stack however. Notable changes from the
> MIPS64 JIT include:
>
>   * BPF_JMP32: implement all conditionals
>   * BPF_JMP | JSET | BPF_K: drop bbit insns only usable on MIPS64 Octeon
>
> Since MIPS32 does not include 64-bit div/mod or atomic opcodes, these BPF
> insns are implemented by directly calling the built-in kernel functions:
> (with thanks to Luke Nelson for posting similar code online)
>
>   * BPF_STX | BPF_DW | BPF_XADD
>   * BPF_ALU64 | BPF_DIV | BPF_X
>   * BPF_ALU64 | BPF_DIV | BPF_K
>   * BPF_ALU64 | BPF_MOD | BPF_X
>   * BPF_ALU64 | BPF_MOD | BPF_K
>
>
> Testing
> =======
>
> Testing used LTS kernel 5.10.x and stable 5.13.x running under QEMU.
> The test suite included the 'test_bpf' module and 'test_verifier' from
> kselftests. Using 'test_progs' from kselftests is too difficult in general
> since cross-compilation depends on libbpf/bpftool, which does not support
> cross-endian builds.
>
> The matrix of test configurations executed for this series covered the
> expected register sizes, MIPS ISA releases, and JIT settings:
>
>   WORDSIZE={64-bit,32-bit} x ISA={R2,R6} x JIT={off,on,hardened}
>
> On MIPS32BE and MIPS32LE there was general parity between the results of
> interpreter vs. JIT-backed tests with respect to the numbers of PASSED,
> SKIPPED, and FAILED tests. The same was also true of MIPS64 retesting.
>
> For example, the results below on MIPS32 are typical.
> Note that skipped tests 854 and 855 are "scale" tests which result in
> OOM on the QEMU malta MIPS32 test systems.
>
>   root@OpenWrt:~# sysctl net.core.bpf_jit_enable=1
>   root@OpenWrt:~# modprobe test_bpf
>   ...
>   test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed]
>   root@OpenWrt:~# ./test_verifier 0 853
>   ...
>   Summary: 1127 PASSED, 0 SKIPPED, 89 FAILED
>   root@OpenWrt:~# ./test_verifier 855 1149
>   ...
>   Summary: 408 PASSED, 7 SKIPPED, 53 FAILED
>
>
> Open Questions
> ==============
>
> 1. As seen in the patch series, the static analysis used by the MIPS64 JIT
> tends to be fragile in the face of verifier, insn and patching changes.
> After tracking down and fixing several related bugs, I wonder if it were
> better to remove the static analysis and leave things more robust and
> maintainable going forward.
>
> Paul, Thomas, David, what are your views? Do you have thoughts on how best
> to do this?
>
> Would it be possible to replace the static analysis by accessing verifier
> analysis results from a JIT? Daniel, Alexei, or Andrii?
>
>
> 2. The series tries to correctly handle tailcall counter across bpf2bpf
> and tailcalls, and it would be nice to properly support mixing these,
> but this is still a WIP for me. Much of what I've read seems very specific
> to the x86_64 JIT. Is there a good summary of the required changes for a
> JIT in general?
>
> Note: I built a MIPS32LE 'test_progs' after some horrible, ugly hacking,
> and the 'tailcall' tests pass but the 'tailcall_bpf2bpf' tests fail
> cryptically. I can send a log and strace if someone helpful could kindly
> take a look. Is there an alternative, good standalone test available?
>
>
>
> Possible Next Steps
> ===================
>
> 1. Implementing the new BPF_ATOMIC insns *should* be straightforward
> on MIPS32. I'm less certain of MIPS64 given the static analysis and
> related zext/sext logic.
>
> 2. The BPF_JMP32 class is another big gap on MIPS64. Has anyone looked at
> this before?
> It also ties to the static analysis, but on first glance
> appears feasible.
>
>
>
> Thanks in advance for any feedback or suggestions!
>
>
> Tony Ambardar (14):
>   MIPS: eBPF: support BPF_TAIL_CALL in JIT static analysis
>   MIPS: eBPF: mask 32-bit index for tail calls
>   MIPS: eBPF: fix BPF_ALU|ARSH handling in JIT static analysis
>   MIPS: eBPF: support BPF_JMP32 in JIT static analysis
>   MIPS: eBPF: fix system hang with verifier dead-code patching
>   MIPS: eBPF: fix JIT static analysis hang with bounded loops
>   MIPS: eBPF: fix MOD64 insn on R6 ISA
>   MIPS: eBPF: support long jump for BPF_JMP|EXIT
>   MIPS: eBPF: drop src_reg restriction in BPF_LD|BPF_DW|BPF_IMM
>   MIPS: eBPF: improve and clarify enum 'which_ebpf_reg'
>   MIPS: eBPF: add core support for 32/64-bit systems
>   MIPS: eBPF: refactor common MIPS64/MIPS32 functions and headers
>   MIPS: uasm: Enable muhu opcode for MIPS R6
>   MIPS: eBPF: add MIPS32 JIT
>
>  Documentation/admin-guide/sysctl/net.rst |    6 +-
>  Documentation/networking/filter.rst      |    6 +-
>  arch/mips/Kconfig                        |    4 +-
>  arch/mips/include/asm/uasm.h             |    1 +
>  arch/mips/mm/uasm-mips.c                 |    4 +-
>  arch/mips/mm/uasm.c                      |    3 +-
>  arch/mips/net/Makefile                   |    8 +-
>  arch/mips/net/ebpf_jit.c                 | 1935 ----------------------
>  arch/mips/net/ebpf_jit.h                 |  295 ++++
>  arch/mips/net/ebpf_jit_comp32.c          | 1241 ++++++++++++++
>  arch/mips/net/ebpf_jit_comp64.c          |  987 +++++++++++
>  arch/mips/net/ebpf_jit_core.c            | 1118 +++++++++++++
>  12 files changed, 3663 insertions(+), 1945 deletions(-)
>  delete mode 100644 arch/mips/net/ebpf_jit.c
>  create mode 100644 arch/mips/net/ebpf_jit.h
>  create mode 100644 arch/mips/net/ebpf_jit_comp32.c
>  create mode 100644 arch/mips/net/ebpf_jit_comp64.c
>  create mode 100644 arch/mips/net/ebpf_jit_core.c
>
> --
> 2.25.1
>
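
Both messages above describe holding each 64-bit eBPF register in a pair of 32-bit MIPS registers and building 64-bit operations from 32-bit ones. The following is a minimal sketch of that idea in plain C, not code from either JIT: `struct reg_pair`, `add64`, `u64_to_pair`, `pair_to_u64` and `self_check` are hypothetical names chosen for illustration. It models a 64-bit add as a low-word add, a carry derived from an unsigned compare (the kind of result a MIPS `sltu` produces), and a high-word add.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit eBPF register kept in two 32-bit
 * native registers. Not taken from either JIT implementation. */
struct reg_pair {
	uint32_t lo;	/* low 32 bits */
	uint32_t hi;	/* high 32 bits */
};

static struct reg_pair u64_to_pair(uint64_t v)
{
	struct reg_pair p = { .lo = (uint32_t)v, .hi = (uint32_t)(v >> 32) };
	return p;
}

static uint64_t pair_to_u64(struct reg_pair p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}

/* 64-bit add composed from 32-bit operations only. */
static struct reg_pair add64(struct reg_pair a, struct reg_pair b)
{
	struct reg_pair r;
	uint32_t carry;

	r.lo = a.lo + b.lo;		/* low-word add, may wrap */
	carry = r.lo < a.lo;		/* 1 if the low-word add wrapped */
	r.hi = a.hi + b.hi + carry;	/* high-word add plus carry */
	return r;
}

/* Small self-check comparing against native 64-bit arithmetic. */
static void self_check(void)
{
	uint64_t a = 0x00000001ffffffffULL, b = 0x00000000ffffffffULL;

	assert(pair_to_u64(add64(u64_to_pair(a), u64_to_pair(b))) == a + b);
}
```

For instance, `add64(u64_to_pair(0xffffffff), u64_to_pair(1))` yields the pair `{ lo = 0, hi = 1 }`. A real JIT would emit the equivalent native instructions rather than call a C helper.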
On Mon, Jul 19, 2021 at 6:25 PM Johan Almbladh <johan.almbladh@anyfinetworks.com> wrote:
>
> I have been focusing on the code the last couple of weeks so I didn't
> see your email until now. I am sure that this comes as much of a
> surprise to you as it did to me. Anyway, can send a patch with my JIT
> implementation tomorrow.

It is surprising to have not one but two mips32 JITs :)
I really hope you folks can figure out the common path forward.
It sounds to me that the register mapping choices in both
implementations are the same (which would be the most debated
part to agree on).
Not seeing Johan's patches it's hard to make any comparison.
So far I like Tony's patches. The refactoring and code sharing is great.

Tony,
what 'static analysis' by the JIT are you referring to?

re: bpf_jit_needs_zext issue between JIT and the verifier.
It's a difficult one.
opt_subreg_zext_lo32_rnd_hi32() shouldn't depend on JIT
(other than bpf_jit_needs_zext).
But you're setting that callback the same way as x86-32 JIT.
So the same bug should be seen there too.
Could you double check if it's the case?
It's either a regression (if both x86-32 and mips32 JITs
fail this test_verifier test) or endianness related
(if it's mips32 JIT only).

Thank you both for the exciting work!
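
As background for the bpf_jit_needs_zext point: eBPF 32-bit ALU operations are defined to zero the upper 32 bits of the destination register, and a JIT that reports it needs explicit zero extension relies on the verifier to insert that step instead of doing it after every 32-bit op. The sketch below is a simplified user-space model of why that matters on a register-pair layout. It is not kernel code, and `struct reg_pair`, `alu32_add_no_zext`, `zext32` and the `demo_*` helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit eBPF register split into two 32-bit
 * halves, as on a 32-bit JIT target. */
struct reg_pair {
	uint32_t lo;
	uint32_t hi;
};

static uint64_t reg_value(const struct reg_pair *r)
{
	return ((uint64_t)r->hi << 32) | r->lo;
}

/* A 32-bit add that writes only the low half, leaving the high half
 * untouched. This models a JIT that skips implicit zero extension. */
static void alu32_add_no_zext(struct reg_pair *dst, uint32_t src)
{
	assert(dst);
	dst->lo += src;
}

/* The explicit zero-extension step: clear the high half. */
static void zext32(struct reg_pair *dst)
{
	assert(dst);
	dst->hi = 0;
}

/* Without the zext step, stale high bits survive the 32-bit op. */
static uint64_t demo_without_zext(void)
{
	struct reg_pair r = { .lo = 0xffffffffu, .hi = 0xdeadbeefu };

	alu32_add_no_zext(&r, 1);	/* lo wraps to 0, hi is stale */
	return reg_value(&r);
}

/* With the zext step, the result matches eBPF ALU32 semantics. */
static uint64_t demo_with_zext(void)
{
	struct reg_pair r = { .lo = 0xffffffffu, .hi = 0xdeadbeefu };

	alu32_add_no_zext(&r, 1);
	zext32(&r);
	return reg_value(&r);
}
```

In this model `demo_without_zext()` returns `0xdeadbeef00000000` while `demo_with_zext()` returns `0`. If the explicit zero-extension step is missing or misplaced, stale high bits like these leak into later 64-bit reads. Whether that is what happens in the failing test_verifier case discussed above is not established here.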