
[RFC,00/14] bpf/tests: Extend the eBPF test suite

Message ID 20210726081738.1833704-1-johan.almbladh@anyfinetworks.com

Message

Johan Almbladh July 26, 2021, 8:17 a.m. UTC
Greetings,

During my work with the 32-bit MIPS JIT implementation I also added a
number of new test cases in the test_bpf kernel module. I found it
valuable to be able to thoroughly test the JIT at a low level with
minimal dependency on user space tooling. If you think it would be useful,
I have prepared a patch set with my additions. I have verified it on
x86_64 and i386, with/without JIT and JIT hardening. The interpreter
passes all tests. The JITs do too, with one exception, see NOTE below.
The result for the x86_64 JIT is summarized below.

    test_bpf: Summary: 577 PASSED, 0 FAILED, [565/565 JIT'ed]
    test_bpf: test_tail_calls: Summary: 6 PASSED, 1 FAILED, [7/7 JIT'ed]

I have inserted the new tests in the location where related tests are run,
rather than putting them at the end. I have also tried to use the same
description style as the surrounding tests. Below is a summary of the
new tests.

* Operations not previously covered
  JMP32, ALU32 ARSH, remaining ATOMIC operations including
  XCHG and CMPXCHG.

* ALU operations with edge cases
  32-bit JITs implement ALU64 operations with two 32-bit registers per
  operand. Even "trivial" operations like bit shifts are non-trivial to
  implement. Test different input values that may trigger different JIT
  code paths. JITs may also implement BPF_K operations differently
depending on whether the immediate fits the corresponding field width of
the native CPU instruction, so test that too.

* Word order in load/store
  The word order should follow endianness. Test that DW load/store
  operations result in the expected word order in memory.

* 32-bit eBPF argument zero extension
  On a 32-bit JIT the eBPF context argument is a 32-bit pointer. If passed
  in a CPU register, only one register in the mapped pair contains valid
  data. Verify that the value is properly zero-extended.

* Long conditional jumps
  Test to trigger the relative-to-absolute branch conversion in MIPS JITs,
  when the PC-relative offset overflows the field width of the MIPS branch
  instruction.

* Tail calls
  A new test suite to test tail calls. Also test error paths and TCC
  limit.

NOTE: There is a minor discrepancy between the interpreter and the
(x86) JITs. With MAX_TAIL_CALL_CNT = 32, the interpreter seems to allow
up to 33 tail calls, whereas the JITs stop at 32. This causes the max TCC
test to fail for the JITs, since I used the interpreter as reference.
Either we change the interpreter behavior, change the JITs, or relax the
test to allow both behaviors.

Let me know what you think.

Cheers,
Johan

Johan Almbladh (14):
  bpf/tests: add BPF_JMP32 test cases
  bpf/tests: add BPF_MOV tests for zero and sign extension
  bpf/tests: fix typos in test case descriptions
  bpf/tests: add more tests of ALU32 and ALU64 bitwise operations
  bpf/tests: add more ALU32 tests for BPF_LSH/RSH/ARSH
  bpf/tests: add more BPF_LSH/RSH/ARSH tests for ALU64
  bpf/tests: add more ALU64 BPF_MUL tests
  bpf/tests: add tests for ALU operations implemented with function
    calls
  bpf/tests: add word-order tests for load/store of double words
  bpf/tests: add branch conversion JIT test
  bpf/tests: add test for 32-bit context pointer argument passing
  bpf/tests: add tests for atomic operations
  bpf/tests: add tests for BPF_CMPXCHG
  bpf/tests: add tail call test suite

 lib/test_bpf.c | 2732 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 2475 insertions(+), 257 deletions(-)

Comments

Daniel Borkmann July 28, 2021, 8:27 a.m. UTC | #1
On 7/27/21 12:53 AM, Andrii Nakryiko wrote:
> On Mon, Jul 26, 2021 at 1:18 AM Johan Almbladh
> <johan.almbladh@anyfinetworks.com> wrote:
>>
>> [...]
>>
>> NOTE: There is a minor discrepancy between the interpreter and the
>> (x86) JITs. With MAX_TAIL_CALL_CNT = 32, the interpreter seems to allow
>> up to 33 tail calls, whereas the JITs stop at 32. This causes the max TCC
>
> Given the intended case was to allow 32, let's fix up the interpreter
> to be in line with JITs?


Yes, let's fix up the interpreter.

Could you send a fix for the latter, Johan, along with this series?

Big thanks for adding all the new tests by the way!

Johan Almbladh July 28, 2021, 12:15 p.m. UTC | #2
On Wed, Jul 28, 2021 at 10:27 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> Yes, let's fix up the interpreter.
>
> Could you send a fix for the latter, Johan, along with this series?


Thanks for the comments, Andrii and Daniel. Yes, I agree that fixing
up the interpreter makes most sense. I'll submit the patches shortly.

Johan
Yonghong Song July 28, 2021, 7:13 p.m. UTC | #3
On 7/28/21 9:47 AM, Johan Almbladh wrote:
> Before, the interpreter allowed up to MAX_TAIL_CALL_CNT + 1 tail calls.
> Now precisely MAX_TAIL_CALL_CNT is allowed, which is in line with the
> behavior of the x86 JITs.
>
> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>

LGTM.

Acked-by: Yonghong Song <yhs@fb.com>


I also checked arm/arm64 jit. I saw the following comments:

         /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
          *      goto out;
          * tail_call_cnt++;
          */

Maybe we have this MAX_TAIL_CALL_CNT + 1 issue
for arm/arm64 jit?

> ---
>   kernel/bpf/core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 9b1577498373..67682b3afc84 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1559,7 +1559,7 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
>
>   		if (unlikely(index >= array->map.max_entries))
>   			goto out;
> -		if (unlikely(tail_call_cnt > MAX_TAIL_CALL_CNT))
> +		if (unlikely(tail_call_cnt >= MAX_TAIL_CALL_CNT))
>   			goto out;
>
>   		tail_call_cnt++;
Johan Almbladh July 29, 2021, 9:37 p.m. UTC | #4
On Wed, Jul 28, 2021 at 9:13 PM Yonghong Song <yhs@fb.com> wrote:
> I also checked arm/arm64 jit. I saw the following comments:
>
>          /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
>           *      goto out;
>           * tail_call_cnt++;
>           */
>
> Maybe we have this MAX_TAIL_CALL_CNT + 1 issue
> for arm/arm64 jit?


That wouldn't be unreasonable. I don't have an arm or arm64 setup
available right now, but I can try to test it in qemu.
Andrii Nakryiko July 29, 2021, 10:29 p.m. UTC | #5
On Thu, Jul 29, 2021 at 2:38 PM Johan Almbladh
<johan.almbladh@anyfinetworks.com> wrote:
> On Wed, Jul 28, 2021 at 9:13 PM Yonghong Song <yhs@fb.com> wrote:
> > Maybe we have this MAX_TAIL_CALL_CNT + 1 issue
> > for arm/arm64 jit?
>
> That wouldn't be unreasonable. I don't have an arm or arm64 setup
> available right now, but I can try to test it in qemu.


On a brief check, there seems to be quite a mess in terms of the code
and comments.

E.g., in arch/x86/net/bpf_jit_comp32.c:

        /*
         * if (tail_call_cnt > MAX_TAIL_CALL_CNT)
         *     goto out;
         */

                            ^^^^ here comment is wrong

        [...]

        /* cmp edx,hi */
        EMIT3(0x83, add_1reg(0xF8, IA32_EBX), hi);
        EMIT2(IA32_JNE, 3);
        /* cmp ecx,lo */
        EMIT3(0x83, add_1reg(0xF8, IA32_ECX), lo);

        /* ja out */
        EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));

        ^^^ JAE is >=, right? But the comment says JA.


As for arch/x86/net/bpf_jit_comp.c, both comment and the code seem to
do > MAX_TAIL_CALL_CNT, but you are saying JIT is correct. What am I
missing?

Can you please check all the places where MAX_TAIL_CALL_CNT is used
throughout the code? Let's clean this up in one go.

Also, given it's so easy to do this off-by-one error, can you please
add a negative test validating that 33 tail calls are not allowed? I
assume we have a positive test that allows exactly MAX_TAIL_CALL_CNT,
but please double-check that as well.

I also wonder if it would make sense to convert these
internal-but-sort-of-advertised constants like MAX_TAIL_CALL_CNT and
BPF_COMPLEXITY_LIMIT_INSNS into enums so that they can be "discovered"
from BTF. This should be discussed/attempted outside of this fix,
though. Just bringing it up here.
Andrii Nakryiko July 29, 2021, 10:48 p.m. UTC | #6
On Thu, Jul 29, 2021 at 3:29 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> Also, given it's so easy to do this off-by-one error, can you please
> add a negative test validating that 33 tail calls are not allowed? I
> assume we have a positive test that allows exactly MAX_TAIL_CALL_CNT,
> but please double-check that as well.


Ok, I see that you've added this in your bpf tests patch set. Please
consider, additionally, implementing a similar test as part of
selftests/bpf (specifically in test_progs). We run test_progs
continuously in CI for every incoming patch/patchset, so it has much
higher chances of capturing any regressions.

I'm also thinking that this MAX_TAIL_CALL_CNT change should probably
go into the bpf-next tree. First, this off-by-one behavior was around
for a while and it doesn't cause serious issues, even if abused. But
on the other hand, it will make your tail call tests fail, when
applied into bpf-next without your change. So I think we should apply
both into bpf-next.

On a related topic, please don't forget to include the target kernel
tree for your patches: [PATCH bpf] or [PATCH bpf-next].


Johan Almbladh Aug. 1, 2021, 8:37 a.m. UTC | #7
On Fri, Jul 30, 2021 at 12:48 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> Ok, I see that you've added this in your bpf tests patch set. Please
> consider, additionally, implementing a similar test as part of
> selftests/bpf (specifically in test_progs).
>
> I'm also thinking that this MAX_TAIL_CALL_CNT change should probably
> go into the bpf-next tree. [...] So I think we should apply
> both into bpf-next.

I can confirm that the off-by-one behaviour is present on arm. Below
is the test output running on qemu. Test #4 calls itself recursively
and increments a counter each time, so the correct result should be 1
+ MAX_TAIL_CALL_CNT.

test_bpf: #0 Tail call leaf jited:1 71 PASS
test_bpf: #1 Tail call 2 jited:1 134 PASS
test_bpf: #2 Tail call 3 jited:1 164 PASS
test_bpf: #3 Tail call 4 jited:1 257 PASS
test_bpf: #4 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL
test_bpf: #5 Tail call error path, NULL target jited:1 114 PASS
test_bpf: #6 Tail call error path, index out of range jited:1 112 PASS
test_bpf: test_tail_calls: Summary: 6 PASSED, 1 FAILED, [7/7 JIT'ed]

The MAX_TAIL_CALL_CNT constant is referenced in the following JITs.

arch/arm64/net/bpf_jit_comp.c
arch/arm/net/bpf_jit_32.c
arch/mips/net/ebpf_jit.c
arch/powerpc/net/bpf_jit_comp32.c
arch/powerpc/net/bpf_jit_comp64.c
arch/riscv/net/bpf_jit_comp32.c
arch/riscv/net/bpf_jit_comp64.c
arch/s390/net/bpf_jit_comp.c
arch/sparc/net/bpf_jit_comp_64.c
arch/x86/net/bpf_jit_comp32.c
arch/x86/net/bpf_jit_comp.c

The x86 JITs all pass the test, even though the comments are wrong.
The comments can easily be fixed of course. For JITs that have the
off-by-one behaviour, an easy fix would be to change all occurrences
of MAX_TAIL_CALL_CNT to MAX_TAIL_CALL_CNT - 1. We must first know
which JITs are affected, though.

The fix is easy but setting up the test is hard. It took me quite some
time to get the qemu/arm setup up and running. If the same has to be
done for arm64, mips64, powerpc, powerpc64, riscv32, riscv64, sparc and
s390, I will need some help with this. If someone already has a
working setup for any of the systems, the test can be performed on
that.

Or perhaps there is a better way to do this? If I implement a similar
test in selftest/bpf, that would trigger the CI when the patch is
submitted and we will see which JITs we need to fix.

> On a related topic, please don't forget to include the target kernel
> tree for your patches: [PATCH bpf] or [PATCH bpf-next].


I'll add that! All patches I sent related to this are for the bpf-next tree.

Johan
Andrii Nakryiko Aug. 2, 2021, 8:28 p.m. UTC | #8
On Sun, Aug 1, 2021 at 1:38 AM Johan Almbladh
<johan.almbladh@anyfinetworks.com> wrote:
> The x86 JITs all pass the test, even though the comments are wrong.
> The comments can easily be fixed of course. For JITs that have the
> off-by-one behaviour, an easy fix would be to change all occurrences
> of MAX_TAIL_CALL_CNT to MAX_TAIL_CALL_CNT - 1.

If you are going to fix ARM, please send a fix to comments for x86 as well.

> The fix is easy but setting up the test is hard. It took me quite some
> time to get the qemu/arm setup up and running. If the same has to be
> done for arm64, mips64, powerpc, powerpc64, riscv32, riscv64, sparc and
> s390, I will need some help with this. If someone already has a
> working setup for any of the systems, the test can be performed on
> that.


Unfortunately, I myself have only an x86-64 setup. The libbpf
CI/kernel-patches CI we use to run all tests runs selftests
against x86-64 only as well. There was a temporarily halted effort to
add s390x support, but it's not done yet. No one has yet
volunteered to set up any other platforms, and I don't know whether
that's possible or how hard it would be to do within the GitHub Actions
platform we are currently using.

So in short, I understand the challenges of testing all those
platforms and I don't really expect any single person to do all that
work. I've applied your fix, please follow up with ARM and comment
fixes.

> Or perhaps there is a better way to do this? If I implement a similar
> test in selftest/bpf, that would trigger the CI when the patch is
> submitted and we will see which JITs we need to fix.


The other nice benefit of implementing this in selftest/bpf, besides
continuous testing, is that you write it in C, which allows you to
express much more complicated logic more easily.

Johan Almbladh Aug. 5, 2021, 2:37 p.m. UTC | #9
On Mon, Aug 2, 2021 at 10:28 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>

> On Sun, Aug 1, 2021 at 1:38 AM Johan Almbladh

> <johan.almbladh@anyfinetworks.com> wrote:

> >

> > On Fri, Jul 30, 2021 at 12:48 AM Andrii Nakryiko

> > <andrii.nakryiko@gmail.com> wrote:

> > >

> > > On Thu, Jul 29, 2021 at 3:29 PM Andrii Nakryiko

> > > <andrii.nakryiko@gmail.com> wrote:

> > > >

> > > > On Thu, Jul 29, 2021 at 2:38 PM Johan Almbladh

> > > > <johan.almbladh@anyfinetworks.com> wrote:

> > > > >

> > > > > On Wed, Jul 28, 2021 at 9:13 PM Yonghong Song <yhs@fb.com> wrote:

> > > > > > I also checked arm/arm64 jit. I saw the following comments:

> > > > > >

> > > > > >          /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)

> > > > > >           *      goto out;

> > > > > >           * tail_call_cnt++;

> > > > > >           */

> > > > > >

> > > > > > Maybe we have this MAX_TAIL_CALL_CNT + 1 issue

> > > > > > for arm/arm64 jit?

> > > > >

> > > > > That wouldn't be unreasonable. I don't have an arm or arm64 setup

> > > > > available right now, but I can try to test it in qemu.

> > > >

> > > > On a brief check, there seems to be quite a mess in terms of the code

> > > > and comments.

> > > >

> > > > E.g., in arch/x86/net/bpf_jit_comp32.c:

> > > >

> > > >         /*

> > > >          * if (tail_call_cnt > MAX_TAIL_CALL_CNT)

> > > >          *     goto out;

> > > >          */

> > > >

> > > >                             ^^^^ here comment is wrong

> > > >

> > > >         [...]

> > > >

> > > >         /* cmp edx,hi */

> > > >         EMIT3(0x83, add_1reg(0xF8, IA32_EBX), hi);

> > > >         EMIT2(IA32_JNE, 3);

> > > >         /* cmp ecx,lo */

> > > >         EMIT3(0x83, add_1reg(0xF8, IA32_ECX), lo);

> > > >

> > > >         /* ja out */

> > > >         EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));

> > > >

> > > >         ^^^ JAE is >=, right? But the comment says JA.

> > > >

> > > >

> > > > As for arch/x86/net/bpf_jit_comp.c, both comment and the code seem to

> > > > do > MAX_TAIL_CALL_CNT, but you are saying JIT is correct. What am I

> > > > missing?

> > > >

> > > > Can you please check all the places where MAX_TAIL_CALL_CNT is used

> > > > throughout the code? Let's clean this up in one go.

> > > >

> > > > Also, given it's so easy to do this off-by-one error, can you please

> > > > add a negative test validating that 33 tail calls are not allowed? I

> > > > assume we have a positive test that allows exactly MAX_TAIL_CALL_CNT,

> > > > but please double-check that as well.

> > >

> > > Ok, I see that you've added this in your bpf tests patch set. Please

> > > consider, additionally, implementing a similar test as part of

> > > selftests/bpf (specifically in test_progs). We run test_progs

> > > continuously in CI for every incoming patch/patchset, so it has much

> > > higher chances of capturing any regressions.

> > >

> > > I'm also thinking that this MAX_TAIL_CALL_CNT change should probably

> > > go into the bpf-next tree. First, this off-by-one behavior was around

> > > for a while and it doesn't cause serious issues, even if abused. But

> > > on the other hand, it will make your tail call tests fail, when

> > > applied into bpf-next without your change. So I think we should apply

> > > both into bpf-next.

> >

> > I can confirm that the off-by-one behaviour is present on arm. Below

> > is the test output running on qemu. Test #4 calls itself recursively

> > and increments a counter each time, so the correct result should be 1

> > + MAX_TAIL_CALL_CNT.

> >

> > test_bpf: #0 Tail call leaf jited:1 71 PASS

> > test_bpf: #1 Tail call 2 jited:1 134 PASS

> > test_bpf: #2 Tail call 3 jited:1 164 PASS

> > test_bpf: #3 Tail call 4 jited:1 257 PASS

> > test_bpf: #4 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

> > test_bpf: #5 Tail call error path, NULL target jited:1 114 PASS

> > test_bpf: #6 Tail call error path, index out of range jited:1 112 PASS

> > test_bpf: test_tail_calls: Summary: 6 PASSED, 1 FAILED, [7/7 JIT'ed]

> >

> > The MAX_TAIL_CALL_CNT constant is referenced in the following JITs.

> >

> > arch/arm64/net/bpf_jit_comp.c

> > arch/arm/net/bpf_jit_32.c

> > arch/mips/net/ebpf_jit.c

> > arch/powerpc/net/bpf_jit_comp32.c

> > arch/powerpc/net/bpf_jit_comp64.c

> > arch/riscv/net/bpf_jit_comp32.c

> > arch/riscv/net/bpf_jit_comp64.c

> > arch/s390/net/bpf_jit_comp.c

> > arch/sparc/net/bpf_jit_comp_64.c

> > arch/x86/net/bpf_jit_comp32.c

> > arch/x86/net/bpf_jit_comp.c

> >

> > The x86 JITs all pass the test, even though the comments are wrong.

> > The comments can easily be fixed of course. For JITs that have the

> > off-by-one behaviour, an easy fix would be to change all occurrences

> > of MAX_TAIL_CALL_CNT to MAX_TAIL_CALL_CNT - 1. We must first know

> > which JITs affected though.

>

> If you are going to fix ARM, please send a fix to comments for x86 as well.

>

> >

> > The fix is easy but setting up the test is hard. It took me quite some

> > time to get the qemu/arm setup up and running. If the same has to be

> > done for arm64, mips64, powerpc, powerpc64, riscv32, risc64, sparc and

> > s390, I will need some help with this. If someone already has a

> > working setup for any of the systems, the test can be performed on

> > that.

> >

>

> Unfortunately, I myself have only x86-64 setup. libbpf

> CI/kernel-patches CI we use to run all tests are running selftests

> > against x86-64 only as well. There was a temporarily halted effort to

> add s390x support as well, but it's not done yet. No one yet

> volunteered to set up any other platforms and I don't know if that's

> > possible and how hard it would be to do within the GitHub Actions platform

> we are currently using.

>

> So in short, I understand the challenges of testing all those

> platforms and I don't really expect any single person to do all that

> work. I've applied your fix, please follow up with ARM and comment

> fixes.


Thanks! I will fix the ARM JIT and the comments, then submit an
updated patch set for the test suite with changes after Yonghong's
review.

My current test setup can easily cross-compile the kernel with busybox
as userspace. However, getting it to run on QEMU has required some
amount of detective work. Every platform seems to be different in
terms of what to boot (vmlinux, zImage, bzImage), how to boot it (dtb,
BIOS, U-Boot requirements) and QEMU vs Kconfig settings. Currently I
can run i386, x86_64, MIPS, MIPS64 and ARM under QEMU. I can verify
and if needed fix the JIT on some of the other platforms as well, if I
can get it to run on QEMU with a reasonable effort. However, I cannot
build for RISC-V since I don't have a toolchain for that. I build my
toolchains with crosstool-ng using libmusl, and the latter does not
currently support RISC-V.
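To give a flavour of the per-platform differences described above, two of the combinations might look roughly like this. These command lines are an illustrative sketch only: machine types, memory sizes, image paths, console devices and the rootfs name are assumptions and vary per board.

```shell
# Illustrative only -- the right machine/console/image combination
# differs per architecture, as noted above.

# 32-bit ARM: boot a zImage on QEMU's "virt" machine
qemu-system-arm -M virt -m 256 -nographic \
    -kernel arch/arm/boot/zImage \
    -initrd rootfs.cpio.gz \
    -append "console=ttyAMA0 rdinit=/sbin/init"

# MIPS: boot the plain vmlinux ELF on the Malta board
qemu-system-mips -M malta -m 256 -nographic \
    -kernel vmlinux \
    -initrd rootfs.cpio.gz \
    -append "console=ttyS0 rdinit=/sbin/init"
```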

As a side note, I think having a QEMU-compatible defconfig for each
platform would make it easier to test arch-specific code. It could
also be a first step towards fully automated arch-specific CI.

Sorry for being a bit slow to respond. I am currently travelling with
only sporadic access to e-mail.

>

> > Or perhaps there is a better way to do this? If I implement a similar

> > test in selftest/bpf, that would trigger the CI when the patch is

> > submitted and we will see which JITs we need to fix.

>

> The other nice benefit of implementing this in selftest/bpf, besides

> continuous testing, is that you write it in C, which allows you to

> express much more complicated logic more easily.

>

> >

> > > On a related topic, please don't forget to include the target kernel

> > > tree for your patches: [PATCH bpf] or [PATCH bpf-next].

> >

> > I'll add that! All patches I sent related to this are for the bpf-next tree.

> >

> > Johan
Andrii Nakryiko Aug. 5, 2021, 10:54 p.m. UTC | #10
On Thu, Aug 5, 2021 at 7:38 AM Johan Almbladh
<johan.almbladh@anyfinetworks.com> wrote:
>

> On Mon, Aug 2, 2021 at 10:28 PM Andrii Nakryiko

> <andrii.nakryiko@gmail.com> wrote:

> >

> > On Sun, Aug 1, 2021 at 1:38 AM Johan Almbladh

> > <johan.almbladh@anyfinetworks.com> wrote:

> > >

> > > On Fri, Jul 30, 2021 at 12:48 AM Andrii Nakryiko

> > > <andrii.nakryiko@gmail.com> wrote:

> > > >

> > > > On Thu, Jul 29, 2021 at 3:29 PM Andrii Nakryiko

> > > > <andrii.nakryiko@gmail.com> wrote:

> > > > >

> > > > > On Thu, Jul 29, 2021 at 2:38 PM Johan Almbladh

> > > > > <johan.almbladh@anyfinetworks.com> wrote:

> > > > > >

> > > > > > On Wed, Jul 28, 2021 at 9:13 PM Yonghong Song <yhs@fb.com> wrote:

> > > > > > > I also checked arm/arm64 jit. I saw the following comments:

> > > > > > >

> > > > > > >          /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)

> > > > > > >           *      goto out;

> > > > > > >           * tail_call_cnt++;

> > > > > > >           */

> > > > > > >

> > > > > > > Maybe we have this MAX_TAIL_CALL_CNT + 1 issue

> > > > > > > for arm/arm64 jit?

> > > > > >

> > > > > > That wouldn't be unreasonable. I don't have an arm or arm64 setup

> > > > > > available right now, but I can try to test it in qemu.

> > > > >

> > > > > On a brief check, there seems to be quite a mess in terms of the code

> > > > > and comments.

> > > > >

> > > > > E.g., in arch/x86/net/bpf_jit_comp32.c:

> > > > >

> > > > >         /*

> > > > >          * if (tail_call_cnt > MAX_TAIL_CALL_CNT)

> > > > >          *     goto out;

> > > > >          */

> > > > >

> > > > >                             ^^^^ here comment is wrong

> > > > >

> > > > >         [...]

> > > > >

> > > > >         /* cmp edx,hi */

> > > > >         EMIT3(0x83, add_1reg(0xF8, IA32_EBX), hi);

> > > > >         EMIT2(IA32_JNE, 3);

> > > > >         /* cmp ecx,lo */

> > > > >         EMIT3(0x83, add_1reg(0xF8, IA32_ECX), lo);

> > > > >

> > > > >         /* ja out */

> > > > >         EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));

> > > > >

> > > > >         ^^^ JAE is >=, right? But the comment says JA.

> > > > >

> > > > >

> > > > > As for arch/x86/net/bpf_jit_comp.c, both comment and the code seem to

> > > > > do > MAX_TAIL_CALL_CNT, but you are saying the JIT is correct. What am I

> > > > > missing?

> > > > >

> > > > > Can you please check all the places where MAX_TAIL_CALL_CNT is used

> > > > > throughout the code? Let's clean this up in one go.

> > > > >

> > > > > Also, given it's so easy to do this off-by-one error, can you please

> > > > > add a negative test validating that 33 tail calls are not allowed? I

> > > > > assume we have a positive test that allows exactly MAX_TAIL_CALL_CNT,

> > > > > but please double-check that as well.

> > > >

> > > > Ok, I see that you've added this in your bpf tests patch set. Please

> > > > consider, additionally, implementing a similar test as part of

> > > > selftests/bpf (specifically in test_progs). We run test_progs

> > > > continuously in CI for every incoming patch/patchset, so it has much

> > > > higher chances of capturing any regressions.

> > > >

> > > > I'm also thinking that this MAX_TAIL_CALL_CNT change should probably

> > > > go into the bpf-next tree. First, this off-by-one behavior was around

> > > > for a while and it doesn't cause serious issues, even if abused. But

> > > > on the other hand, it will make your tail call tests fail, when

> > > > applied into bpf-next without your change. So I think we should apply

> > > > both into bpf-next.

> > >

> > > I can confirm that the off-by-one behaviour is present on arm. Below

> > > is the test output running on qemu. Test #4 calls itself recursively

> > > and increments a counter each time, so the correct result should be 1

> > > + MAX_TAIL_CALL_CNT.

> > >

> > > test_bpf: #0 Tail call leaf jited:1 71 PASS

> > > test_bpf: #1 Tail call 2 jited:1 134 PASS

> > > test_bpf: #2 Tail call 3 jited:1 164 PASS

> > > test_bpf: #3 Tail call 4 jited:1 257 PASS

> > > test_bpf: #4 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

> > > test_bpf: #5 Tail call error path, NULL target jited:1 114 PASS

> > > test_bpf: #6 Tail call error path, index out of range jited:1 112 PASS

> > > test_bpf: test_tail_calls: Summary: 6 PASSED, 1 FAILED, [7/7 JIT'ed]

> > >

> > > The MAX_TAIL_CALL_CNT constant is referenced in the following JITs.

> > >

> > > arch/arm64/net/bpf_jit_comp.c

> > > arch/arm/net/bpf_jit_32.c

> > > arch/mips/net/ebpf_jit.c

> > > arch/powerpc/net/bpf_jit_comp32.c

> > > arch/powerpc/net/bpf_jit_comp64.c

> > > arch/riscv/net/bpf_jit_comp32.c

> > > arch/riscv/net/bpf_jit_comp64.c

> > > arch/s390/net/bpf_jit_comp.c

> > > arch/sparc/net/bpf_jit_comp_64.c

> > > arch/x86/net/bpf_jit_comp32.c

> > > arch/x86/net/bpf_jit_comp.c

> > >

> > > The x86 JITs all pass the test, even though the comments are wrong.

> > > The comments can easily be fixed of course. For JITs that have the

> > > off-by-one behaviour, an easy fix would be to change all occurrences

> > > of MAX_TAIL_CALL_CNT to MAX_TAIL_CALL_CNT - 1. We must first know

> > > which JITs are affected, though.

> >

> > If you are going to fix ARM, please send a fix to comments for x86 as well.

> >

> > >

> > > The fix is easy but setting up the test is hard. It took me quite some

> > > time to get the qemu/arm setup up and running. If the same has to be

> > > done for arm64, mips64, powerpc, powerpc64, riscv32, riscv64, sparc and

> > > s390, I will need some help with this. If someone already has a

> > > working setup for any of the systems, the test can be performed on

> > > that.

> > >

> >

> > Unfortunately, I myself have only x86-64 setup. libbpf

> > CI/kernel-patches CI we use to run all tests are running selftests

> > against x86-64 only as well. There was a temporarily halted effort to

> > add s390x support as well, but it's not done yet. No one yet

> > volunteered to set up any other platforms and I don't know if that's

> > possible and how hard it would be to do within the GitHub Actions platform

> > we are currently using.

> >

> > So in short, I understand the challenges of testing all those

> > platforms and I don't really expect any single person to do all that

> > work. I've applied your fix, please follow up with ARM and comment

> > fixes.

>

> Thanks! I will fix the ARM JIT and the comments, then submit an

> updated patch set for the test suite with changes after Yonghong's

> review.

>

> My current test setup can easily cross-compile the kernel with busybox

> as userspace. However, getting it to run on QEMU has required some

> amount of detective work. Every platform seems to be different in

> terms of what to boot (vmlinux, zImage, bzImage), how to boot it (dtb,

> BIOS, U-Boot requirements) and QEMU vs Kconfig settings. Currently I

> can run i386, x86_64, MIPS, MIPS64 and ARM under QEMU. I can verify


At some point I tried to set up MIPS and ARM qemu and eventually just
gave up. So if you have it figured out, it would be nice to document
and share the process somewhere, for future needs.

> and if needed fix the JIT on some of the other platforms as well, if I

> can get it to run on QEMU with a reasonable effort. However, I cannot

> build for RISC-V since I don't have a toolchain for that. I build my

> toolchains with crosstool-ng using libmusl, and the latter does not

> currently support RISC-V.

>

> As a side note, I think having a QEMU-compatible defconfig for each

> platform would make it easier to test arch-specific code. It could

> also be a first step towards fully automated arch-specific CI.

>

> Sorry for being a bit slow to respond. I am currently travelling with

> only sporadic access to e-mail.

>

> >

> > > Or perhaps there is a better way to do this? If I implement a similar

> > > test in selftest/bpf, that would trigger the CI when the patch is

> > > submitted and we will see which JITs we need to fix.

> >

> > The other nice benefit of implementing this in selftest/bpf, besides

> > continuous testing, is that you write it in C, which allows you to

> > express much more complicated logic more easily.

> >

> > >

> > > > On a related topic, please don't forget to include the target kernel

> > > > tree for your patches: [PATCH bpf] or [PATCH bpf-next].

> > >

> > > I'll add that! All patches I sent related to this are for the bpf-next tree.

> > >

> > > Johan