[v2,bpf-next,09/16] libbpf: Support for fd_idx

Message ID	20210423002646.35043-10-alexei.starovoitov@gmail.com

Alexei Starovoitov April 23, 2021, 12:26 a.m. UTC

From: Alexei Starovoitov <ast@kernel.org>

Add support for FD_IDX and make libbpf prefer that approach to loading programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf.c             |  1 +
 tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
 tools/lib/bpf/libbpf_internal.h |  1 +
 3 files changed, 65 insertions(+), 7 deletions(-)

Andrii Nakryiko April 26, 2021, 5:14 p.m. UTC | #1

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add support for FD_IDX make libbpf prefer that approach to loading programs.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/bpf.c             |  1 +
>  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
>  tools/lib/bpf/libbpf_internal.h |  1 +
>  3 files changed, 65 insertions(+), 7 deletions(-)
>

[...]

> +static int probe_kern_fd_idx(void)
> +{
> +       struct bpf_load_program_attr attr;
> +       struct bpf_insn insns[] = {
> +               BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
> +               BPF_EXIT_INSN(),
> +       };
> +
> +       memset(&attr, 0, sizeof(attr));
> +       attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
> +       attr.insns = insns;
> +       attr.insns_cnt = ARRAY_SIZE(insns);
> +       attr.license = "GPL";
> +
> +       probe_fd(bpf_load_program_xattr(&attr, NULL, 0));

probe_fd() calls close(fd) internally, which technically can interfere
with errno, though close() shouldn't be called because the syscall has to
fail on correct kernels... So this should work, but I feel like
open-coding that logic is better than ignoring probe_fd()'s result.

> +       return errno == EPROTO;
> +}
> +

[...]

> @@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>         struct bpf_program *prog;
>         size_t i;
>         int err;
> +       struct bpf_map *map;
> +       int *fd_array = NULL;
>
>         for (i = 0; i < obj->nr_programs; i++) {
>                 prog = &obj->programs[i];
> @@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>                         return err;
>         }
>
> +       if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
> +               fd_array = malloc(sizeof(int) * obj->nr_maps);
> +               if (!fd_array)
> +                       return -ENOMEM;
> +               for (i = 0; i < obj->nr_maps; i++) {
> +                       map = &obj->maps[i];
> +                       fd_array[i] = map->fd;

nit: obj->maps[i].fd will keep it a single line

> +               }
> +       }
> +
>         for (i = 0; i < obj->nr_programs; i++) {
>                 prog = &obj->programs[i];
>                 if (prog_is_subprog(obj, prog))
> @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>                         continue;
>                 }
>                 prog->log_level |= log_level;
> +               prog->fd_array = fd_array;

You are not freeing this memory on success, as far as I can see. And
given multiple programs are sharing fd_array, it's a bit problematic
for prog to have fd_array. This is a per-object property, so let's add
it at bpf_object level and clean it up on bpf_object__close()? And by
assigning to obj->fd_array at malloc() site, you won't need to do all
the error-handling free()s below.

>                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> -               if (err)
> +               if (err) {
> +                       free(fd_array);
>                         return err;
> +               }
>         }
> +       free(fd_array);
>         return 0;
>  }
>
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 6017902c687e..9114c7085f2a 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
>         __u32 log_level;
>         char *log_buf;
>         size_t log_buf_sz;
> +       int *fd_array;
>  };
>
>  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> --
> 2.30.2
>

Alexei Starovoitov April 27, 2021, 2:53 a.m. UTC | #2

On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add support for FD_IDX make libbpf prefer that approach to loading programs.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/lib/bpf/bpf.c             |  1 +
> >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> >  tools/lib/bpf/libbpf_internal.h |  1 +
> >  3 files changed, 65 insertions(+), 7 deletions(-)
> >
>
> [...]
>
> > +static int probe_kern_fd_idx(void)
> > +{
> > +       struct bpf_load_program_attr attr;
> > +       struct bpf_insn insns[] = {
> > +               BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
> > +               BPF_EXIT_INSN(),
> > +       };
> > +
> > +       memset(&attr, 0, sizeof(attr));
> > +       attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
> > +       attr.insns = insns;
> > +       attr.insns_cnt = ARRAY_SIZE(insns);
> > +       attr.license = "GPL";
> > +
> > +       probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
>
> probe_fd() calls close(fd) internally, which technically can interfere
> with errno, though close() shouldn't be called because syscall has to
> fail on correct kernels... So this should work, but I feel like
> open-coding that logic is better than ignoring probe_fd() result.

It will fail on all kernels.
That probe_fd was a leftover from an earlier detection approach where it would
proceed to load all the way, but then I switched to:

> > +       return errno == EPROTO;


since such style of probing is much cheaper for the kernel and user space.
But point taken. Will open code it.
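
For illustration only, a minimal sketch of what "open-coding" the probe could look like: errno is captured right after the load attempt, before close() has a chance to overwrite it. This is not the code from the patch or from libbpf; `load_fn`, `probe_expected_errno()` and the two fake loaders are hypothetical names that stand in for the real `bpf_load_program_xattr()` call so the snippet stays self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for the real load call: returns an fd (>= 0) on
 * success, or -1 with errno set on failure. */
typedef int (*load_fn)(void);

/* Open-coded probe: save errno immediately after the load attempt, then
 * close the fd if the load unexpectedly succeeded, and only afterwards
 * compare the saved value against the expected error. */
static int probe_expected_errno(load_fn try_load, int expected)
{
	int fd, err;

	assert(try_load != NULL);

	fd = try_load();
	err = fd < 0 ? errno : 0;
	if (fd >= 0)
		close(fd);	/* may clobber errno, but err is already saved */

	return err == expected;
}

/* Fake loaders, used only to exercise the helper. */
static int fake_load_eproto(void) { errno = EPROTO; return -1; }
static int fake_load_einval(void) { errno = EINVAL; return -1; }
```

With these stand-ins, `probe_expected_errno(fake_load_eproto, EPROTO)` reports the feature as present and `probe_expected_errno(fake_load_einval, EPROTO)` reports it as absent.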

> > +}
> > +
>
> [...]
>
> > @@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >         struct bpf_program *prog;
> >         size_t i;
> >         int err;
> > +       struct bpf_map *map;
> > +       int *fd_array = NULL;
> >
> >         for (i = 0; i < obj->nr_programs; i++) {
> >                 prog = &obj->programs[i];
> > @@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >                         return err;
> >         }
> >
> > +       if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
> > +               fd_array = malloc(sizeof(int) * obj->nr_maps);
> > +               if (!fd_array)
> > +                       return -ENOMEM;
> > +               for (i = 0; i < obj->nr_maps; i++) {
> > +                       map = &obj->maps[i];
> > +                       fd_array[i] = map->fd;
>
> nit: obj->maps[i].fd will keep it a single line
>
> > +               }
> > +       }
> > +
> >         for (i = 0; i < obj->nr_programs; i++) {
> >                 prog = &obj->programs[i];
> >                 if (prog_is_subprog(obj, prog))
> > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >                         continue;
> >                 }
> >                 prog->log_level |= log_level;
> > +               prog->fd_array = fd_array;
>
> you are not freeing this memory on success, as far as I can see.

hmm. there is free on success below.

> And
> given multiple programs are sharing fd_array, it's a bit problematic
> for prog to have fd_array. This is per-object properly, so let's add
> it at bpf_object level and clean it up on bpf_object__close()? And by
> assigning to obj->fd_array at malloc() site, you won't need to do all
> the error-handling free()s below.

hmm. that sounds worse.
Why add another 8 bytes to bpf_object that won't be used
until this last step of bpf_object__load_progs,
and only for the duration of this loading?
It's cheaper to have this alloc here with two free()s below.

>
> >                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> > -               if (err)
> > +               if (err) {
> > +                       free(fd_array);
> >                         return err;
> > +               }
> >         }
> > +       free(fd_array);
> >         return 0;
> >  }
> >
> > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> > index 6017902c687e..9114c7085f2a 100644
> > --- a/tools/lib/bpf/libbpf_internal.h
> > +++ b/tools/lib/bpf/libbpf_internal.h
> > @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
> >         __u32 log_level;
> >         char *log_buf;
> >         size_t log_buf_sz;
> > +       int *fd_array;
> >  };
> >
> >  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> > --
> > 2.30.2
> >

--

Andrii Nakryiko April 27, 2021, 4:36 p.m. UTC | #3

On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add support for FD_IDX make libbpf prefer that approach to loading programs.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/lib/bpf/bpf.c             |  1 +
> > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > >
> >

[...]

> > >         for (i = 0; i < obj->nr_programs; i++) {
> > >                 prog = &obj->programs[i];
> > >                 if (prog_is_subprog(obj, prog))
> > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > >                         continue;
> > >                 }
> > >                 prog->log_level |= log_level;
> > > +               prog->fd_array = fd_array;
> >
> > you are not freeing this memory on success, as far as I can see.
>
> hmm. there is free on success below.

Right, my bad, I somehow read it as if it was only for the error case.

>
> > And
> > given multiple programs are sharing fd_array, it's a bit problematic
> > for prog to have fd_array. This is per-object properly, so let's add
> > it at bpf_object level and clean it up on bpf_object__close()? And by
> > assigning to obj->fd_array at malloc() site, you won't need to do all
> > the error-handling free()s below.
>
> hmm. that sounds worse.
> why add another 8 byte to bpf_object that won't be used
> until this last step of bpf_object__load_progs.
> And only for the duration of this loading.
> It's cheaper to have this alloc here with two free()s below.

So if you care about extra 8 bytes, then it's even more efficient to
have just one obj->fd_array rather than N prog->fd_array, no? And it's
also not very clean that prog->fd_array will have a dangling pointer
to deallocated memory after bpf_object__load_progs().
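
To make the ownership argument concrete, here is a rough sketch of the per-object variant being proposed, under simplifying assumptions. The `sketch_*` structs and functions are hypothetical stand-ins, not libbpf's real `struct bpf_object` or `bpf_object__close()`; the point is only that one array lives on the object, is built once, and is released at a single cleanup site.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for libbpf's internal structs. */
struct sketch_map { int fd; };

struct sketch_object {
	struct sketch_map *maps;
	size_t nr_maps;
	int *fd_array;	/* owned by the object, shared by all its programs */
};

/* Build the fd array once and store it on the object, so the program
 * loading loop needs no free() on its error paths. */
static int sketch_object_init_fd_array(struct sketch_object *obj)
{
	size_t i;

	assert(obj != NULL);
	if (!obj->nr_maps)
		return 0;

	obj->fd_array = malloc(sizeof(int) * obj->nr_maps);
	if (!obj->fd_array)
		return -ENOMEM;
	for (i = 0; i < obj->nr_maps; i++)
		obj->fd_array[i] = obj->maps[i].fd;
	return 0;
}

/* Single cleanup point, analogous to doing this in bpf_object__close(). */
static void sketch_object_close(struct sketch_object *obj)
{
	assert(obj != NULL);
	free(obj->fd_array);
	obj->fd_array = NULL;	/* nothing is left pointing at freed memory */
}
```

The trade-off debated in this thread still applies: the object carries one extra pointer for its whole lifetime, in exchange for simpler error handling in the load loop.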

But that brings up the whole question of why use fd_array at all here?
Commit description doesn't explain why libbpf has to use fd_array and
why it should be preferred. What are the advantages justifying added
complexity and extra memory allocation/clean up? It also reduces test
coverage of the "old ways" that offer the same capabilities. I think
this should be part of the commit description, if we agree that
fd_array has to be used outside of the auto-generated loader program.


>
> >
> > >                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> > > -               if (err)
> > > +               if (err) {
> > > +                       free(fd_array);
> > >                         return err;
> > > +               }
> > >         }
> > > +       free(fd_array);
> > >         return 0;
> > >  }
> > >
> > > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> > > index 6017902c687e..9114c7085f2a 100644
> > > --- a/tools/lib/bpf/libbpf_internal.h
> > > +++ b/tools/lib/bpf/libbpf_internal.h
> > > @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
> > >         __u32 log_level;
> > >         char *log_buf;
> > >         size_t log_buf_sz;
> > > +       int *fd_array;
> > >  };
> > >
> > >  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> > > --
> > > 2.30.2
> > >
>
> --

Alexei Starovoitov April 28, 2021, 1:32 a.m. UTC | #4

On Tue, Apr 27, 2021 at 09:36:54AM -0700, Andrii Nakryiko wrote:
> On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > From: Alexei Starovoitov <ast@kernel.org>
> > > >
> > > > Add support for FD_IDX make libbpf prefer that approach to loading programs.
> > > >
> > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > ---
> > > >  tools/lib/bpf/bpf.c             |  1 +
> > > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > > >
> > >
>
> [...]
>
> > > >         for (i = 0; i < obj->nr_programs; i++) {
> > > >                 prog = &obj->programs[i];
> > > >                 if (prog_is_subprog(obj, prog))
> > > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > > >                         continue;
> > > >                 }
> > > >                 prog->log_level |= log_level;
> > > > +               prog->fd_array = fd_array;
> > >
> > > you are not freeing this memory on success, as far as I can see.
> >
> > hmm. there is free on success below.
>
> right, my bad, I somehow understood as if it was only for error case
>
> >
> > > And
> > > given multiple programs are sharing fd_array, it's a bit problematic
> > > for prog to have fd_array. This is per-object properly, so let's add
> > > it at bpf_object level and clean it up on bpf_object__close()? And by
> > > assigning to obj->fd_array at malloc() site, you won't need to do all
> > > the error-handling free()s below.
> >
> > hmm. that sounds worse.
> > why add another 8 byte to bpf_object that won't be used
> > until this last step of bpf_object__load_progs.
> > And only for the duration of this loading.
> > It's cheaper to have this alloc here with two free()s below.
>
> So if you care about extra 8 bytes, then it's even more efficient to
> have just one obj->fd_array rather than N prog->fd_array, no?

I think it's layer breaking when bpf_program__load()->load_program()
has to reach out to prog->obj to do its work.
The layers are already a mess due to:
&prog->obj->maps[prog->obj->rodata_map_idx]
I wanted to avoid making it uglier.

> And it's
> also not very clean that prog->fd_array will have a dangling pointer
> to deallocated memory after bpf_object__load_progs().

prog->reloc_desc is freed and zeroed after __relocate.
prog->insns are freed and _not_ zeroed after __load_progs.
So prog->fd_array won't be the first such pointer.
I can add zeroing, of course.
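
As a small illustration of the "zeroing" mentioned above, the sketch below clears pointers once loading is done: memory the program owns is freed and reset, while a borrowed pointer such as the shared fd_array is only reset. `FREE_AND_NULL`, `struct sketch_prog` and `sketch_prog_done_loading()` are hypothetical names chosen for this example, not libbpf identifiers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper macro: free a buffer and clear the pointer to it. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical, simplified stand-in for a program descriptor. */
struct sketch_prog {
	int *insns;	/* owned by the program */
	int *fd_array;	/* borrowed: allocated and freed by the caller */
};

/* After loading, leave no pointer behind that refers to memory which is
 * freed here or will be freed by the caller. */
static void sketch_prog_done_loading(struct sketch_prog *prog)
{
	assert(prog != NULL);
	FREE_AND_NULL(prog->insns);	/* owned: free, then clear */
	prog->fd_array = NULL;		/* borrowed: just clear */
}
```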

>
> But that brings the entire question of why use fd_array at all here?
> Commit description doesn't explain why libbpf has to use fd_array and
> why it should be preferred. What are the advantages justifying added
> complexity and extra memory allocation/clean up? It also reduces test
> coverage of the "old ways" that offer the same capabilities. I think
> this should be part of the commit description, if we agree that
> fd_array has to be used outside of the auto-generated loader program.

I can add a knob so that it's used only during loader gen and by the
runner of the loader prog.
I think it will add more complexity.
The bpf CI runs on older kernels, so the test coverage of "old ways"
is not reduced regardless.
From the kernel pov BPF_PSEUDO_MAP_FD vs BPF_PSEUDO_MAP_IDX there is
no advantage.
From the libbpf side patch 9 looked trivial enough _not_ to do it conditionally,
but whatever. I don't mind more 'if'-s.

Andrii Nakryiko April 28, 2021, 6:40 p.m. UTC | #5

On Tue, Apr 27, 2021 at 6:32 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 09:36:54AM -0700, Andrii Nakryiko wrote:
> > On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > > > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > From: Alexei Starovoitov <ast@kernel.org>
> > > > >
> > > > > Add support for FD_IDX make libbpf prefer that approach to loading programs.
> > > > >
> > > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > > ---
> > > > >  tools/lib/bpf/bpf.c             |  1 +
> > > > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > > > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > > > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > > > >
> > > >
> >
> > [...]
> >
> > > > >         for (i = 0; i < obj->nr_programs; i++) {
> > > > >                 prog = &obj->programs[i];
> > > > >                 if (prog_is_subprog(obj, prog))
> > > > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > > > >                         continue;
> > > > >                 }
> > > > >                 prog->log_level |= log_level;
> > > > > +               prog->fd_array = fd_array;
> > > >
> > > > you are not freeing this memory on success, as far as I can see.
> > >
> > > hmm. there is free on success below.
> >
> > right, my bad, I somehow understood as if it was only for error case
> >
> > >
> > > > And
> > > > given multiple programs are sharing fd_array, it's a bit problematic
> > > > for prog to have fd_array. This is per-object properly, so let's add
> > > > it at bpf_object level and clean it up on bpf_object__close()? And by
> > > > assigning to obj->fd_array at malloc() site, you won't need to do all
> > > > the error-handling free()s below.
> > >
> > > hmm. that sounds worse.
> > > why add another 8 byte to bpf_object that won't be used
> > > until this last step of bpf_object__load_progs.
> > > And only for the duration of this loading.
> > > It's cheaper to have this alloc here with two free()s below.
> >
> > So if you care about extra 8 bytes, then it's even more efficient to
> > have just one obj->fd_array rather than N prog->fd_array, no?
>
> I think it's layer breaking when bpf_program__load()->load_program()
> has to reach out to prog->obj to do its work.
> The layers are already a mess due to:
> &prog->obj->maps[prog->obj->rodata_map_idx]
> I wanted to avoid making it uglier.

I don't think it's breaking any layer. bpf_program is not an
independent entity from libbpf's point of view, it always belongs to
bpf_object. And there are bpf_object-scoped properties, shared across
all progs, like BTF, global variables, maps, license, etc.

It's another thing that bpf_program__load() just shouldn't be a public
API, and we are going to address that in libbpf 1.0.

>
> > And it's
> > also not very clean that prog->fd_array will have a dangling pointer
> > to deallocated memory after bpf_object__load_progs().
>
> prog->reloc_desc is free and zeroed after __relocate.
> prog->insns are freed and _not_ zereod after __load_progs.
> so prog->fd_array won't be the first such pointer.
> I can add zeroing, of course.

cool, it would be great to fix prog->insns to be zeroed out as well

>
> >
> > But that brings the entire question of why use fd_array at all here?
> > Commit description doesn't explain why libbpf has to use fd_array and
> > why it should be preferred. What are the advantages justifying added
> > complexity and extra memory allocation/clean up? It also reduces test
> > coverage of the "old ways" that offer the same capabilities. I think
> > this should be part of the commit description, if we agree that
> > fd_array has to be used outside of the auto-generated loader program.
>
> I can add a knob to it to use it during loader gen for the loader gen
> and for the runner of the loader prog.

So that's why I'm saying a better commit description is necessary. I
lost track, again, that those patched instructions with embedded
map_idx are assumed by the prog loader program and then only fd_array is
modified at runtime by the BPF loader program. Please don't skimp on the
commit description; there are many moving pieces that are obvious only
in hindsight.

Getting back to code, given it's necessary for gen_loader only, I'd
switch out all those `kernel_supports(FEAT_FD_IDX)` checks with
`obj->gen_loader` and leave the current behavior as is. And we also
won't need to do FEAT_FD_IDX feature probing and extra memory
allocation at all. And bpf_load_and_run() uses fd_array
unconditionally without feature probing anyways.
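
A rough sketch of the gating suggested here, under the assumption that the object carries a `gen_loader` pointer that is non-NULL only while a loader program is being generated. All types and names below (`gated_gen`, `gated_object`, `gated_pick_map_ref()`) are hypothetical simplifications; in real code the two outcomes would correspond to emitting BPF_PSEUDO_MAP_FD versus BPF_PSEUDO_MAP_IDX instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loader-generator state; its contents do not matter here. */
struct gated_gen { int unused; };

/* Hypothetical, simplified object: gen_loader is set only when a loader
 * program is being generated. */
struct gated_object {
	struct gated_gen *gen_loader;
};

enum gated_map_ref {
	GATED_REF_BY_FD,	/* regular load path: map fd embedded in the insn */
	GATED_REF_BY_IDX,	/* loader generation: index into fd_array */
};

/* Decide how map references are encoded, with no kernel feature probe:
 * the index form is used only when generating a loader program. */
static enum gated_map_ref gated_pick_map_ref(const struct gated_object *obj)
{
	assert(obj != NULL);
	return obj->gen_loader ? GATED_REF_BY_IDX : GATED_REF_BY_FD;
}
```

Under this scheme the regular load path keeps its current behavior, so neither the feature probe nor the extra allocation is needed there.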

> I think it will add more complexity.
> The bpf CI runs on older kernels, so the test coverage of "old ways"
> is not reduced regardless.

I'm the only one who checks that, and we keep shrinking the set of
tests that run on older kernels because we update existing ones with
dependencies on newer kernel features. So coverage is shrinking, but
basic stuff is still tested, of course.

> From the kernel pov BPF_PSEUDO_MAP_FD vs BPF_PSEUDO_MAP_IDX there is
> no advantage.
> From the libbpf side patch 9 looked trivial enough _not_ do it conditionally,
> but whatever. I don't mind more 'if'-s.

I do mind unnecessary ifs, that's not what I was proposing.
