
[RESEND,bpf-next,v4,00/11] Make check_func_arg type checks table driven

Message ID 20200921121227.255763-1-lmb@cloudflare.com

Message

Lorenz Bauer Sept. 21, 2020, 12:12 p.m. UTC
I'm not sure why, but I missed sending the patchset to netdev@ last
week. I guess that is why it's slipped through the cracks.

Changes in v4:
- Output the desired type on BTF ID mismatch (Martin)

Changes in v3:
- Fix BTF_ID_LIST_SINGLE if BTF is disabled (Martin)
- Drop incorrect arg_btf_id in bpf_sk_storage.c (Martin)
- Check for arg_btf_id in check_func_proto (Martin)
- Drop incorrect PTR_TO_BTF_ID from fullsock_types (Martin)
- Introduce btf_seq_file_ids in bpf_trace.c to reduce duplication

Changes in v2:
- Make the series stand alone (Martin)
- Drop incorrect BTF_SET_START fix (Andrii)
- Only support a single BTF ID per argument (Martin)
- Introduce BTF_ID_LIST_SINGLE macro (Andrii)
- Skip check_ctx_reg iff register is NULL
- Change output of check_reg_type slightly, to avoid touching tests

Original cover letter:

Currently, check_func_arg has this pretty gnarly if statement that
compares the valid arg_type with the actual reg_type. Sprinkled
in-between are checks for register_is_null, to short circuit these
tests if we're dealing with a nullable arg_type. There is also some
code for later bounds / access checking hidden away in there.

This series of patches refactors the function into something like this:

   if (reg_is_null && arg_type_is_nullable)
     skip type checking

   do type checking, including BTF validation

   do bounds / access checking

The type checking is now table driven, which makes it easy to extend
the acceptable types. Maybe more importantly, using a table makes it
easy to provide more helpful verifier output (see the last patch).
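
For readers without the patches at hand, the three-step shape described above might look roughly like the sketch below. Every name in it (the two enums, check_arg, the return codes) is a hypothetical stand-in rather than the kernel's actual definition, and the bounds / access checking step is left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for enum bpf_reg_type and enum bpf_arg_type. */
enum reg_type { NOT_INIT = 0, SCALAR_VALUE, PTR_TO_STACK, PTR_TO_MAP_VALUE };
enum arg_type {
	ARG_DONTCARE = 0,	/* deliberately has no table entry */
	ARG_PTR_TO_MEM,
	ARG_PTR_TO_MEM_OR_NULL,
	ARG_CONST_SIZE,
	ARG_TYPE_MAX,
};

/* Fixed-size list of acceptable register types; unused slots are NOT_INIT. */
struct reg_types {
	enum reg_type types[4];
};

static const struct reg_types mem_types = {
	.types = { PTR_TO_STACK, PTR_TO_MAP_VALUE },
};
static const struct reg_types scalar_types = {
	.types = { SCALAR_VALUE },
};

/* One table entry per argument type; missing entries are NULL. */
static const struct reg_types *const compatible[ARG_TYPE_MAX] = {
	[ARG_PTR_TO_MEM]		= &mem_types,
	[ARG_PTR_TO_MEM_OR_NULL]	= &mem_types,
	[ARG_CONST_SIZE]		= &scalar_types,
};

static int arg_type_is_nullable(enum arg_type arg)
{
	return arg == ARG_PTR_TO_MEM_OR_NULL;
}

/* Returns 0 if accepted, -1 on a type mismatch, -2 if the table has no entry. */
static int check_arg(enum arg_type arg, enum reg_type reg, int reg_is_null)
{
	const struct reg_types *c;
	size_t i;

	/* Step 1: a NULL register for a nullable argument skips type checking. */
	if (reg_is_null && arg_type_is_nullable(arg))
		return 0;

	/* Step 2: table-driven type check. */
	c = compatible[arg];
	if (!c)
		return -2;

	for (i = 0; i < sizeof(c->types) / sizeof(c->types[0]); i++) {
		if (c->types[i] == NOT_INIT)
			break;
		if (c->types[i] == reg)
			return 0;
	}
	return -1;
}
```

In this sketch, accepting another register type for an argument means adding one entry to the relevant reg_types list; check_arg itself stays unchanged.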

Lorenz Bauer (11):
  btf: make btf_set_contains take a const pointer
  bpf: check scalar or invalid register in check_helper_mem_access
  btf: Add BTF_ID_LIST_SINGLE macro
  bpf: allow specifying a BTF ID per argument in function protos
  bpf: make BTF pointer type checking generic
  bpf: make reference tracking generic
  bpf: make context access check generic
  bpf: set meta->raw_mode for pointers close to use
  bpf: check ARG_PTR_TO_SPINLOCK register type in check_func_arg
  bpf: hoist type checking for nullable arg types
  bpf: use a table to drive helper arg type checks

 include/linux/bpf.h            |  21 +-
 include/linux/btf_ids.h        |   8 +
 kernel/bpf/bpf_inode_storage.c |   8 +-
 kernel/bpf/btf.c               |  15 +-
 kernel/bpf/stackmap.c          |   5 +-
 kernel/bpf/verifier.c          | 338 ++++++++++++++++++---------------
 kernel/trace/bpf_trace.c       |  15 +-
 net/core/bpf_sk_storage.c      |   8 +-
 net/core/filter.c              |  31 +--
 net/ipv4/bpf_tcp_ca.c          |  19 +-
 tools/include/linux/btf_ids.h  |   8 +
 11 files changed, 239 insertions(+), 237 deletions(-)

Comments

Alexei Starovoitov Sept. 21, 2020, 10:23 p.m. UTC | #1
On Mon, Sep 21, 2020 at 01:12:27PM +0100, Lorenz Bauer wrote:
> +struct bpf_reg_types {
> +	const enum bpf_reg_type types[10];
> +};

any idea on how to make it more robust?

> +
> +static const struct bpf_reg_types *compatible_reg_types[] = {
> +	[ARG_PTR_TO_MAP_KEY]		= &map_key_value_types,
> +	[ARG_PTR_TO_MAP_VALUE]		= &map_key_value_types,
> +	[ARG_PTR_TO_UNINIT_MAP_VALUE]	= &map_key_value_types,
> +	[ARG_PTR_TO_MAP_VALUE_OR_NULL]	= &map_key_value_types,
> +	[ARG_CONST_SIZE]		= &scalar_types,
> +	[ARG_CONST_SIZE_OR_ZERO]	= &scalar_types,
> +	[ARG_CONST_ALLOC_SIZE_OR_ZERO]	= &scalar_types,
> +	[ARG_CONST_MAP_PTR]		= &const_map_ptr_types,
> +	[ARG_PTR_TO_CTX]		= &context_types,
> +	[ARG_PTR_TO_CTX_OR_NULL]	= &context_types,
> +	[ARG_PTR_TO_SOCK_COMMON]	= &sock_types,
> +	[ARG_PTR_TO_SOCKET]		= &fullsock_types,
> +	[ARG_PTR_TO_SOCKET_OR_NULL]	= &fullsock_types,
> +	[ARG_PTR_TO_BTF_ID]		= &btf_ptr_types,
> +	[ARG_PTR_TO_SPIN_LOCK]		= &spin_lock_types,
> +	[ARG_PTR_TO_MEM]		= &mem_types,
> +	[ARG_PTR_TO_MEM_OR_NULL]	= &mem_types,
> +	[ARG_PTR_TO_UNINIT_MEM]		= &mem_types,
> +	[ARG_PTR_TO_ALLOC_MEM]		= &alloc_mem_types,
> +	[ARG_PTR_TO_ALLOC_MEM_OR_NULL]	= &alloc_mem_types,
> +	[ARG_PTR_TO_INT]		= &int_ptr_types,
> +	[ARG_PTR_TO_LONG]		= &int_ptr_types,
> +	[__BPF_ARG_TYPE_MAX]		= NULL,

I don't understand what this extra value is for.
I tried:
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index fc5c901c7542..87b0d5dcc1ff 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -292,7 +292,6 @@ enum bpf_arg_type {
        ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
        ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
        ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
-       __BPF_ARG_TYPE_MAX,
 };

 /* type of values returned from helper functions */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 15ab889b0a3f..83faa67858b6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
        [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
        [ARG_PTR_TO_INT]                = &int_ptr_types,
        [ARG_PTR_TO_LONG]               = &int_ptr_types,
-       [__BPF_ARG_TYPE_MAX]            = NULL,
 };

and everything is fine as I think it should be.

> +	compatible = compatible_reg_types[arg_type];
> +	if (!compatible) {
> +		verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
>  		return -EFAULT;
>  	}

This check will trigger the same way when somebody adds new ARG_* and doesn't add to the table.

>  
> +	err = check_reg_type(env, regno, compatible);
> +	if (err)
> +		return err;
> +
>  	if (type == PTR_TO_BTF_ID) {
>  		const u32 *btf_id = fn->arg_btf_id[arg];
>  
> @@ -4174,10 +4213,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>  	}
>  
>  	return err;
> -err_type:
> -	verbose(env, "R%d type=%s expected=%s\n", regno,
> -		reg_type_str[type], reg_type_str[expected_type]);
> -	return -EACCES;

I'm not a fan of table driven checks. I think one explicit switch statement
would have been easier to read, but I guess we can convert back to it later if
table becomes too limiting. The improvement in the verifier output is important
and justifies this approach.

Applied to bpf-next. Thanks!

Lorenz Bauer Sept. 22, 2020, 8:20 a.m. UTC | #2
On Mon, 21 Sep 2020 at 23:23, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Sep 21, 2020 at 01:12:27PM +0100, Lorenz Bauer wrote:
> > +struct bpf_reg_types {
> > +     const enum bpf_reg_type types[10];
> > +};
>
> any idea on how to make it more robust?

I kind of copied this from the bpf_iter context. I prototyped using an
enum bpf_reg_type * and then terminating the array with NOT_INIT.
Writing this out is more involved, and might need some macro magic to
make it palatable. The current approach is a lot simpler, and I
figured that the compiler will error out if we ever exceed the 10
items.

>
> > +
> > +static const struct bpf_reg_types *compatible_reg_types[] = {
> > +     [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
> > +     [ARG_PTR_TO_MAP_VALUE]          = &map_key_value_types,
> > +     [ARG_PTR_TO_UNINIT_MAP_VALUE]   = &map_key_value_types,
> > +     [ARG_PTR_TO_MAP_VALUE_OR_NULL]  = &map_key_value_types,
> > +     [ARG_CONST_SIZE]                = &scalar_types,
> > +     [ARG_CONST_SIZE_OR_ZERO]        = &scalar_types,
> > +     [ARG_CONST_ALLOC_SIZE_OR_ZERO]  = &scalar_types,
> > +     [ARG_CONST_MAP_PTR]             = &const_map_ptr_types,
> > +     [ARG_PTR_TO_CTX]                = &context_types,
> > +     [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> > +     [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > +     [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> > +     [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> > +     [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > +     [ARG_PTR_TO_SPIN_LOCK]          = &spin_lock_types,
> > +     [ARG_PTR_TO_MEM]                = &mem_types,
> > +     [ARG_PTR_TO_MEM_OR_NULL]        = &mem_types,
> > +     [ARG_PTR_TO_UNINIT_MEM]         = &mem_types,
> > +     [ARG_PTR_TO_ALLOC_MEM]          = &alloc_mem_types,
> > +     [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > +     [ARG_PTR_TO_INT]                = &int_ptr_types,
> > +     [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > +     [__BPF_ARG_TYPE_MAX]            = NULL,
>
> I don't understand what this extra value is for.
> I tried:
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index fc5c901c7542..87b0d5dcc1ff 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -292,7 +292,6 @@ enum bpf_arg_type {
>         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
>         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
>         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> -       __BPF_ARG_TYPE_MAX,
>  };
>
>  /* type of values returned from helper functions */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 15ab889b0a3f..83faa67858b6 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
>         [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
>         [ARG_PTR_TO_INT]                = &int_ptr_types,
>         [ARG_PTR_TO_LONG]               = &int_ptr_types,
> -       [__BPF_ARG_TYPE_MAX]            = NULL,
>  };
>
> and everything is fine as I think it should be.
>
> > +     compatible = compatible_reg_types[arg_type];
> > +     if (!compatible) {
> > +             verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> >               return -EFAULT;
> >       }
>
> This check will trigger the same way when somebody adds new ARG_* and doesn't add to the table.

I think in that case that value of compatible will be undefined, since
it points past the end of compatible_reg_types. Hence the
__BPF_ARG_TYPE_MAX to ensure that the array has a NULL slot for new
arg types.

>
> >
> > +     err = check_reg_type(env, regno, compatible);
> > +     if (err)
> > +             return err;
> > +
> >       if (type == PTR_TO_BTF_ID) {
> >               const u32 *btf_id = fn->arg_btf_id[arg];
> >
> > @@ -4174,10 +4213,6 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >       }
> >
> >       return err;
> > -err_type:
> > -     verbose(env, "R%d type=%s expected=%s\n", regno,
> > -             reg_type_str[type], reg_type_str[expected_type]);
> > -     return -EACCES;
>
> I'm not a fan of table driven checks. I think one explicit switch statement
> would have been easier to read, but I guess we can convert back to it later if
> table becomes too limiting. The improvement in the verifier output is important
> and justifies this approach.
>
> Applied to bpf-next. Thanks!

Thank you!

--
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

Alexei Starovoitov Sept. 22, 2020, 8:07 p.m. UTC | #3
On Tue, Sep 22, 2020 at 09:20:27AM +0100, Lorenz Bauer wrote:
> On Mon, 21 Sep 2020 at 23:23, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Sep 21, 2020 at 01:12:27PM +0100, Lorenz Bauer wrote:
> > > +struct bpf_reg_types {
> > > +     const enum bpf_reg_type types[10];
> > > +};
> >
> > any idea on how to make it more robust?
> 
> I kind of copied this from the bpf_iter context. I prototyped using an
> enum bpf_reg_type * and then terminating the array with NOT_INIT.
> Writing this out is more involved, and might need some macro magic to
> make it palatable. The current approach is a lot simpler, and I
> figured that the compiler will error out if we ever exceed the 10
> items.

The compiler will be silent if number of types is exactly 10,
but at run-time the loop will access out of bounds.

> >
> > > +
> > > +static const struct bpf_reg_types *compatible_reg_types[] = {
> > > +     [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
> > > +     [ARG_PTR_TO_MAP_VALUE]          = &map_key_value_types,
> > > +     [ARG_PTR_TO_UNINIT_MAP_VALUE]   = &map_key_value_types,
> > > +     [ARG_PTR_TO_MAP_VALUE_OR_NULL]  = &map_key_value_types,
> > > +     [ARG_CONST_SIZE]                = &scalar_types,
> > > +     [ARG_CONST_SIZE_OR_ZERO]        = &scalar_types,
> > > +     [ARG_CONST_ALLOC_SIZE_OR_ZERO]  = &scalar_types,
> > > +     [ARG_CONST_MAP_PTR]             = &const_map_ptr_types,
> > > +     [ARG_PTR_TO_CTX]                = &context_types,
> > > +     [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> > > +     [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > > +     [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> > > +     [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> > > +     [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > > +     [ARG_PTR_TO_SPIN_LOCK]          = &spin_lock_types,
> > > +     [ARG_PTR_TO_MEM]                = &mem_types,
> > > +     [ARG_PTR_TO_MEM_OR_NULL]        = &mem_types,
> > > +     [ARG_PTR_TO_UNINIT_MEM]         = &mem_types,
> > > +     [ARG_PTR_TO_ALLOC_MEM]          = &alloc_mem_types,
> > > +     [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > > +     [ARG_PTR_TO_INT]                = &int_ptr_types,
> > > +     [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > > +     [__BPF_ARG_TYPE_MAX]            = NULL,
> >
> > I don't understand what this extra value is for.
> > I tried:
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index fc5c901c7542..87b0d5dcc1ff 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -292,7 +292,6 @@ enum bpf_arg_type {
> >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > -       __BPF_ARG_TYPE_MAX,
> >  };
> >
> >  /* type of values returned from helper functions */
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 15ab889b0a3f..83faa67858b6 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
> >         [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> >         [ARG_PTR_TO_INT]                = &int_ptr_types,
> >         [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > -       [__BPF_ARG_TYPE_MAX]            = NULL,
> >  };
> >
> > and everything is fine as I think it should be.
> >
> > > +     compatible = compatible_reg_types[arg_type];
> > > +     if (!compatible) {
> > > +             verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> > >               return -EFAULT;
> > >       }
> >
> > This check will trigger the same way when somebody adds new ARG_* and doesn't add to the table.
> 
> I think in that case that value of compatible will be undefined, since
> it points past the end of compatible_reg_types. Hence the
> __BPF_ARG_TYPE_MAX to ensure that the array has a NULL slot for new
> arg types.

I still don't see a point.
If anyone adds one more ARG_ to the end (or anywhere else)
the compatible_reg_types array will be zero inited in that place by the compiler.
Just like it does already for ARG_ANYTHING and ARG_DONTCARE.
Am I still missing something?
If not please follow up with removal of __BPF_ARG_TYPE_MAX.

Lorenz Bauer Sept. 23, 2020, 9:45 a.m. UTC | #4
On Tue, 22 Sep 2020 at 21:07, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Sep 22, 2020 at 09:20:27AM +0100, Lorenz Bauer wrote:
> > On Mon, 21 Sep 2020 at 23:23, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Sep 21, 2020 at 01:12:27PM +0100, Lorenz Bauer wrote:
> > > > +struct bpf_reg_types {
> > > > +     const enum bpf_reg_type types[10];
> > > > +};
> > >
> > > any idea on how to make it more robust?
> >
> > I kind of copied this from the bpf_iter context. I prototyped using an
> > enum bpf_reg_type * and then terminating the array with NOT_INIT.
> > Writing this out is more involved, and might need some macro magic to
> > make it palatable. The current approach is a lot simpler, and I
> > figured that the compiler will error out if we ever exceed the 10
> > items.
>
> The compiler will be silent if number of types is exactly 10,
> but at run-time the loop will access out of bounds.

Which loop do you refer to?

The one in check_reg_type shouldn't go out of bounds due to ARRAY_SIZE:

    for (i = 0; i < ARRAY_SIZE(compatible->types); i++) {
        expected = compatible->types[i];
        if (expected == NOT_INIT)
            break;
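
As an illustration of why the loop stays in bounds even when all ten slots are in use, here is a minimal self-contained sketch. The TYPE_* enumerators, the type_matches helper and the simplified ARRAY_SIZE macro are assumptions made for the example and are not the kernel's definitions.

```c
#include <stddef.h>

/* Simplified ARRAY_SIZE; the kernel's macro also rejects non-array arguments. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical register types; NOT_INIT doubles as the "empty slot" marker. */
enum reg_type {
	NOT_INIT = 0,
	TYPE_1, TYPE_2, TYPE_3, TYPE_4, TYPE_5, TYPE_6,
	TYPE_7, TYPE_8, TYPE_9, TYPE_10, TYPE_11,
};

struct reg_types {
	enum reg_type types[10];
};

/* All ten slots used: there is no NOT_INIT terminator in this array. */
static const struct reg_types full_types = {
	.types = { TYPE_1, TYPE_2, TYPE_3, TYPE_4, TYPE_5,
		   TYPE_6, TYPE_7, TYPE_8, TYPE_9, TYPE_10 },
};

/* Only two slots used: the remaining eight are zero, i.e. NOT_INIT. */
static const struct reg_types partial_types = {
	.types = { TYPE_1, TYPE_2 },
};

/* Returns 1 if type is listed in compatible, 0 otherwise. */
static int type_matches(const struct reg_types *compatible, enum reg_type type)
{
	size_t i;

	/* The bound is the array size, so a missing terminator is harmless. */
	for (i = 0; i < ARRAY_SIZE(compatible->types); i++) {
		enum reg_type expected = compatible->types[i];

		if (expected == NOT_INIT)
			break;
		if (expected == type)
			return 1;
	}
	return 0;
}
```

With full_types the loop ends when i reaches 10; with partial_types it ends early at the first NOT_INIT slot. Either way no element past types[9] is read.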

>
> > >
> > > > +
> > > > +static const struct bpf_reg_types *compatible_reg_types[] = {
> > > > +     [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
> > > > +     [ARG_PTR_TO_MAP_VALUE]          = &map_key_value_types,
> > > > +     [ARG_PTR_TO_UNINIT_MAP_VALUE]   = &map_key_value_types,
> > > > +     [ARG_PTR_TO_MAP_VALUE_OR_NULL]  = &map_key_value_types,
> > > > +     [ARG_CONST_SIZE]                = &scalar_types,
> > > > +     [ARG_CONST_SIZE_OR_ZERO]        = &scalar_types,
> > > > +     [ARG_CONST_ALLOC_SIZE_OR_ZERO]  = &scalar_types,
> > > > +     [ARG_CONST_MAP_PTR]             = &const_map_ptr_types,
> > > > +     [ARG_PTR_TO_CTX]                = &context_types,
> > > > +     [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> > > > +     [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > > > +     [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> > > > +     [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> > > > +     [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > > > +     [ARG_PTR_TO_SPIN_LOCK]          = &spin_lock_types,
> > > > +     [ARG_PTR_TO_MEM]                = &mem_types,
> > > > +     [ARG_PTR_TO_MEM_OR_NULL]        = &mem_types,
> > > > +     [ARG_PTR_TO_UNINIT_MEM]         = &mem_types,
> > > > +     [ARG_PTR_TO_ALLOC_MEM]          = &alloc_mem_types,
> > > > +     [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > > > +     [ARG_PTR_TO_INT]                = &int_ptr_types,
> > > > +     [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > > > +     [__BPF_ARG_TYPE_MAX]            = NULL,
> > >
> > > I don't understand what this extra value is for.
> > > I tried:
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index fc5c901c7542..87b0d5dcc1ff 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -292,7 +292,6 @@ enum bpf_arg_type {
> > >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> > >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> > >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > > -       __BPF_ARG_TYPE_MAX,
> > >  };
> > >
> > >  /* type of values returned from helper functions */
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 15ab889b0a3f..83faa67858b6 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
> > >         [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > >         [ARG_PTR_TO_INT]                = &int_ptr_types,
> > >         [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > > -       [__BPF_ARG_TYPE_MAX]            = NULL,
> > >  };
> > >
> > > and everything is fine as I think it should be.
> > >
> > > > +     compatible = compatible_reg_types[arg_type];
> > > > +     if (!compatible) {
> > > > +             verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> > > >               return -EFAULT;
> > > >       }
> > >
> > > This check will trigger the same way when somebody adds new ARG_* and doesn't add to the table.
> >
> > I think in that case that value of compatible will be undefined, since
> > it points past the end of compatible_reg_types. Hence the
> > __BPF_ARG_TYPE_MAX to ensure that the array has a NULL slot for new
> > arg types.
>
> I still don't see a point.
> If anyone adds one more ARG_ to the end (or anywhere else)
> the compatible_reg_types array will be zero inited in that place by the compiler.
> Just like it does already for ARG_ANYTHING and ARG_DONTCARE.

I looked up designated initializers when I wrote this, since I wasn't
super familiar with them:
https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html#Designated-Inits

    Note that the length of the array is the highest value specified plus one.

So ARG_ANYTHING and ARG_DONTCARE are OK since there is a higher enum
value present in the initializer. If someone adds a new item to enum
bpf_arg_type I assume they would add it to the end. In that case the
highest value of the initializer doesn't change, and then indexing
into compatible_reg_types with the new enum value would be out of
bounds. Adding __BPF_ARG_TYPE_MAX fixes that.
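
The length rule quoted above can be checked at compile time. The sketch below is a reduced illustration, not the verifier's table: the enum, its members and both arrays are hypothetical, with ARG_NEW playing the role of an argument type appended to the enum without a matching table entry.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enum; ARG_NEW is "added later" and never put in a table. */
enum arg_type { ARG_A, ARG_B, ARG_C, ARG_NEW, ARG_MAX };

/* No explicit size: the length is the highest designated index plus one. */
static const char *const implicit_len[] = {
	[ARG_B] = "b",
	[ARG_C] = "c",
};

/* Same table with a trailing sentinel entry, as in the patch under review. */
static const char *const with_sentinel[] = {
	[ARG_B]   = "b",
	[ARG_C]   = "c",
	[ARG_MAX] = NULL,
};

/* Three elements (ARG_A..ARG_C), so indexing with ARG_NEW is out of bounds. */
_Static_assert(ARRAY_SIZE(implicit_len) == ARG_C + 1,
	       "length follows the highest designator");
_Static_assert((size_t)ARG_NEW >= ARRAY_SIZE(implicit_len),
	       "ARG_NEW falls outside the implicit-length table");

/* Five elements (ARG_A..ARG_MAX), so ARG_NEW lands on a zero-initialized slot. */
_Static_assert(ARRAY_SIZE(with_sentinel) == ARG_MAX + 1,
	       "sentinel extends the table past ARG_NEW");
```

Slots without a designator that lie below the highest index, such as ARG_A here, are zero-initialized in both arrays. Only indices above the highest designator fall outside the array.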

It's very possible I misunderstood how this whole contraption works,
happy to send a patch.

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com

Alexei Starovoitov Sept. 23, 2020, 3:24 p.m. UTC | #5
On Wed, Sep 23, 2020 at 2:45 AM Lorenz Bauer <lmb@cloudflare.com> wrote:
>
> On Tue, 22 Sep 2020 at 21:07, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Sep 22, 2020 at 09:20:27AM +0100, Lorenz Bauer wrote:
> > > On Mon, 21 Sep 2020 at 23:23, Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Mon, Sep 21, 2020 at 01:12:27PM +0100, Lorenz Bauer wrote:
> > > > > +struct bpf_reg_types {
> > > > > +     const enum bpf_reg_type types[10];
> > > > > +};
> > > >
> > > > any idea on how to make it more robust?
> > >
> > > I kind of copied this from the bpf_iter context. I prototyped using an
> > > enum bpf_reg_type * and then terminating the array with NOT_INIT.
> > > Writing this out is more involved, and might need some macro magic to
> > > make it palatable. The current approach is a lot simpler, and I
> > > figured that the compiler will error out if we ever exceed the 10
> > > items.
> >
> > The compiler will be silent if number of types is exactly 10,
> > but at run-time the loop will access out of bounds.
>
> Which loop do you refer to?
>
> The one in check_reg_type shouldn't go out of bounds due to ARRAY_SIZE:
>
>     for (i = 0; i < ARRAY_SIZE(compatible->types); i++) {

ahh. right. it will always be 10 here. got it.

>         expected = compatible->types[i];
>         if (expected == NOT_INIT)
>             break;
>
> >
> > > >
> > > > > +
> > > > > +static const struct bpf_reg_types *compatible_reg_types[] = {
> > > > > +     [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
> > > > > +     [ARG_PTR_TO_MAP_VALUE]          = &map_key_value_types,
> > > > > +     [ARG_PTR_TO_UNINIT_MAP_VALUE]   = &map_key_value_types,
> > > > > +     [ARG_PTR_TO_MAP_VALUE_OR_NULL]  = &map_key_value_types,
> > > > > +     [ARG_CONST_SIZE]                = &scalar_types,
> > > > > +     [ARG_CONST_SIZE_OR_ZERO]        = &scalar_types,
> > > > > +     [ARG_CONST_ALLOC_SIZE_OR_ZERO]  = &scalar_types,
> > > > > +     [ARG_CONST_MAP_PTR]             = &const_map_ptr_types,
> > > > > +     [ARG_PTR_TO_CTX]                = &context_types,
> > > > > +     [ARG_PTR_TO_CTX_OR_NULL]        = &context_types,
> > > > > +     [ARG_PTR_TO_SOCK_COMMON]        = &sock_types,
> > > > > +     [ARG_PTR_TO_SOCKET]             = &fullsock_types,
> > > > > +     [ARG_PTR_TO_SOCKET_OR_NULL]     = &fullsock_types,
> > > > > +     [ARG_PTR_TO_BTF_ID]             = &btf_ptr_types,
> > > > > +     [ARG_PTR_TO_SPIN_LOCK]          = &spin_lock_types,
> > > > > +     [ARG_PTR_TO_MEM]                = &mem_types,
> > > > > +     [ARG_PTR_TO_MEM_OR_NULL]        = &mem_types,
> > > > > +     [ARG_PTR_TO_UNINIT_MEM]         = &mem_types,
> > > > > +     [ARG_PTR_TO_ALLOC_MEM]          = &alloc_mem_types,
> > > > > +     [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > > > > +     [ARG_PTR_TO_INT]                = &int_ptr_types,
> > > > > +     [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > > > > +     [__BPF_ARG_TYPE_MAX]            = NULL,
> > > >
> > > > I don't understand what this extra value is for.
> > > > I tried:
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index fc5c901c7542..87b0d5dcc1ff 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -292,7 +292,6 @@ enum bpf_arg_type {
> > > >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> > > >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> > > >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > > > -       __BPF_ARG_TYPE_MAX,
> > > >  };
> > > >
> > > >  /* type of values returned from helper functions */
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 15ab889b0a3f..83faa67858b6 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
> > > >         [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
> > > >         [ARG_PTR_TO_INT]                = &int_ptr_types,
> > > >         [ARG_PTR_TO_LONG]               = &int_ptr_types,
> > > > -       [__BPF_ARG_TYPE_MAX]            = NULL,
> > > >  };
> > > >
> > > > and everything is fine as I think it should be.
> > > >
> > > > > +     compatible = compatible_reg_types[arg_type];
> > > > > +     if (!compatible) {
> > > > > +             verbose(env, "verifier internal error: unsupported arg type %d\n", arg_type);
> > > > >               return -EFAULT;
> > > > >       }
> > > >
> > > > This check will trigger the same way when somebody adds new ARG_* and doesn't add to the table.
> > >
> > > I think in that case that value of compatible will be undefined, since
> > > it points past the end of compatible_reg_types. Hence the
> > > __BPF_ARG_TYPE_MAX to ensure that the array has a NULL slot for new
> > > arg types.
> >
> > I still don't see a point.
> > If anyone adds one more ARG_ to the end (or anywhere else)
> > the compatible_reg_types array will be zero inited in that place by the compiler.
> > Just like it does already for ARG_ANYTHING and ARG_DONTCARE.
>
> I looked up designated initializers when I wrote this, since I wasn't
> super familiar with them:
> https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html#Designated-Inits
>
>     Note that the length of the array is the highest value specified plus one.
>
> So ARG_ANYTHING and ARG_DONTCARE are OK since there is a higher enum
> value present in the initializer. If someone adds a new item to enum
> bpf_arg_type I assume they would add it to the end. In that case the
> highest value of the initializer doesn't change, and then indexing
> into compatible_reg_types with the new enum value would be out of
> bounds. Adding __BPF_ARG_TYPE_MAX fixes that.

I see. Could you do this instead then:
-static const struct bpf_reg_types *compatible_reg_types[] = {
+static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
        [ARG_PTR_TO_MAP_KEY]            = &map_key_value_types,
        [ARG_PTR_TO_MAP_VALUE]          = &map_key_value_types,
        [ARG_PTR_TO_UNINIT_MAP_VALUE]   = &map_key_value_types,
@@ -4025,7 +4025,6 @@ static const struct bpf_reg_types *compatible_reg_types[] = {
        [ARG_PTR_TO_ALLOC_MEM_OR_NULL]  = &alloc_mem_types,
        [ARG_PTR_TO_INT]                = &int_ptr_types,
        [ARG_PTR_TO_LONG]               = &int_ptr_types,
-       [__BPF_ARG_TYPE_MAX]            = NULL,
 };

That way is more obvious.
That =NULL initializer just for the last element and not for
ARG_ANYTHING/DONTCARE bothered me enough to start this whole discussion.
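
A reduced sketch of the explicit-size variant suggested above, using a hypothetical enum and a hypothetical lookup helper rather than the verifier's real table: sizing the array with the enum's MAX value gives every enumerator below MAX a slot, and any slot without a designator is zero-initialized.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enum; ARG_NEW stands for a type added without a table entry. */
enum arg_type { ARG_A, ARG_B, ARG_C, ARG_NEW, ARG_MAX };

/* Explicit size: one slot for each of ARG_A..ARG_NEW, no sentinel entry. */
static const char *const table[ARG_MAX] = {
	[ARG_B] = "b",
	[ARG_C] = "c",
};

_Static_assert(ARRAY_SIZE(table) == ARG_MAX,
	       "the table size follows the enum, not the initializer");

/* Returns the entry for t, or NULL if none was registered. t must be < ARG_MAX. */
static const char *lookup(enum arg_type t)
{
	return table[t];
}
```

Compared with the sentinel entry, this array has ARG_MAX elements rather than ARG_MAX + 1, and its size is visible in the declaration instead of being implied by the last initializer.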