[RFC,bpf-next,v3,00/16] sleepable bpf_timer (was: allow HID-BPF to do device IOs)

Message ID 20240221-hid-bpf-sleepable-v3-0-1fb378ca6301@kernel.org

Benjamin Tissoires Feb. 21, 2024, 4:25 p.m. UTC
[Partly an RFC/formal submission: there are still FIXMEs in the code]
[Also using bpf-next as the base tree for HID changes as there will
be conflicting changes otherwise, so I'm personally fine with the HID
commits going through bpf-next]

IMO, patches 1-3 and 9-14 are ready to go; the rest is still pending review.

For reference, the use cases I have in mind:

---

Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
   Sometimes we receive an out of proximity event, but the device can not
   be trusted enough, and we need to ensure that we won't receive another
   one in the following n milliseconds. So we need to wait those n
   milliseconds, and eventually re-inject that event in the stack.

2. inject new events in reaction to one given event:
   We might want to transform one given event into several. This is the
   case for macro keys where a single key press is supposed to send
   a sequence of key presses. But this could also be used to patch a
   faulty behavior, if a device forgets to send a release event.

3. communicate with the device in reaction to one event:
   We might want to communicate back to the device after a given event.
   For example a device might send us an event saying that it came back
   from sleeping state and needs to be re-initialized.

Currently we can achieve that by keeping a userspace program around,
raising a bpf event, and letting that userspace program inject the events
and commands.
However, that program is kept alive as a daemon only to schedule
commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. A kernel workqueue seems simpler to handle.
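
To make use case 1 above concrete, here is a minimal userspace model of
the deferral logic. It is only an illustration and not part of the
series: struct pending_event, defer_event() and poll_deferred() are
hypothetical names, and time is a plain millisecond counter supplied by
the caller rather than anything coming from bpf_timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one deferred out-of-proximity event. */
struct pending_event {
	bool armed;           /* an event is waiting to be re-injected */
	uint64_t deadline_ms; /* earliest time it may be re-injected */
};

/*
 * Called for every out-of-proximity event. A repeated event inside the
 * window simply pushes the deadline further out.
 */
static void defer_event(struct pending_event *p, uint64_t now_ms,
			uint64_t delay_ms)
{
	assert(p);
	p->armed = true;
	p->deadline_ms = now_ms + delay_ms;
}

/*
 * Called when the (modelled) timer fires. Returns true when the event
 * should be re-injected: it was armed and the quiet period has elapsed.
 */
static bool poll_deferred(struct pending_event *p, uint64_t now_ms)
{
	assert(p);
	if (!p->armed || now_ms < p->deadline_ms)
		return false;
	p->armed = false;
	return true;
}
```

In this model the re-injection itself is left to the caller; the point
is only that the decision needs a delay, which is what a deferred
callback would provide.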

The other part I'm not sure about is whether BPF maps of type
queue/stack can be used in a sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee that I'm doing things properly :)

Cheers,
Benjamin

To: Alexei Starovoitov <ast@kernel.org>
To: Daniel Borkmann <daniel@iogearbox.net>
To: John Fastabend <john.fastabend@gmail.com>
To: Andrii Nakryiko <andrii@kernel.org>
To: Martin KaFai Lau <martin.lau@linux.dev>
To: Eduard Zingerman <eddyz87@gmail.com>
To: Song Liu <song@kernel.org>
To: Yonghong Song <yonghong.song@linux.dev>
To: KP Singh <kpsingh@kernel.org>
To: Stanislav Fomichev <sdf@google.com>
To: Hao Luo <haoluo@google.com>
To: Jiri Olsa <jolsa@kernel.org>
To: Jiri Kosina <jikos@kernel.org>
To: Benjamin Tissoires <benjamin.tissoires@redhat.com>
To: Jonathan Corbet <corbet@lwn.net>
To: Shuah Khan <shuah@kernel.org>
Cc:  <bpf@vger.kernel.org>
Cc:  <linux-kernel@vger.kernel.org>
Cc:  <linux-input@vger.kernel.org>
Cc:  <linux-doc@vger.kernel.org>
Cc:  <linux-kselftest@vger.kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>

---
Changes in v3:
- fixed the crash from v2
- changed the API to have only BPF_F_TIMER_SLEEPABLE for
  bpf_timer_start()
- split the new kfuncs/verifier patch into several sub-patches, for
  easier reviews
- Link to v2: https://lore.kernel.org/r/20240214-hid-bpf-sleepable-v2-0-5756b054724d@kernel.org

Changes in v2:
- make use of bpf_timer (and dropped the custom HID handling)
- implemented bpf_timer_set_sleepable_cb as a kfunc
- still not implemented global subprogs
- no sleepable bpf_timer selftests yet
- Link to v1: https://lore.kernel.org/r/20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kernel.org

---
Benjamin Tissoires (16):
      bpf/verifier: allow more maps in sleepable bpf programs
      bpf/verifier: introduce in_sleepable() helper
      bpf/verifier: add is_async_callback_calling_insn() helper
      bpf/helpers: introduce sleepable bpf_timers
      bpf/verifier: add bpf_timer as a kfunc capable type
      bpf/helpers: introduce bpf_timer_set_sleepable_cb() kfunc
      bpf/helpers: mark the callback of bpf_timer_set_sleepable_cb() as sleepable
      bpf/verifier: do_misc_fixups for is_bpf_timer_set_sleepable_cb_kfunc
      HID: bpf/dispatch: regroup kfuncs definitions
      HID: bpf: export hid_hw_output_report as a BPF kfunc
      selftests/hid: Add test for hid_bpf_hw_output_report
      HID: bpf: allow to inject HID event from BPF
      selftests/hid: add tests for hid_bpf_input_report
      HID: bpf: allow to use bpf_timer_set_sleepable_cb() in tracing callbacks.
      selftests/hid: add test for bpf_timer
      selftests/hid: add KASAN to the VM tests

 Documentation/hid/hid-bpf.rst                      |   2 +-
 drivers/hid/bpf/hid_bpf_dispatch.c                 | 232 ++++++++++++++-------
 drivers/hid/hid-core.c                             |   2 +
 include/linux/bpf_verifier.h                       |   2 +
 include/linux/hid_bpf.h                            |   3 +
 include/uapi/linux/bpf.h                           |   4 +
 kernel/bpf/helpers.c                               | 140 +++++++++++--
 kernel/bpf/verifier.c                              | 114 ++++++++--
 tools/testing/selftests/hid/config.common          |   1 +
 tools/testing/selftests/hid/hid_bpf.c              | 195 ++++++++++++++++-
 tools/testing/selftests/hid/progs/hid.c            | 198 ++++++++++++++++++
 .../testing/selftests/hid/progs/hid_bpf_helpers.h  |   8 +
 12 files changed, 795 insertions(+), 106 deletions(-)
---
base-commit: 5c331823b3fc52ffd27524bf5b7e0d137114f470
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4

Best regards,

Comments

Eduard Zingerman Feb. 22, 2024, 8:17 p.m. UTC | #1
On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:

[...]

> diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
> index e630caf644e8..52abb27426f4 100644
> --- a/drivers/hid/bpf/hid_bpf_dispatch.c
> +++ b/drivers/hid/bpf/hid_bpf_dispatch.c
> @@ -143,48 +143,6 @@ u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, u8 *rdesc, unsigned int *s
>  }
>  EXPORT_SYMBOL_GPL(call_hid_bpf_rdesc_fixup);
>  
> -/* Disables missing prototype warnings */
> -__bpf_kfunc_start_defs();

Note:
this patch does not apply on top of current bpf-next [0] because
__bpf_kfunc_start_defs and __bpf_kfunc are not present in [0].

[0] commit 58fd62e0aa50 ("bpf: Clarify batch lookup/lookup_and_delete semantics")

> -
> -/**
> - * hid_bpf_get_data - Get the kernel memory pointer associated with the context @ctx
> - *
> - * @ctx: The HID-BPF context
> - * @offset: The offset within the memory
> - * @rdwr_buf_size: the const size of the buffer
> - *
> - * @returns %NULL on error, an %__u8 memory pointer on success
> - */
> -__bpf_kfunc __u8 *
> -hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr_buf_size)
> -{
> -	struct hid_bpf_ctx_kern *ctx_kern;
> -
> -	if (!ctx)
> -		return NULL;

[...]
Eduard Zingerman Feb. 23, 2024, 12:22 a.m. UTC | #2
On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f81c799b2c80..2b11687063ff 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5444,6 +5444,26 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
>  					return -EACCES;
>  				}
>  				break;
> +			case BPF_TIMER:
> +				/* FIXME: kptr does the above, should we use the same? */

I don't think so.
Basically this allows double word reads / writes from timer address,
which probably should not be allowed.

ACCESS_DIRECT is passed to check_map_access() from
check_mem_access(), and I don't see any place where a check_mem_access()
call would be triggered for a pointer parameter of a kfunc
(unless it is accompanied by a size parameter).

I tried the following simple program and it verifies fine:

    struct elem {
    	struct bpf_timer t;
    };

    struct {
    	__uint(type, BPF_MAP_TYPE_ARRAY);
    	__uint(max_entries, 2);
    	__type(key, int);
    	__type(value, struct elem);
    } array SEC(".maps");

    int bpf_timer_set_sleepable_cb
      (struct bpf_timer *timer,
       int (callback_fn)(void *map, int *key, struct bpf_timer *timer))
      __ksym __weak;

    static int cb_sleepable(void *map, int *key, struct bpf_timer *timer)
    {
    	return 0;
    }

    SEC("fentry/bpf_fentry_test5")
    int BPF_PROG2(test_sleepable, int, a)
    {
    	struct bpf_timer *arr_timer;
    	int array_key = 1;

    	arr_timer = bpf_map_lookup_elem(&array, &array_key);
    	if (!arr_timer)
    		return 0;
    	bpf_timer_init(arr_timer, &array, CLOCK_MONOTONIC);

    	bpf_timer_set_sleepable_cb(arr_timer, cb_sleepable);
    	bpf_timer_start(arr_timer, 0, 0);

    	return 0;
    }

(in general, it would be easier to review if there were some test
 cases to play with).

> +				if (src != ACCESS_DIRECT) {
> +					verbose(env, "bpf_timer cannot be accessed indirectly by helper\n");
> +					return -EACCES;
> +				}
> +				if (!tnum_is_const(reg->var_off)) {
> +					verbose(env, "bpf_timer access cannot have variable offset\n");
> +					return -EACCES;
> +				}
> +				if (p != off + reg->var_off.value) {
> +					verbose(env, "bpf_timer access misaligned expected=%u off=%llu\n",
> +						p, off + reg->var_off.value);
> +					return -EACCES;
> +				}
> +				if (size != bpf_size_to_bytes(BPF_DW)) {
> +					verbose(env, "bpf_timer access size must be BPF_DW\n");
> +					return -EACCES;
> +				}
> +				break;
>  			default:
>  				verbose(env, "%s cannot be accessed directly by load/store\n",
>  					btf_field_type_name(field->type));

[...]
Eduard Zingerman Feb. 23, 2024, 12:26 a.m. UTC | #3
On Fri, 2024-02-23 at 02:22 +0200, Eduard Zingerman wrote:
[...]

> > +			case BPF_TIMER:
> > +				/* FIXME: kptr does the above, should we use the same? */

[...]

> I tried the following simple program and it verifies fine: 

Sorry, I meant that I tried it with the above check removed.
Eduard Zingerman Feb. 23, 2024, 2:54 p.m. UTC | #4
On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:
[...]

> @@ -11973,6 +12006,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>  			if (ret)
>  				return ret;
>  			break;
> +		case KF_ARG_PTR_TO_TIMER:
> +			/* FIXME: should we do anything here? */
> +			break;

I think that here it is necessary to enforce that R1
is PTR_TO_MAP_VALUE and that it points to the timer field of the map value.

As is, the following program leads to in-kernel page fault when
printing verifier log:

--- 8< ----------------------------

struct elem {
	struct bpf_timer t;
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 2);
	__type(key, int);
	__type(value, struct elem);
} array SEC(".maps");

int bpf_timer_set_sleepable_cb
  (struct bpf_timer *timer,
   int (callback_fn)(void *map, int *key, struct bpf_timer *timer))
  __ksym __weak;

static int cb_sleepable(void *map, int *key, struct bpf_timer *timer)
{
	return 0;
}

SEC("fentry/bpf_fentry_test5")
int BPF_PROG2(test_sleepable, int, a)
{
	struct bpf_timer *arr_timer;
	int array_key = 1;

	arr_timer = bpf_map_lookup_elem(&array, &array_key);
	if (!arr_timer)
		return 0;
	bpf_timer_init(arr_timer, &array, CLOCK_MONOTONIC);
	bpf_timer_set_sleepable_cb((void *)&arr_timer, // note incorrect pointer type!
				   cb_sleepable);
	bpf_timer_start(arr_timer, 0, 0);
	return 0;
}

---------------------------- >8 ---

I get the page fault when doing:

    $ ./veristat -l7 -vvv -f test_sleepable timer.bpf.o

[   21.014886] BUG: kernel NULL pointer dereference, address: 0000000000000060
...
[   21.015780] RIP: 0010:print_reg_state (kernel/bpf/log.c:715)

And here is a relevant fragment of print_reg_state():

713	if (type_is_map_ptr(t)) {
714		if (reg->map_ptr->name[0])
715			verbose_a("map=%s", reg->map_ptr->name);
716		verbose_a("ks=%d,vs=%d",
717			  reg->map_ptr->key_size,
718			  reg->map_ptr->value_size);
719	}

The error is caused by reg->map_ptr being NULL.
The code in check_kfunc_args() allows anything in R1, including registers
whose type is not a pointer to a map and for which reg->map_ptr is NULL.
Later, check_kfunc_call() calls push_callback_call():

12152		err = push_callback_call(env, insn, insn_idx, meta.subprogno,
12153					 set_timer_callback_state);

That in turn calls set_timer_callback_state(), which sets a bogus state for R{1,2,3}:

9683 static int set_timer_callback_state(...)
9684 {
9685	struct bpf_map *map_ptr = caller->regs[BPF_REG_1].map_ptr;
9687
9688	/* bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);
9689	 * callback_fn(struct bpf_map *map, void *key, void *value);
9690	 */
9691	callee->regs[BPF_REG_1].type = CONST_PTR_TO_MAP;
9692	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
9693	callee->regs[BPF_REG_1].map_ptr = map_ptr;
                                         ^^^^^^^^^
                                         This is NULL!
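
The check suggested at the top of this message can be sketched as a
standalone userspace model. Everything below is a simplification made
up for illustration: enum model_reg_type, struct model_map,
struct model_reg and check_timer_arg() are hypothetical and only mirror
the idea (reject anything that is not a map value pointer with a
non-NULL map and a constant offset equal to the timer field's offset).
The real verifier types and helpers differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of register types. */
enum model_reg_type { MODEL_PTR_TO_STACK, MODEL_PTR_TO_MAP_VALUE };

/* Hypothetical map description: does the value hold a timer, and where? */
struct model_map {
	bool has_timer;
	uint32_t timer_off;
};

/* Hypothetical register state, loosely modelled on the verifier's. */
struct model_reg {
	enum model_reg_type type;
	const struct model_map *map_ptr; /* NULL unless a map value pointer */
	bool off_is_const;               /* offset known at verification time */
	uint32_t off;
};

/* Returns 0 if the register may be used as a timer argument. */
static int check_timer_arg(const struct model_reg *reg)
{
	assert(reg);
	/* Must be a map value pointer, so map_ptr is safe to dereference. */
	if (reg->type != MODEL_PTR_TO_MAP_VALUE || !reg->map_ptr)
		return -EINVAL;
	/* A variable offset cannot be proven to hit the timer field. */
	if (!reg->off_is_const)
		return -EINVAL;
	/* The pointer must land exactly on the map value's timer field. */
	if (!reg->map_ptr->has_timer || reg->off != reg->map_ptr->timer_off)
		return -EINVAL;
	return 0;
}
```

With such a check in place, the stack pointer passed by the program
above would be rejected before set_timer_callback_state() ever reads
map_ptr.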
Eduard Zingerman Feb. 23, 2024, 3:35 p.m. UTC | #5
On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:

[...]

> @@ -626,6 +627,7 @@ struct bpf_subprog_info {
>  	bool is_async_cb: 1;
>  	bool is_exception_cb: 1;
>  	bool args_cached: 1;
> +	bool is_sleepable: 1;
>  
>  	u8 arg_cnt;
>  	struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];

[...]

> @@ -2421,6 +2424,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
>  	 * Initialize it similar to do_check_common().
>  	 */
>  	elem->st.branches = 1;
> +	elem->st.in_sleepable = env->subprog_info[subprog].is_sleepable;
>  	frame = kzalloc(sizeof(*frame), GFP_KERNEL);
>  	if (!frame)
>  		goto err;

[...]

> @@ -9478,6 +9483,7 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
>  
>  		/* there is no real recursion here. timer callbacks are async */
>  		env->subprog_info[subprog].is_async_cb = true;
> +		env->subprog_info[subprog].is_sleepable = is_bpf_timer_set_sleepable_cb_kfunc(insn->imm);
>  		async_cb = push_async_cb(env, env->subprog_info[subprog].start,
>  					 insn_idx, subprog);

I'd make is_sleepable a parameter of push_async_cb() instead of a field
in struct bpf_subprog_info.
I had to spend some time convincing myself that bpf_subprog_info->is_sleepable
does not have to be computed before do_check() in check_cfg(),
and working out what would happen if the same callback is passed as both a
sleepable and a non-sleepable callback. These questions won't arise if this
is a parameter.
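
The second question can be made concrete with a small userspace model.
It is not verifier code: struct model_subprog, struct model_state,
register_cb_via_field(), push_cb_via_field() and push_cb_via_param()
are hypothetical stand-ins. As far as the quoted hunks show, the field
is written right before it is read, so each pushed state gets the
intended value; what the model highlights is that the per-subprog field
afterwards only remembers the last registration, whereas a per-call
parameter leaves no such state behind.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-subprog info carrying the flag as a field. */
struct model_subprog {
	bool is_sleepable;
};

/* Hypothetical verifier state pushed for an async callback. */
struct model_state {
	bool in_sleepable;
};

/* Field variant: each registration overwrites the per-subprog flag. */
static void register_cb_via_field(struct model_subprog *sp, bool sleepable)
{
	assert(sp);
	sp->is_sleepable = sleepable;
}

/* Field variant: the pushed state is derived from the subprog flag. */
static struct model_state push_cb_via_field(const struct model_subprog *sp)
{
	struct model_state st = { .in_sleepable = sp->is_sleepable };

	return st;
}

/* Parameter variant: the flag travels with the call, nothing is stored. */
static struct model_state push_cb_via_param(bool sleepable)
{
	struct model_state st = { .in_sleepable = sleepable };

	return st;
}
```

Registering the same model_subprog first as sleepable and then as
non-sleepable leaves is_sleepable false, even though one of the two
pushed states was sleepable.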

[...]
Eduard Zingerman Feb. 23, 2024, 4:19 p.m. UTC | #6
On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:
> [Partly an RFC/formal submission: there are still FIXMEs in the code]
> [Also using bpf-next as the base tree for HID changes as there will
> be conflicting changes otherwise, so I'm personally fine with the HID
> commits going through bpf-next]

[...]

Could you please also add verifier selftests, e.g. extend
tools/testing/selftests/bpf/progs/timer.c       (bpf side)
tools/testing/selftests/bpf/prog_tests/timer.c  (userspace side triggering
                                                 bpf side)
Negative tests could be added in
tools/testing/selftests/bpf/progs/timer_failure.c

Please let me know if you need any help setting up a local BPF test
environment; I have a short writeup on how to set it up in a chroot.
Benjamin Tissoires Feb. 23, 2024, 7:42 p.m. UTC | #7
Hi,

On Feb 23 2024, Eduard Zingerman wrote:
> On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:
> > [Partly an RFC/formal submission: there are still FIXMEs in the code]
> > [Also using bpf-next as the base tree for HID changes as there will
> > be conflicting changes otherwise, so I'm personally fine with the HID
> > commits going through bpf-next]
> 
> [...]
> 
> Could you please also add verifier selftests, e.g. extend
> tools/testing/selftests/bpf/progs/timer.c       (bpf side)
> tools/testing/selftests/bpf/prog_tests/timer.c  (userspace side triggering
>                                                  bpf side)
> Negative tests could be added in
> tools/testing/selftests/bpf/progs/timer_failure.c
> 
> Please let me know if you need any help setting up a local BPF test
> environment; I have a short writeup on how to set it up in a chroot.

Thanks a lot for your review (and Alexei's). I was actually off today
and will be off next Monday too, but I'll work on those tests next week.

Cheers,
Benjamin
Benjamin Tissoires Feb. 23, 2024, 7:44 p.m. UTC | #8
On Feb 22 2024, Eduard Zingerman wrote:
> On Wed, 2024-02-21 at 17:25 +0100, Benjamin Tissoires wrote:
> 
> [...]
> 
> > diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
> > index e630caf644e8..52abb27426f4 100644
> > --- a/drivers/hid/bpf/hid_bpf_dispatch.c
> > +++ b/drivers/hid/bpf/hid_bpf_dispatch.c
> > @@ -143,48 +143,6 @@ u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, u8 *rdesc, unsigned int *s
> >  }
> >  EXPORT_SYMBOL_GPL(call_hid_bpf_rdesc_fixup);
> >  
> > -/* Disables missing prototype warnings */
> > -__bpf_kfunc_start_defs();
> 
> Note:
> this patch does not apply on top of current bpf-next [0] because
> __bpf_kfunc_start_defs and __bpf_kfunc are not present in [0].
> 
> [0] commit 58fd62e0aa50 ("bpf: Clarify batch lookup/lookup_and_delete semantics")

Right... this was in Linus' tree as a late 6.8-rcx addition. Depending
on how bpf-next will be rebased/merged, I'll see if I merge this
subseries through the HID tree or the BPF one.

Cheers,
Benjamin

> 
> > -
> > -/**
> > - * hid_bpf_get_data - Get the kernel memory pointer associated with the context @ctx
> > - *
> > - * @ctx: The HID-BPF context
> > - * @offset: The offset within the memory
> > - * @rdwr_buf_size: the const size of the buffer
> > - *
> > - * @returns %NULL on error, an %__u8 memory pointer on success
> > - */
> > -__bpf_kfunc __u8 *
> > -hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr_buf_size)
> > -{
> > -	struct hid_bpf_ctx_kern *ctx_kern;
> > -
> > -	if (!ctx)
> > -		return NULL;
> 
> [...]