[RFC,bpf-next,0/9] allow HID-BPF to do device IOs

Message ID	20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kernel.org
Headers	show Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8568381AB; Fri, 9 Feb 2024 13:26:59 +0000 (UTC) From: Benjamin Tissoires <bentiss@kernel.org> Subject: [PATCH RFC bpf-next 0/9] allow HID-BPF to do device IOs Date: Fri, 09 Feb 2024 14:26:36 +0100 Message-Id: <20240209-hid-bpf-sleepable-v1-0-4cc895b5adbd@kernel.org> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit To: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, John Fastabend <john.fastabend@gmail.com>, Andrii Nakryiko <andrii@kernel.org>, Martin KaFai Lau <martin.lau@linux.dev>, Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>, Yonghong Song <yonghong.song@linux.dev>, KP Singh <kpsingh@kernel.org>, Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, Jiri Kosina <jikos@kernel.org>, Benjamin Tissoires <benjamin.tissoires@redhat.com>, Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org> Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org, linux-input@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Benjamin Tissoires <bentiss@kernel.org>
Series	allow HID-BPF to do device IOs \| expand [RFC,bpf-next,0/9] allow HID-BPF to do device IOs [RFC,bpf-next,1/9] bpf: allow more maps in sleepable bpf programs [RFC,bpf-next,2/9] HID: bpf/dispatch: regroup kfuncs definitions [RFC,bpf-next,3/9] HID: bpf: export hid_hw_output_report as a BPF kfunc [RFC,bpf-next,4/9] selftests/hid: Add test for hid_bpf_hw_output_report [RFC,bpf-next,5/9] HID: bpf: allow to inject HID event from BPF [RFC,bpf-next,6/9] selftests/hid: add tests for hid_bpf_input_report [RFC,bpf-next,7/9] HID: bpf: allow to defer work in a delayed workqueue [RFC,bpf-next,8/9] selftests/hid: add test for hid_bpf_schedule_delayed_work [RFC,bpf-next,9/9] selftests/hid: add another set of delayed work tests

Benjamin Tissoires Feb. 9, 2024, 1:26 p.m. UTC

[Putting this as a RFC because I'm pretty sure I'm not doing the things
correctly at the BPF level.]
[Also using bpf-next as the base tree as there will be conflicting
changes otherwise]

Ideally I'd like to have something similar to bpf_timers, but not
in soft IRQ context. So I'm emulating this with a sleepable
bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
workqueue").

Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
   Sometimes we receive an out of proximity event, but the device can not
   be trusted enough, and we need to ensure that we won't receive another
   one in the following n milliseconds. So we need to wait those n
   milliseconds, and eventually re-inject that event in the stack.

2. inject new events in reaction to one given event:
   We might want to transform one given event into several. This is the
   case for macro keys where a single key press is supposed to send
   a sequence of key presses. But this could also be used to patch a
   faulty behavior, if a device forgets to send a release event.

3. communicate with the device in reaction to one event:
   We might want to communicate back to the device after a given event.
   For example a device might send us an event saying that it came back
   from sleeping state and needs to be re-initialized.

Currently we can achieve that by keeping a userspace program around,
raise a bpf event, and let that userspace program inject the events and
commands.
However, we are just keeping that program alive as a daemon for just
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup. So a kernel workqueue seems simpler to handle.

Besides the "this is insane to reimplement the wheel", the other part I'm
not sure is whether we can say that BPF maps of type prog_array and of
type queue/stack can be used in sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee I'm doing the things properly :)

Cheers,
Benjamin

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
---
Benjamin Tissoires (9):
      bpf: allow more maps in sleepable bpf programs
      HID: bpf/dispatch: regroup kfuncs definitions
      HID: bpf: export hid_hw_output_report as a BPF kfunc
      selftests/hid: Add test for hid_bpf_hw_output_report
      HID: bpf: allow to inject HID event from BPF
      selftests/hid: add tests for hid_bpf_input_report
      HID: bpf: allow to defer work in a delayed workqueue
      selftests/hid: add test for hid_bpf_schedule_delayed_work
      selftests/hid: add another set of delayed work tests

 Documentation/hid/hid-bpf.rst                      |  26 +-
 drivers/hid/bpf/entrypoints/entrypoints.bpf.c      |   8 +
 drivers/hid/bpf/entrypoints/entrypoints.lskel.h    | 236 +++++++++------
 drivers/hid/bpf/hid_bpf_dispatch.c                 | 315 ++++++++++++++++-----
 drivers/hid/bpf/hid_bpf_jmp_table.c                |  34 ++-
 drivers/hid/hid-core.c                             |   2 +
 include/linux/hid_bpf.h                            |  10 +
 kernel/bpf/verifier.c                              |   3 +
 tools/testing/selftests/hid/hid_bpf.c              | 286 ++++++++++++++++++-
 tools/testing/selftests/hid/progs/hid.c            | 205 ++++++++++++++
 .../testing/selftests/hid/progs/hid_bpf_helpers.h  |   8 +
 11 files changed, 962 insertions(+), 171 deletions(-)
---
base-commit: 4f7a05917237b006ceae760507b3d15305769ade
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4

Best regards,

Benjamin Tissoires Feb. 9, 2024, 4:26 p.m. UTC | #1

On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Benjamin Tissoires <bentiss@kernel.org> writes:
>
> > [Putting this as a RFC because I'm pretty sure I'm not doing the things
> > correctly at the BPF level.]
> > [Also using bpf-next as the base tree as there will be conflicting
> > changes otherwise]
> >
> > Ideally I'd like to have something similar to bpf_timers, but not
> > in soft IRQ context. So I'm emulating this with a sleepable
> > bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
> > workqueue").
>
> Why implement a new mechanism? Sounds like what you need is essentially
> the bpf_timer functionality, just running in a different context, right?

Heh, that's exactly why I put in a RFC :)

So yes, the bpf_timer approach is cleaner, but I need it in a
workqueue, as a hrtimer in a softIRQ would prevent me to kzalloc and
wait for the device.

> So why not just add a flag to the timer setup that controls the callback
> context? I've been toying with something similar for restarting XDP TX
> for my queueing patch series (though I'm not sure if this will actually
> end up being needed in the end):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
>

Oh, nice. Good idea. But would it be OK to have a "timer-like" where
it actually defers the job in a workqueue instead of using an hrtimer?

I thought I would have to rewrite the entire bpf_timer approach
without the softIRQ, but if I can just add a new flag, that will make
things way simpler for me.

This however raises another issue if I were to use the bpf_timers: now
the HID-BPF kfuncs will not be available as they are only available to
tracing prog types. And when I tried to call them from a bpf_timer (in
softIRQ) they were not available.

Cheers,
Benjamin

Benjamin Tissoires Feb. 12, 2024, 4:47 p.m. UTC | #2

On Fri, Feb 9, 2024 at 6:05 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
>
> > On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Benjamin Tissoires <bentiss@kernel.org> writes:
> >>
> >> > [Putting this as a RFC because I'm pretty sure I'm not doing the things
> >> > correctly at the BPF level.]
> >> > [Also using bpf-next as the base tree as there will be conflicting
> >> > changes otherwise]
> >> >
> >> > Ideally I'd like to have something similar to bpf_timers, but not
> >> > in soft IRQ context. So I'm emulating this with a sleepable
> >> > bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
> >> > workqueue").
> >>
> >> Why implement a new mechanism? Sounds like what you need is essentially
> >> the bpf_timer functionality, just running in a different context, right?
> >
> > Heh, that's exactly why I put in a RFC :)
> >
> > So yes, the bpf_timer approach is cleaner, but I need it in a
> > workqueue, as a hrtimer in a softIRQ would prevent me to kzalloc and
> > wait for the device.
>
> Right, makes sense.
>
> >> So why not just add a flag to the timer setup that controls the callback
> >> context? I've been toying with something similar for restarting XDP TX
> >> for my queueing patch series (though I'm not sure if this will actually
> >> end up being needed in the end):
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
> >>
> >
> > Oh, nice. Good idea. But would it be OK to have a "timer-like" where
> > it actually defers the job in a workqueue instead of using an hrtimer?
>
> That's conceptually still a timer, though, isn't it? I.e., it's a
> mechanism whereby you specify a callback and a delay, and bpf_timer
> ensures that your callback is called after that delay. IMO it's totally
> congruent with that API to be able to specify a different execution
> context as part of the timer setup.

Yep :)

There is still a problem I wasn't able to fix over the week end and
today. How can I tell the verifier that the callback is sleepable,
when the tracing function that called the timer_start() function is
not?
(more on that below).

>
> As for how to implement it, I suspect the easiest may be something
> similar to what the patch I linked above does: keep the hrtimer, and
> just have a different (kernel) callback function when the timer fires
> which does an immediate schedule_work() (without the _delayed) and then
> runs the BPF callback in that workqueue. I.e., keep the delay handling
> the way the existing bpf_timer implementation does it, and just add an
> indirection to start the workqueue in the kernel dispatch code.

Sounds good, especially given that's roughly how the delayed_timers
are implemented.

>
> > I thought I would have to rewrite the entire bpf_timer approach
> > without the softIRQ, but if I can just add a new flag, that will make
> > things way simpler for me.
>
> IMO that would be fine. You may want to wait for the maintainers to
> chime in before going down this route, though :)
>
> > This however raises another issue if I were to use the bpf_timers: now
> > the HID-BPF kfuncs will not be available as they are only available to
> > tracing prog types. And when I tried to call them from a bpf_timer (in
> > softIRQ) they were not available.
>
> IIUC, the bpf_timer callback is just a function (subprog) from the
> verifier PoV, so it is verified as whatever program type is creating the
> timer. So in other words, as long as you setup the timer from inside a
> tracing prog type, you should have access to all the same kfuncs, I
> think?

Yep, you are correct. But as mentioned above, I am now in trouble with
the sleepable state:
- I need to call timer_start() from a non sleepable tracing function
(I'm in hard IRQ when dealing with a physical device)
- but then, ideally, the callback function needs to be tagged as a
sleepable one, so I can export my kfuncs which are doing kzalloc and
device IO as such.

However, I can not really teach the BPF verifier to do so:
- it seems to check for the callback first when it is loaded, and
there is no SEC() equivalent for static functions
- libbpf doesn't have access to the callback as a prog as it has to be
a static function, and thus isn't exported as a full-blown prog.
- the verifier only checks for the callback when dealing with
BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
(though the validation of the callback has already been done while
checking it first, so we are already too late to change the sleppable
state of the callback)

Right now, the only OK-ish version I have is declaring the kfunc as
non-sleepable, but checking that we are in a different context than
the IRQ of the initial event. This way, I am not crashing if this
function is called from the initial IRQ, but will still crash if used
outside of the hid context.

This is not satisfactory, but I feel like it's going to be hard to
teach the verifier that the callback function is sleepable in that
case (maybe we could suffix the callback name, like we do for
arguments, but this is not very clean either).

Cheers,
Benjamin

Toke Høiland-Jørgensen Feb. 12, 2024, 5:46 p.m. UTC | #3

Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:

[...]
>> IIUC, the bpf_timer callback is just a function (subprog) from the
>> verifier PoV, so it is verified as whatever program type is creating the
>> timer. So in other words, as long as you setup the timer from inside a
>> tracing prog type, you should have access to all the same kfuncs, I
>> think?
>
> Yep, you are correct. But as mentioned above, I am now in trouble with
> the sleepable state:
> - I need to call timer_start() from a non sleepable tracing function
> (I'm in hard IRQ when dealing with a physical device)
> - but then, ideally, the callback function needs to be tagged as a
> sleepable one, so I can export my kfuncs which are doing kzalloc and
> device IO as such.
>
> However, I can not really teach the BPF verifier to do so:
> - it seems to check for the callback first when it is loaded, and
> there is no SEC() equivalent for static functions
> - libbpf doesn't have access to the callback as a prog as it has to be
> a static function, and thus isn't exported as a full-blown prog.
> - the verifier only checks for the callback when dealing with
> BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> (though the validation of the callback has already been done while
> checking it first, so we are already too late to change the sleppable
> state of the callback)
>
> Right now, the only OK-ish version I have is declaring the kfunc as
> non-sleepable, but checking that we are in a different context than
> the IRQ of the initial event. This way, I am not crashing if this
> function is called from the initial IRQ, but will still crash if used
> outside of the hid context.
>
> This is not satisfactory, but I feel like it's going to be hard to
> teach the verifier that the callback function is sleepable in that
> case (maybe we could suffix the callback name, like we do for
> arguments, but this is not very clean either).

The callback is only set once when the timer is first setup; I *think*
it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
in the context you need (from a sleepable prog), but do the arming
(bpf_timer_start()) from a different program that is not itself sleepable?

-Toke

Benjamin Tissoires Feb. 12, 2024, 6:20 p.m. UTC | #4

On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
>
> [...]
> >> IIUC, the bpf_timer callback is just a function (subprog) from the
> >> verifier PoV, so it is verified as whatever program type is creating the
> >> timer. So in other words, as long as you setup the timer from inside a
> >> tracing prog type, you should have access to all the same kfuncs, I
> >> think?
> >
> > Yep, you are correct. But as mentioned above, I am now in trouble with
> > the sleepable state:
> > - I need to call timer_start() from a non sleepable tracing function
> > (I'm in hard IRQ when dealing with a physical device)
> > - but then, ideally, the callback function needs to be tagged as a
> > sleepable one, so I can export my kfuncs which are doing kzalloc and
> > device IO as such.
> >
> > However, I can not really teach the BPF verifier to do so:
> > - it seems to check for the callback first when it is loaded, and
> > there is no SEC() equivalent for static functions
> > - libbpf doesn't have access to the callback as a prog as it has to be
> > a static function, and thus isn't exported as a full-blown prog.
> > - the verifier only checks for the callback when dealing with
> > BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> > (though the validation of the callback has already been done while
> > checking it first, so we are already too late to change the sleppable
> > state of the callback)
> >
> > Right now, the only OK-ish version I have is declaring the kfunc as
> > non-sleepable, but checking that we are in a different context than
> > the IRQ of the initial event. This way, I am not crashing if this
> > function is called from the initial IRQ, but will still crash if used
> > outside of the hid context.
> >
> > This is not satisfactory, but I feel like it's going to be hard to
> > teach the verifier that the callback function is sleepable in that
> > case (maybe we could suffix the callback name, like we do for
> > arguments, but this is not very clean either).
>
> The callback is only set once when the timer is first setup; I *think*
> it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
> in the context you need (from a sleepable prog), but do the arming
> (bpf_timer_start()) from a different program that is not itself sleepable?
>

Genius! It works, and I can just keep having them declared as a
syscall kfunc, not as a tracing kfunc.

But isn't this an issue outside of my use case? I mean, if the
callback is assuming the environment for when it is set up but can be
called from any context there seems to be a problem when 2 contexts
are not equivalent, no?

Alexei Starovoitov Feb. 12, 2024, 9:24 p.m. UTC | #5

On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
>
> On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> >
> > [...]
> > >> IIUC, the bpf_timer callback is just a function (subprog) from the
> > >> verifier PoV, so it is verified as whatever program type is creating the
> > >> timer. So in other words, as long as you setup the timer from inside a
> > >> tracing prog type, you should have access to all the same kfuncs, I
> > >> think?
> > >
> > > Yep, you are correct. But as mentioned above, I am now in trouble with
> > > the sleepable state:
> > > - I need to call timer_start() from a non sleepable tracing function
> > > (I'm in hard IRQ when dealing with a physical device)
> > > - but then, ideally, the callback function needs to be tagged as a
> > > sleepable one, so I can export my kfuncs which are doing kzalloc and
> > > device IO as such.
> > >
> > > However, I can not really teach the BPF verifier to do so:
> > > - it seems to check for the callback first when it is loaded, and
> > > there is no SEC() equivalent for static functions
> > > - libbpf doesn't have access to the callback as a prog as it has to be
> > > a static function, and thus isn't exported as a full-blown prog.
> > > - the verifier only checks for the callback when dealing with
> > > BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> > > (though the validation of the callback has already been done while
> > > checking it first, so we are already too late to change the sleppable
> > > state of the callback)
> > >
> > > Right now, the only OK-ish version I have is declaring the kfunc as
> > > non-sleepable, but checking that we are in a different context than
> > > the IRQ of the initial event. This way, I am not crashing if this
> > > function is called from the initial IRQ, but will still crash if used
> > > outside of the hid context.
> > >
> > > This is not satisfactory, but I feel like it's going to be hard to
> > > teach the verifier that the callback function is sleepable in that
> > > case (maybe we could suffix the callback name, like we do for
> > > arguments, but this is not very clean either).
> >
> > The callback is only set once when the timer is first setup; I *think*
> > it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
> > in the context you need (from a sleepable prog), but do the arming
> > (bpf_timer_start()) from a different program that is not itself sleepable?
> >
>
> Genius! It works, and I can just keep having them declared as a
> syscall kfunc, not as a tracing kfunc.
>
> But isn't this an issue outside of my use case? I mean, if the
> callback is assuming the environment for when it is set up but can be
> called from any context there seems to be a problem when 2 contexts
> are not equivalent, no?

I agree that workqueue delegation fits into the bpf_timer concept and
a lot of code can and should be shared.
All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
Too bad, bpf_timer_set_callback() doesn't have a flag argument,
so we need a new kfunc to set a sleepable callback.
Maybe
bpf_timer_set_sleepable_cb() ?
The verifier will set is_async_cb = true for it (like it does for regular cb-s).
And since prog->aux->sleepable is kinda "global" we need another
per subprog flag:
bool is_sleepable: 1;

We can factor out a check "if (prog->aux->sleepable)" into a helper
that will check that "global" flag and another env->cur_state->in_sleepable
flag that will work similar to active_rcu_lock.
Once the verifier starts processing subprog->is_sleepable
it will set cur_state->in_sleepable = true;
to make all subprogs called from that cb to be recognized as sleepable too.

A bit of a challenge is what to do with global subprogs,
since they're verified lazily. They can be called from
sleepable and non-sleepable contex. Should be solvable.

Overall I think this feature is needed urgently,
so if you don't have cycles to work on this soon,
I can prioritize it right after bpf_arena work.

Benjamin Tissoires Feb. 13, 2024, 5:46 p.m. UTC | #6

On Feb 12 2024, Alexei Starovoitov wrote:
> On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> <benjamin.tissoires@redhat.com> wrote:
> >
> > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > >
> > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> > >
[...]
> I agree that workqueue delegation fits into the bpf_timer concept and
> a lot of code can and should be shared.

Thanks Alexei for the detailed answer. I've given it an attempt but still can not
figure it out entirely.

> All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> so we need a new kfunc to set a sleepable callback.
> Maybe
> bpf_timer_set_sleepable_cb() ?

OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?

> The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> And since prog->aux->sleepable is kinda "global" we need another
> per subprog flag:
> bool is_sleepable: 1;

done (in push_callback_call())

> 
> We can factor out a check "if (prog->aux->sleepable)" into a helper
> that will check that "global" flag and another env->cur_state->in_sleepable
> flag that will work similar to active_rcu_lock.

done (I think), cf patch 2 below

> Once the verifier starts processing subprog->is_sleepable
> it will set cur_state->in_sleepable = true;
> to make all subprogs called from that cb to be recognized as sleepable too.

That's the point I don't know where to put the new code.

It seems the best place would be in do_check(), but I am under the impression
that the code of the callback is added at the end of the instruction list, meaning
that I do not know where it starts, and which subprog index it corresponds to.

> 
> A bit of a challenge is what to do with global subprogs,
> since they're verified lazily. They can be called from
> sleepable and non-sleepable contex. Should be solvable.

I must confess this is way over me (and given that I didn't even managed to make
the "easy" case working, that might explain things a little :-P )

> 
> Overall I think this feature is needed urgently,
> so if you don't have cycles to work on this soon,
> I can prioritize it right after bpf_arena work.

I can try to spare a few cycles on it. Even if your instructions were on
spot, I still can't make the subprogs recognized as sleepable.

For reference, this is where I am (probably bogus, but seems to be
working when timer_set_sleepable_cb() is called from a sleepable context
as mentioned by Toke):

---

Kumar Kartikeya Dwivedi Feb. 13, 2024, 7:23 p.m. UTC | #7

On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
>
> On Feb 12 2024, Alexei Starovoitov wrote:
> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > <benjamin.tissoires@redhat.com> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > > >
> > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> > > >
> [...]
> > I agree that workqueue delegation fits into the bpf_timer concept and
> > a lot of code can and should be shared.
>
> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> figure it out entirely.
>
> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > so we need a new kfunc to set a sleepable callback.
> > Maybe
> > bpf_timer_set_sleepable_cb() ?
>
> OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
>
> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > And since prog->aux->sleepable is kinda "global" we need another
> > per subprog flag:
> > bool is_sleepable: 1;
>
> done (in push_callback_call())
>
> >
> > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > that will check that "global" flag and another env->cur_state->in_sleepable
> > flag that will work similar to active_rcu_lock.
>
> done (I think), cf patch 2 below
>
> > Once the verifier starts processing subprog->is_sleepable
> > it will set cur_state->in_sleepable = true;
> > to make all subprogs called from that cb to be recognized as sleepable too.
>
> That's the point I don't know where to put the new code.
>

I think that would go in the already existing special case for
push_async_cb where you get the verifier state of the async callback.
You can make setting the boolean in that verifier state conditional on
whether it's your kfunc/helper you're processing taking a sleepable
callback.

> It seems the best place would be in do_check(), but I am under the impression
> that the code of the callback is added at the end of the instruction list, meaning
> that I do not know where it starts, and which subprog index it corresponds to.
>
> >
> > A bit of a challenge is what to do with global subprogs,
> > since they're verified lazily. They can be called from
> > sleepable and non-sleepable contex. Should be solvable.
>
> I must confess this is way over me (and given that I didn't even managed to make
> the "easy" case working, that might explain things a little :-P )
>

I think it will be solvable but made somewhat difficult by the fact
that even if we mark subprog_info of some global_func A as
in_sleepable, so that we explore it as sleepable during its
verification, we might encounter later another global_func that calls
a global func, already explored as non-sleepable, in sleepable
context. In this case I think we need to redo the verification of that
global func as sleepable once again. It could be that it is called
from both non-sleepable and sleepable contexts, so both paths
(in_sleepable = true, and in_sleepable = false) need to be explored,
or we could reject such cases, but it might be a little restrictive.

Some common helper global func unrelated to caller context doing some
auxiliary work, called from sleepable timer callback and normal main
subprog might be an example where rejection will be prohibitive.

An approach might be to explore main and global subprogs once as we do
now, and then keep a list of global subprogs that need to be revisited
as in_sleepable (due to being called from a sleepable context) and
trigger do_check_common for them again, this might have to be repeated
as the list grows on each iteration, but eventually we will have
explored all of them as in_sleepable if need be, and the loop will
end. Surely, this trades off logical simplicity of verifier code with
redoing verification of global subprogs again.

To add items to such a list, for each global subprog we encounter that
needs to be analyzed as in_sleepable, we will also collect all its
callee global subprogs by walking its instructions (a bit like
check_max_stack_depth does).

> >
> > Overall I think this feature is needed urgently,
> > so if you don't have cycles to work on this soon,
> > I can prioritize it right after bpf_arena work.
>
> I can try to spare a few cycles on it. Even if your instructions were on
> spot, I still can't make the subprogs recognized as sleepable.
>
> For reference, this is where I am (probably bogus, but seems to be
> working when timer_set_sleepable_cb() is called from a sleepable context
> as mentioned by Toke):
>

I just skimmed the patch but I think it's already 90% there. The only
other change I would suggest is switching from helper to kfunc, as
originally proposed by Alexei.

> [...]

Toke Høiland-Jørgensen Feb. 13, 2024, 7:51 p.m. UTC | #8

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
>>
>> On Feb 12 2024, Alexei Starovoitov wrote:
>> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
>> > <benjamin.tissoires@redhat.com> wrote:
>> > >
>> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> > > >
>> > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
>> > > >
>> [...]
>> > I agree that workqueue delegation fits into the bpf_timer concept and
>> > a lot of code can and should be shared.
>>
>> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
>> figure it out entirely.
>>
>> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
>> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
>> > so we need a new kfunc to set a sleepable callback.
>> > Maybe
>> > bpf_timer_set_sleepable_cb() ?
>>
>> OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
>>
>> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
>> > And since prog->aux->sleepable is kinda "global" we need another
>> > per subprog flag:
>> > bool is_sleepable: 1;
>>
>> done (in push_callback_call())
>>
>> >
>> > We can factor out a check "if (prog->aux->sleepable)" into a helper
>> > that will check that "global" flag and another env->cur_state->in_sleepable
>> > flag that will work similar to active_rcu_lock.
>>
>> done (I think), cf patch 2 below
>>
>> > Once the verifier starts processing subprog->is_sleepable
>> > it will set cur_state->in_sleepable = true;
>> > to make all subprogs called from that cb to be recognized as sleepable too.
>>
>> That's the point I don't know where to put the new code.
>>
>
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.
>
>> It seems the best place would be in do_check(), but I am under the impression
>> that the code of the callback is added at the end of the instruction list, meaning
>> that I do not know where it starts, and which subprog index it corresponds to.
>>
>> >
>> > A bit of a challenge is what to do with global subprogs,
>> > since they're verified lazily. They can be called from
>> > sleepable and non-sleepable contex. Should be solvable.
>>
>> I must confess this is way over me (and given that I didn't even managed to make
>> the "easy" case working, that might explain things a little :-P )
>>
>
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
>
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
>
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
>
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).

Sorry if I'm being dense, but why is all this needed if it's already
possible to just define the timer callback from a program type that
allows sleeping, and then set the actual timeout from a different
program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
convenience then? Or did I misunderstand and it's not actually possible
to mix callback/timer arming from different program types?

-Toke

Alexei Starovoitov Feb. 13, 2024, 8:48 p.m. UTC | #9

On Tue, Feb 13, 2024 at 08:23:17PM +0100, Kumar Kartikeya Dwivedi wrote:
> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
> >
> > On Feb 12 2024, Alexei Starovoitov wrote:
> > > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > > <benjamin.tissoires@redhat.com> wrote:
> > > >
> > > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > > > >
> > > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> > > > >
> > [...]
> > > I agree that workqueue delegation fits into the bpf_timer concept and
> > > a lot of code can and should be shared.
> >
> > Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> > figure it out entirely.
> >
> > > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > > so we need a new kfunc to set a sleepable callback.
> > > Maybe
> > > bpf_timer_set_sleepable_cb() ?
> >
> > OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?

I think the flag for bpf_timer_init() is still needed.
Since we probably shouldn't combine hrtimer with workqueue.
bpf_timer_start() should do schedule_work() directly.
The latency matters in some cases.
We will use flags to specify which workqueue strategy to use.
Fingers crossed, WQ_BH will be merged in the next merge window and we'd need to expose it
as an option to bpf progs.

> >
> > > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > > And since prog->aux->sleepable is kinda "global" we need another
> > > per subprog flag:
> > > bool is_sleepable: 1;
> >
> > done (in push_callback_call())
> >
> > >
> > > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > > that will check that "global" flag and another env->cur_state->in_sleepable
> > > flag that will work similar to active_rcu_lock.
> >
> > done (I think), cf patch 2 below
> >
> > > Once the verifier starts processing subprog->is_sleepable
> > > it will set cur_state->in_sleepable = true;
> > > to make all subprogs called from that cb to be recognized as sleepable too.
> >
> > That's the point I don't know where to put the new code.
> >
> 
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.
> 
> > It seems the best place would be in do_check(), but I am under the impression
> > that the code of the callback is added at the end of the instruction list, meaning
> > that I do not know where it starts, and which subprog index it corresponds to.
> >
> > >
> > > A bit of a challenge is what to do with global subprogs,
> > > since they're verified lazily. They can be called from
> > > sleepable and non-sleepable contex. Should be solvable.
> >
> > I must confess this is way over me (and given that I didn't even managed to make
> > the "easy" case working, that might explain things a little :-P )
> >
> 
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
> 
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
> 
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
> 
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).

imo that would be a serious increase in the verifier complexity.
It's not clear whether this is needed at this point.
For now I would drop cur_state->in_sleepable to false before verifying global subprogs.

> > >
> > > Overall I think this feature is needed urgently,
> > > so if you don't have cycles to work on this soon,
> > > I can prioritize it right after bpf_arena work.
> >
> > I can try to spare a few cycles on it. Even if your instructions were on
> > spot, I still can't make the subprogs recognized as sleepable.
> >
> > For reference, this is where I am (probably bogus, but seems to be
> > working when timer_set_sleepable_cb() is called from a sleepable context
> > as mentioned by Toke):
> >
> 
> I just skimmed the patch but I think it's already 90% there. The only
> other change I would suggest is switching from helper to kfunc, as
> originally proposed by Alexei.

+1. I quickly looked through the patches and it does look like 90% there.

Alexei Starovoitov Feb. 13, 2024, 8:52 p.m. UTC | #10

On Tue, Feb 13, 2024 at 08:51:26PM +0100, Toke Høiland-Jørgensen wrote:
> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> 
> > On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
> >>
> >> On Feb 12 2024, Alexei Starovoitov wrote:
> >> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> >> > <benjamin.tissoires@redhat.com> wrote:
> >> > >
> >> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> > > >
> >> > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> >> > > >
> >> [...]
> >> > I agree that workqueue delegation fits into the bpf_timer concept and
> >> > a lot of code can and should be shared.
> >>
> >> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> >> figure it out entirely.
> >>
> >> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> >> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> >> > so we need a new kfunc to set a sleepable callback.
> >> > Maybe
> >> > bpf_timer_set_sleepable_cb() ?
> >>
> >> OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
> >>
> >> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> >> > And since prog->aux->sleepable is kinda "global" we need another
> >> > per subprog flag:
> >> > bool is_sleepable: 1;
> >>
> >> done (in push_callback_call())
> >>
> >> >
> >> > We can factor out a check "if (prog->aux->sleepable)" into a helper
> >> > that will check that "global" flag and another env->cur_state->in_sleepable
> >> > flag that will work similar to active_rcu_lock.
> >>
> >> done (I think), cf patch 2 below
> >>
> >> > Once the verifier starts processing subprog->is_sleepable
> >> > it will set cur_state->in_sleepable = true;
> >> > to make all subprogs called from that cb to be recognized as sleepable too.
> >>
> >> That's the point I don't know where to put the new code.
> >>
> >
> > I think that would go in the already existing special case for
> > push_async_cb where you get the verifier state of the async callback.
> > You can make setting the boolean in that verifier state conditional on
> > whether it's your kfunc/helper you're processing taking a sleepable
> > callback.
> >
> >> It seems the best place would be in do_check(), but I am under the impression
> >> that the code of the callback is added at the end of the instruction list, meaning
> >> that I do not know where it starts, and which subprog index it corresponds to.
> >>
> >> >
> >> > A bit of a challenge is what to do with global subprogs,
> >> > since they're verified lazily. They can be called from
> >> > sleepable and non-sleepable contex. Should be solvable.
> >>
> >> I must confess this is way over me (and given that I didn't even managed to make
> >> the "easy" case working, that might explain things a little :-P )
> >>
> >
> > I think it will be solvable but made somewhat difficult by the fact
> > that even if we mark subprog_info of some global_func A as
> > in_sleepable, so that we explore it as sleepable during its
> > verification, we might encounter later another global_func that calls
> > a global func, already explored as non-sleepable, in sleepable
> > context. In this case I think we need to redo the verification of that
> > global func as sleepable once again. It could be that it is called
> > from both non-sleepable and sleepable contexts, so both paths
> > (in_sleepable = true, and in_sleepable = false) need to be explored,
> > or we could reject such cases, but it might be a little restrictive.
> >
> > Some common helper global func unrelated to caller context doing some
> > auxiliary work, called from sleepable timer callback and normal main
> > subprog might be an example where rejection will be prohibitive.
> >
> > An approach might be to explore main and global subprogs once as we do
> > now, and then keep a list of global subprogs that need to be revisited
> > as in_sleepable (due to being called from a sleepable context) and
> > trigger do_check_common for them again, this might have to be repeated
> > as the list grows on each iteration, but eventually we will have
> > explored all of them as in_sleepable if need be, and the loop will
> > end. Surely, this trades off logical simplicity of verifier code with
> > redoing verification of global subprogs again.
> >
> > To add items to such a list, for each global subprog we encounter that
> > needs to be analyzed as in_sleepable, we will also collect all its
> > callee global subprogs by walking its instructions (a bit like
> > check_max_stack_depth does).
> 
> Sorry if I'm being dense, but why is all this needed if it's already
> possible to just define the timer callback from a program type that
> allows sleeping, and then set the actual timeout from a different
> program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
> convenience then? Or did I misunderstand and it's not actually possible
> to mix callback/timer arming from different program types?

More than just convience.
bpf_set_sleepable_cb() might need to be called from non-sleepable and
there could be no way to hack it around with fake sleepable entry.
bpf_timer_cancel() clears callback_fn.
So if prog wants to bpf_timer_start() and later bpf_timer_cancel()
it would need to bpf_set_sleepable_cb() every time before bpf_timer_start().
And at that time it might be in non-sleepable ctx.

Toke Høiland-Jørgensen Feb. 14, 2024, 12:56 p.m. UTC | #11

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Tue, Feb 13, 2024 at 08:51:26PM +0100, Toke Høiland-Jørgensen wrote:
>> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
>> 
>> > On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
>> >>
>> >> On Feb 12 2024, Alexei Starovoitov wrote:
>> >> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
>> >> > <benjamin.tissoires@redhat.com> wrote:
>> >> > >
>> >> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> > > >
>> >> > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
>> >> > > >
>> >> [...]
>> >> > I agree that workqueue delegation fits into the bpf_timer concept and
>> >> > a lot of code can and should be shared.
>> >>
>> >> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
>> >> figure it out entirely.
>> >>
>> >> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
>> >> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
>> >> > so we need a new kfunc to set a sleepable callback.
>> >> > Maybe
>> >> > bpf_timer_set_sleepable_cb() ?
>> >>
>> >> OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
>> >>
>> >> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
>> >> > And since prog->aux->sleepable is kinda "global" we need another
>> >> > per subprog flag:
>> >> > bool is_sleepable: 1;
>> >>
>> >> done (in push_callback_call())
>> >>
>> >> >
>> >> > We can factor out a check "if (prog->aux->sleepable)" into a helper
>> >> > that will check that "global" flag and another env->cur_state->in_sleepable
>> >> > flag that will work similar to active_rcu_lock.
>> >>
>> >> done (I think), cf patch 2 below
>> >>
>> >> > Once the verifier starts processing subprog->is_sleepable
>> >> > it will set cur_state->in_sleepable = true;
>> >> > to make all subprogs called from that cb to be recognized as sleepable too.
>> >>
>> >> That's the point I don't know where to put the new code.
>> >>
>> >
>> > I think that would go in the already existing special case for
>> > push_async_cb where you get the verifier state of the async callback.
>> > You can make setting the boolean in that verifier state conditional on
>> > whether it's your kfunc/helper you're processing taking a sleepable
>> > callback.
>> >
>> >> It seems the best place would be in do_check(), but I am under the impression
>> >> that the code of the callback is added at the end of the instruction list, meaning
>> >> that I do not know where it starts, and which subprog index it corresponds to.
>> >>
>> >> >
>> >> > A bit of a challenge is what to do with global subprogs,
>> >> > since they're verified lazily. They can be called from
>> >> > sleepable and non-sleepable contex. Should be solvable.
>> >>
>> >> I must confess this is way over me (and given that I didn't even managed to make
>> >> the "easy" case working, that might explain things a little :-P )
>> >>
>> >
>> > I think it will be solvable but made somewhat difficult by the fact
>> > that even if we mark subprog_info of some global_func A as
>> > in_sleepable, so that we explore it as sleepable during its
>> > verification, we might encounter later another global_func that calls
>> > a global func, already explored as non-sleepable, in sleepable
>> > context. In this case I think we need to redo the verification of that
>> > global func as sleepable once again. It could be that it is called
>> > from both non-sleepable and sleepable contexts, so both paths
>> > (in_sleepable = true, and in_sleepable = false) need to be explored,
>> > or we could reject such cases, but it might be a little restrictive.
>> >
>> > Some common helper global func unrelated to caller context doing some
>> > auxiliary work, called from sleepable timer callback and normal main
>> > subprog might be an example where rejection will be prohibitive.
>> >
>> > An approach might be to explore main and global subprogs once as we do
>> > now, and then keep a list of global subprogs that need to be revisited
>> > as in_sleepable (due to being called from a sleepable context) and
>> > trigger do_check_common for them again, this might have to be repeated
>> > as the list grows on each iteration, but eventually we will have
>> > explored all of them as in_sleepable if need be, and the loop will
>> > end. Surely, this trades off logical simplicity of verifier code with
>> > redoing verification of global subprogs again.
>> >
>> > To add items to such a list, for each global subprog we encounter that
>> > needs to be analyzed as in_sleepable, we will also collect all its
>> > callee global subprogs by walking its instructions (a bit like
>> > check_max_stack_depth does).
>> 
>> Sorry if I'm being dense, but why is all this needed if it's already
>> possible to just define the timer callback from a program type that
>> allows sleeping, and then set the actual timeout from a different
>> program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
>> convenience then? Or did I misunderstand and it's not actually possible
>> to mix callback/timer arming from different program types?
>
> More than just convience.
> bpf_set_sleepable_cb() might need to be called from non-sleepable and
> there could be no way to hack it around with fake sleepable entry.
> bpf_timer_cancel() clears callback_fn.
> So if prog wants to bpf_timer_start() and later bpf_timer_cancel()
> it would need to bpf_set_sleepable_cb() every time before bpf_timer_start().
> And at that time it might be in non-sleepable ctx.

Ah, right, makes sense; didn't think about bpf_timer_cancel(). Thanks
for the explanation :)

-Toke

Benjamin Tissoires Feb. 14, 2024, 5:10 p.m. UTC | #12

On Feb 13 2024, Kumar Kartikeya Dwivedi wrote:
> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <bentiss@kernel.org> wrote:
> >
> > On Feb 12 2024, Alexei Starovoitov wrote:
> > > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > > <benjamin.tissoires@redhat.com> wrote:
> > > >
> > > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > > > >
> > > > > Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
> > > > >
> > [...]
> > > I agree that workqueue delegation fits into the bpf_timer concept and
> > > a lot of code can and should be shared.
> >
> > Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> > figure it out entirely.
> >
> > > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > > so we need a new kfunc to set a sleepable callback.
> > > Maybe
> > > bpf_timer_set_sleepable_cb() ?
> >
> > OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
> >
> > > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > > And since prog->aux->sleepable is kinda "global" we need another
> > > per subprog flag:
> > > bool is_sleepable: 1;
> >
> > done (in push_callback_call())
> >
> > >
> > > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > > that will check that "global" flag and another env->cur_state->in_sleepable
> > > flag that will work similar to active_rcu_lock.
> >
> > done (I think), cf patch 2 below
> >
> > > Once the verifier starts processing subprog->is_sleepable
> > > it will set cur_state->in_sleepable = true;
> > > to make all subprogs called from that cb to be recognized as sleepable too.
> >
> > That's the point I don't know where to put the new code.
> >
> 
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.

Hehe, thanks a lot. Indeed, it was a simple fix. I tried to put this
everywhere but here, and with your help got it working in 2 mins :)

> 
> > It seems the best place would be in do_check(), but I am under the impression
> > that the code of the callback is added at the end of the instruction list, meaning
> > that I do not know where it starts, and which subprog index it corresponds to.
> >
> > >
> > > A bit of a challenge is what to do with global subprogs,
> > > since they're verified lazily. They can be called from
> > > sleepable and non-sleepable contex. Should be solvable.
> >
> > I must confess this is way over me (and given that I didn't even managed to make
> > the "easy" case working, that might explain things a little :-P )
> >
> 
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
> 
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
> 
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
> 
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).

FWIW, this (or Alexei's suggestion) is still not implemented in v2

> 
> > >
> > > Overall I think this feature is needed urgently,
> > > so if you don't have cycles to work on this soon,
> > > I can prioritize it right after bpf_arena work.
> >
> > I can try to spare a few cycles on it. Even if your instructions were on
> > spot, I still can't make the subprogs recognized as sleepable.
> >
> > For reference, this is where I am (probably bogus, but seems to be
> > working when timer_set_sleepable_cb() is called from a sleepable context
> > as mentioned by Toke):
> >
> 
> I just skimmed the patch but I think it's already 90% there. The only
> other change I would suggest is switching from helper to kfunc, as
> originally proposed by Alexei.

the kfunc was a rabbit hole:
- I needed to teach the verifier about BPF_TIMER in kfunc
- I needed to teach the verifier about the kfunc itself
- I'm failing at calling the callback :(

Anyway, I'm about to send a second RFC so we can discuss on the code and
see where my monkey patching capabilities are reaching their limits.

Cheers,
Benjamin

[RFC,bpf-next,0/9] allow HID-BPF to do device IOs

Message

Comments