[v3,bpf-next,0/5] bpf: Introduce minimal support for sleepable progs

Message ID 20200827220114.69225-1-alexei.starovoitov@gmail.com

Alexei Starovoitov Aug. 27, 2020, 10:01 p.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

v2->v3:
- switched to a minimal allowlist approach. Essentially, that means that syscall
  entry, a few btrfs allow_error_inject functions, should_fail_bio(), and two LSM
  hooks (file_mprotect and bprm_committed_creds) are the only hooks that allow
  attaching sleepable BPF programs. Once a comprehensive analysis of LSM hooks
  is done, this allowlist will be extended.
- added patch 1 that fixes prototypes of two mm functions to reliably work with
  error injection. It's also necessary for resolve_btfids tool to recognize
  these two funcs, but that's secondary.

v1->v2:
- split fmod_ret fix into separate patch
- added denylist

v1:
This patch set introduces the minimal viable support for sleepable bpf programs.
In this patch set only fentry/fexit/fmod_ret and lsm progs can be sleepable.
Only array maps and pre-allocated hash and lru maps are allowed.
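
The map restriction above can be sketched as a small userspace model. This is
only an illustration, not the verifier code from this series: the enum, the
struct, and the function name are hypothetical stand-ins, and the sketch
assumes the pre-allocation requirement applies to both hash and lru maps.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel map types; not the real uapi enum. */
enum model_map_type {
	MODEL_MAP_ARRAY,
	MODEL_MAP_HASH,
	MODEL_MAP_LRU_HASH,
	MODEL_MAP_OTHER,
};

/* Hypothetical, simplified view of a map as seen by the check below. */
struct model_map {
	enum model_map_type type;
	bool preallocated;	/* false models a map created without pre-allocation */
};

/*
 * Model of the rule "only array and pre-allocated hash and lru maps allowed".
 * Returns true if a sleepable program may use the map.
 */
static bool sleepable_prog_may_use_map(const struct model_map *map)
{
	switch (map->type) {
	case MODEL_MAP_ARRAY:
		/* Array storage is set up at map creation time. */
		return true;
	case MODEL_MAP_HASH:
	case MODEL_MAP_LRU_HASH:
		/* Assumption: both hash flavours must be pre-allocated. */
		return map->preallocated;
	default:
		/* Every other map type is rejected in this minimal model. */
		return false;
	}
}
```

In this model a hash map created without pre-allocation is rejected, while the
same map with pre-allocation is accepted.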

Here is the 'perf report' difference of sleepable with srcu vs non-sleepable vs
sleepable with rcu_trace:
   3.86%  bench     [k] __srcu_read_unlock
   3.22%  bench     [k] __srcu_read_lock
   0.92%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.50%  bench     [k] bpf_trampoline_10297
   0.26%  bench     [k] __bpf_prog_exit_sleepable
   0.21%  bench     [k] __bpf_prog_enter_sleepable
vs
   0.88%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry
   0.84%  bench     [k] bpf_trampoline_10297
   0.13%  bench     [k] __bpf_prog_enter
   0.12%  bench     [k] __bpf_prog_exit
vs
   0.79%  bench     [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep
   0.72%  bench     [k] bpf_trampoline_10381
   0.31%  bench     [k] __bpf_prog_exit_sleepable
   0.29%  bench     [k] __bpf_prog_enter_sleepable

With rcu_trace, sleepable program invocation overhead is only marginally higher
than non-sleepable. The srcu approach is much slower.
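
As a rough cross-check, the listed percentages of each profile can be summed.
The sketch below does only that arithmetic on the numbers quoted above. It
assumes the first profile is the srcu variant (it is the one showing
__srcu_read_lock/unlock) and the last is the rcu_trace variant. The array and
function names are hypothetical, and the sums cover only the symbols listed
here, not the full profiles.

```c
#include <stddef.h>

/* Percentages copied from the three 'perf report' excerpts above. */
static const double srcu_sleepable[] = { 3.86, 3.22, 0.92, 0.50, 0.26, 0.21 };
static const double non_sleepable[] = { 0.88, 0.84, 0.13, 0.12 };
static const double rcu_trace_sleepable[] = { 0.79, 0.72, 0.31, 0.29 };

/* Sum n percentage entries. */
static double sum_percent(const double *v, size_t n)
{
	double total = 0.0;

	for (size_t i = 0; i < n; i++)
		total += v[i];
	return total;
}
```

Summed this way, the listed symbols come to about 8.97% for srcu, 1.97% for
non-sleepable, and 2.11% for rcu_trace.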

Alexei Starovoitov (5):
  mm/error_inject: Fix allow_error_inject function signatures.
  bpf: Introduce sleepable BPF programs
  bpf: Add bpf_copy_from_user() helper.
  libbpf: support sleepable progs
  selftests/bpf: Add sleepable tests

 arch/x86/net/bpf_jit_comp.c                   | 32 +++++---
 include/linux/bpf.h                           |  4 +
 include/uapi/linux/bpf.h                      | 16 ++++
 init/Kconfig                                  |  1 +
 kernel/bpf/arraymap.c                         |  1 +
 kernel/bpf/hashtab.c                          | 12 +--
 kernel/bpf/helpers.c                          | 22 +++++
 kernel/bpf/syscall.c                          | 13 ++-
 kernel/bpf/trampoline.c                       | 28 ++++++-
 kernel/bpf/verifier.c                         | 81 ++++++++++++++++++-
 kernel/trace/bpf_trace.c                      |  2 +
 mm/filemap.c                                  |  8 +-
 mm/page_alloc.c                               |  2 +-
 tools/include/uapi/linux/bpf.h                | 16 ++++
 tools/lib/bpf/libbpf.c                        | 25 +++++-
 tools/testing/selftests/bpf/bench.c           |  2 +
 .../selftests/bpf/benchs/bench_trigger.c      | 17 ++++
 .../selftests/bpf/prog_tests/test_lsm.c       |  9 +++
 tools/testing/selftests/bpf/progs/lsm.c       | 66 ++++++++++++++-
 .../selftests/bpf/progs/trigger_bench.c       |  7 ++
 20 files changed, 331 insertions(+), 33 deletions(-)