mbox series

[bpf-next,v3,00/11] Add check for bpf lsm return value

Message ID 20240411122752.2873562-1-xukuohai@huaweicloud.com
Headers show
Series Add check for bpf lsm return value | expand

Message

Xu Kuohai April 11, 2024, 12:27 p.m. UTC
From: Xu Kuohai <xukuohai@huawei.com>

A bpf prog returning positive number attached to file_alloc_security hook
will make kernel panic.

Here is a panic log:

[  441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
[  441.236748] #PF: supervisor write access in kernel mode
[  441.237429] #PF: error_code(0x0002) - not-present page
[  441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
[  441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
[  441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
[  441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
[  441.241933] RIP: 0010:alloc_file+0x4b/0x190
[  441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
[  441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
[  441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
[  441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
[  441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
[  441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
[  441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
[  441.250007] FS:  00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[  441.251053] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
[  441.252688] Call Trace:
[  441.253011]  <TASK>
[  441.253296]  ? __die+0x24/0x70
[  441.253702]  ? page_fault_oops+0x15b/0x480
[  441.254236]  ? fixup_exception+0x26/0x330
[  441.254750]  ? exc_page_fault+0x6d/0x1c0
[  441.255257]  ? asm_exc_page_fault+0x26/0x30
[  441.255792]  ? alloc_file+0x4b/0x190
[  441.256257]  alloc_file_pseudo+0x9f/0xf0
[  441.256760]  __anon_inode_getfile+0x87/0x190
[  441.257311]  ? lock_release+0x14e/0x3f0
[  441.257808]  bpf_link_prime+0xe8/0x1d0
[  441.258315]  bpf_tracing_prog_attach+0x311/0x570
[  441.258916]  ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
[  441.259605]  __sys_bpf+0x1bb7/0x2dc0
[  441.260070]  __x64_sys_bpf+0x20/0x30
[  441.260533]  do_syscall_64+0x72/0x140
[  441.261004]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  441.261643] RIP: 0033:0x4b0349
[  441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
[  441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[  441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
[  441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
[  441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
[  441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[  441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8

The reason is that the positive number returned by bpf prog is not a
valid errno, and could not be filtered out with IS_ERR which is used by
the file system to check errors. As a result, the filesystem mistakenly
uses this random positive number as file pointer, causing panic.

To fix this issue, there are two schemes:

1. Modify the calling sites of file_alloc_security to take positive
   return values as zero.

2. Make the bpf verifier to ensure no unpredicted value returned by
   lsm bpf prog.

Considering that hook file_alloc_security never returned positive number
before bpf lsm was introduced, and other lsm hooks may have the same
problem, scheme 2 is more reasonable.

So this series adds lsm return value check in verifier to fix it.

v3:
1. Fix incorrect lsm hook return value ranges, and add disabled hook
   list for bpf lsm, and merge two LSM_RET_INT patches. (KP Singh)
2. Avoid bpf lsm progs attached to different hooks to call each other
   with tail call
3. Fix a CI failure caused by false rejection of AND operation
4. Add tests

v2: https://lore.kernel.org/bpf/20240325095653.1720123-1-xukuohai@huaweicloud.com/
fix bpf ci failure

v1: https://lore.kernel.org/bpf/20240316122359.1073787-1-xukuohai@huaweicloud.com/

Xu Kuohai (11):
  bpf, lsm: Annotate lsm hook return value range
  bpf, lsm: Add helper to read lsm hook return value range
  bpf, lsm: Check bpf lsm hook return values in verifier
  bpf, lsm: Add bpf lsm disabled hook list
  bpf: Avoid progs for different hooks calling each other with tail call
  bpf: Fix compare error in function retval_range_within
  bpf: Fix a false rejection caused by AND operation
  selftests/bpf: Avoid load failure for token_lsm.c
  selftests/bpf: Add return value checks for failed tests
  selftests/bpf: Add test for lsm tail call
  selftests/bpf: Add verifier tests for bpf lsm

 include/linux/bpf.h                           |   2 +
 include/linux/bpf_lsm.h                       |   8 +
 include/linux/lsm_hook_defs.h                 | 591 +++++++++---------
 include/linux/lsm_hooks.h                     |   6 -
 kernel/bpf/bpf_lsm.c                          |  84 ++-
 kernel/bpf/btf.c                              |   5 +-
 kernel/bpf/core.c                             |  22 +-
 kernel/bpf/verifier.c                         |  82 ++-
 security/security.c                           |   1 +
 .../selftests/bpf/prog_tests/test_lsm.c       |  46 +-
 .../selftests/bpf/prog_tests/verifier.c       |   3 +-
 tools/testing/selftests/bpf/progs/err.h       |  10 +
 .../selftests/bpf/progs/lsm_tailcall.c        |  34 +
 .../selftests/bpf/progs/test_sig_in_xattr.c   |   4 +
 .../bpf/progs/test_verify_pkcs7_sig.c         |   8 +-
 tools/testing/selftests/bpf/progs/token_lsm.c |   4 +-
 .../bpf/progs/verifier_global_subprogs.c      |   7 +-
 .../selftests/bpf/progs/verifier_lsm.c        | 155 +++++
 18 files changed, 754 insertions(+), 318 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/lsm_tailcall.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_lsm.c

Comments

Shung-Hsi Yu April 12, 2024, 8:53 a.m. UTC | #1
On Thu, Apr 11, 2024 at 08:27:47PM +0800, Xu Kuohai wrote:
> [...]
> 24: (b4) w0 = -1                      ; R0_w=0xffffffff
> ; int BPF_PROG(test_int_hook, struct vm_area_struct *vma, @ lsm.c:89
> 25: (95) exit
> At program exit the register R0 has smin=4294967295 smax=4294967295 should have been in [-4095, 0]
> 
> It can be seen that instruction "w0 = -1" zero extended -1 to 64-bit
> register r0, setting both smin and smax values of r0 to 4294967295.
> This resulted in a false reject when r0 was checked with range [-4095, 0].
> 
> Given bpf_retval_range is a 32-bit range, this patch fixes it by
> changing the compare between r0 and return range from 64-bit
> operation to 32-bit operation.
> 
> Fixes: 8fa4ecd49b81 ("bpf: enforce exact retval range on subprog/callback exit")
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  kernel/bpf/verifier.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 05c7c5f2bec0..5393d576c76f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -9879,7 +9879,7 @@ static bool in_rbtree_lock_required_cb(struct bpf_verifier_env *env)
>  
>  static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg)
>  {
> -	return range.minval <= reg->smin_value && reg->smax_value <= range.maxval;
> +	return range.minval <= reg->s32_min_value && reg->s32_max_value <= range.maxval;

Logic-wise LGTM

While the status-quo is that the return value is always truncated to
32-bit, looking back there was an attempt to use 64-bit return value for
bpf_prog_run[1] (not merged due to issue on 32-bit architectures). Also
from the reading of BPF standardization ABI it would be inferred that
return value is in 64-bit range:

  BPF has 10 general purpose registers and a read-only frame pointer register,
  all of which are 64-bits wide.
  
  The BPF calling convention is defined as:
  
  * R0: return value from function calls, and exit value for BPF programs
  ...

So add relevant people into the thread for opinions.

1: https://lore.kernel.org/bpf/20221115193911.u6prvskdzr5jevni@apollo/
Eduard Zingerman April 13, 2024, 11:44 a.m. UTC | #2
On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
> 
> A bpf prog returning positive number attached to file_alloc_security hook
> will make kernel panic.
> 
> The reason is that the positive number returned by bpf prog is not a
> valid errno, and could not be filtered out with IS_ERR which is used by
> the file system to check errors. As a result, the file system uses this
> positive number as file pointer, causing panic.
> 
> Considering that hook file_alloc_security never returned positive number
> before bpf lsm was introduced, and other bpf lsm hooks may have the same
> problem, this patch adds lsm return value check in bpf verifier to ensure
> no unpredicted values will be returned by lsm bpf prog.
> 
> Fixes: 520b7aa00d8c ("bpf: lsm: Initialize the BPF LSM hooks")
> Reported-by: Xin Liu <liuxin350@huawei.com>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Andrii Nakryiko April 25, 2024, 11:41 p.m. UTC | #3
On Thu, Apr 11, 2024 at 5:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> From: Xu Kuohai <xukuohai@huawei.com>
>
> After checking lsm hook return range in verifier, the test case
> "test_progs -t test_lsm" failed, and the failure log says:
>
> libbpf: prog 'test_int_hook': BPF program load failed: Invalid argument
> libbpf: prog 'test_int_hook': -- BEGIN PROG LOAD LOG --
> 0: R1=ctx() R10=fp0
> ; int BPF_PROG(test_int_hook, struct vm_area_struct *vma, @ lsm.c:89
> 0: (79) r0 = *(u64 *)(r1 +24)         ; R0_w=scalar(smin=smin32=-4095,smax=smax32=0) R1=ctx()
>
> [...]
>
> 24: (b4) w0 = -1                      ; R0_w=0xffffffff
> ; int BPF_PROG(test_int_hook, struct vm_area_struct *vma, @ lsm.c:89
> 25: (95) exit
> At program exit the register R0 has smin=4294967295 smax=4294967295 should have been in [-4095, 0]
>
> It can be seen that instruction "w0 = -1" zero extended -1 to 64-bit
> register r0, setting both smin and smax values of r0 to 4294967295.
> This resulted in a false reject when r0 was checked with range [-4095, 0].
>
> Given bpf_retval_range is a 32-bit range, this patch fixes it by
> changing the compare between r0 and return range from 64-bit
> operation to 32-bit operation.
>
> Fixes: 8fa4ecd49b81 ("bpf: enforce exact retval range on subprog/callback exit")
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  kernel/bpf/verifier.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 05c7c5f2bec0..5393d576c76f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -9879,7 +9879,7 @@ static bool in_rbtree_lock_required_cb(struct bpf_verifier_env *env)
>
>  static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg)
>  {
> -       return range.minval <= reg->smin_value && reg->smax_value <= range.maxval;
> +       return range.minval <= reg->s32_min_value && reg->s32_max_value <= range.maxval;

are all BPF programs treated as if they return int instead of long? If
not, we probably should have a bool flag in bpf_retval_range whether
comparison should be 32-bit or 64-bit?

>  }
>
>  static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
> --
> 2.30.2
>
Paul Moore June 6, 2024, 9:53 p.m. UTC | #4
On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> From: Xu Kuohai <xukuohai@huawei.com>
>
> Add macro LSM_RET_INT to annotate lsm hook return integer type and the
> default return value, and the expected return range.
>
> The LSM_RET_INT is declared as:
>
> LSM_RET_INT(defval, min, max)
>
> where
>
> - defval is the default return value
>
> - min and max indicate the expected return range is [min, max]
>
> The return value range for each lsm hook is taken from the description
> in security/security.c.
>
> The expanded result of LSM_RET_INT is not changed, and the compiled
> product is not changed.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
>  include/linux/lsm_hooks.h     |   6 -
>  kernel/bpf/bpf_lsm.c          |  10 +
>  security/security.c           |   1 +
>  4 files changed, 313 insertions(+), 295 deletions(-)

...

> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 334e00efbde4..708f515ffbf3 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -18,435 +18,448 @@
>   * The macro LSM_HOOK is used to define the data structures required by
>   * the LSM framework using the pattern:
>   *
> - *     LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> + *     LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
>   *
>   * struct security_hook_heads {
> - *   #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> + *   #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
>   *   #include <linux/lsm_hook_defs.h>
>   *   #undef LSM_HOOK
>   * };
>   */
> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
>          const struct cred *to)
> -LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
>          const struct cred *to)
> -LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
>          const struct cred *to, const struct file *file)

I'm not overly excited about injecting these additional return value
range annotations into the LSM hook definitions, especially since the
vast majority of the hooks "returns 0 on success, negative values on
error".  I'd rather see some effort put into looking at the
feasibility of converting some (all?) of the LSM hook return value
exceptions into the more conventional 0/-ERRNO format.  Unfortunately,
I haven't had the time to look into that myself, but if you wanted to
do that I think it would be a good thing.
Xu Kuohai June 8, 2024, 8:04 a.m. UTC | #5
On 6/7/2024 5:53 AM, Paul Moore wrote:
> On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>>
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Add macro LSM_RET_INT to annotate lsm hook return integer type and the
>> default return value, and the expected return range.
>>
>> The LSM_RET_INT is declared as:
>>
>> LSM_RET_INT(defval, min, max)
>>
>> where
>>
>> - defval is the default return value
>>
>> - min and max indicate the expected return range is [min, max]
>>
>> The return value range for each lsm hook is taken from the description
>> in security/security.c.
>>
>> The expanded result of LSM_RET_INT is not changed, and the compiled
>> product is not changed.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
>>   include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
>>   include/linux/lsm_hooks.h     |   6 -
>>   kernel/bpf/bpf_lsm.c          |  10 +
>>   security/security.c           |   1 +
>>   4 files changed, 313 insertions(+), 295 deletions(-)
> 
> ...
> 
>> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
>> index 334e00efbde4..708f515ffbf3 100644
>> --- a/include/linux/lsm_hook_defs.h
>> +++ b/include/linux/lsm_hook_defs.h
>> @@ -18,435 +18,448 @@
>>    * The macro LSM_HOOK is used to define the data structures required by
>>    * the LSM framework using the pattern:
>>    *
>> - *     LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
>> + *     LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
>>    *
>>    * struct security_hook_heads {
>> - *   #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
>> + *   #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
>>    *   #include <linux/lsm_hook_defs.h>
>>    *   #undef LSM_HOOK
>>    * };
>>    */
>> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
>> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
>>           const struct cred *to)
>> -LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
>>           const struct cred *to)
>> -LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
>>           const struct cred *to, const struct file *file)
> 
> I'm not overly excited about injecting these additional return value
> range annotations into the LSM hook definitions, especially since the
> vast majority of the hooks "returns 0 on success, negative values on
> error".  I'd rather see some effort put into looking at the
> feasibility of converting some (all?) of the LSM hook return value
> exceptions into the more conventional 0/-ERRNO format.  Unfortunately,
> I haven't had the time to look into that myself, but if you wanted to
> do that I think it would be a good thing.
> 

I agree that keeping all hooks return a consistent range of 0/-ERRNO
is more elegant than adding return value range annotations. However, there
are two issues that might need to be addressed first:

1. Compatibility

For instance, security_vm_enough_memory_mm() determines whether to
set cap_sys_admin by checking if the hook vm_enough_memory returns
a positive number. If we were to change the hook vm_enough_memory
to return 0 to indicate the need for cap_sys_admin, then for the
LSM BPF program currently returning 0, the interpretation of its
return value would be reversed after the modification.

2. Expressing multiple non-error states using 0/-ERRNO

IIUC, although 0/-ERRNO can be used to express different errors,
only 0 can be used for non-error state. If there are multiple
non-error states, they cannot be distinguished. For example,
security_inode_need_killpriv() returns < 0 on error, 0 if
security_inode_killpriv() doesn't need to be called, and > 0
if security_inode_killpriv() does need to be called.
Alexei Starovoitov June 8, 2024, 1:54 p.m. UTC | #6
On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> On 6/7/2024 5:53 AM, Paul Moore wrote:
> > On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >>
> >> From: Xu Kuohai <xukuohai@huawei.com>
> >>
> >> Add macro LSM_RET_INT to annotate lsm hook return integer type and the
> >> default return value, and the expected return range.
> >>
> >> The LSM_RET_INT is declared as:
> >>
> >> LSM_RET_INT(defval, min, max)
> >>
> >> where
> >>
> >> - defval is the default return value
> >>
> >> - min and max indicate the expected return range is [min, max]
> >>
> >> The return value range for each lsm hook is taken from the description
> >> in security/security.c.
> >>
> >> The expanded result of LSM_RET_INT is not changed, and the compiled
> >> product is not changed.
> >>
> >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> >> ---
> >>   include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
> >>   include/linux/lsm_hooks.h     |   6 -
> >>   kernel/bpf/bpf_lsm.c          |  10 +
> >>   security/security.c           |   1 +
> >>   4 files changed, 313 insertions(+), 295 deletions(-)
> >
> > ...
> >
> >> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> >> index 334e00efbde4..708f515ffbf3 100644
> >> --- a/include/linux/lsm_hook_defs.h
> >> +++ b/include/linux/lsm_hook_defs.h
> >> @@ -18,435 +18,448 @@
> >>    * The macro LSM_HOOK is used to define the data structures required by
> >>    * the LSM framework using the pattern:
> >>    *
> >> - *     LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> >> + *     LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
> >>    *
> >>    * struct security_hook_heads {
> >> - *   #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> >> + *   #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
> >>    *   #include <linux/lsm_hook_defs.h>
> >>    *   #undef LSM_HOOK
> >>    * };
> >>    */
> >> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> >> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> >> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
> >> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
> >>           const struct cred *to)
> >> -LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
> >> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
> >>           const struct cred *to)
> >> -LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
> >> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
> >>           const struct cred *to, const struct file *file)
> >
> > I'm not overly excited about injecting these additional return value
> > range annotations into the LSM hook definitions, especially since the
> > vast majority of the hooks "returns 0 on success, negative values on
> > error".  I'd rather see some effort put into looking at the
> > feasibility of converting some (all?) of the LSM hook return value
> > exceptions into the more conventional 0/-ERRNO format.  Unfortunately,
> > I haven't had the time to look into that myself, but if you wanted to
> > do that I think it would be a good thing.
> >
>
> I agree that keeping all hooks return a consistent range of 0/-ERRNO
> is more elegant than adding return value range annotations. However, there
> are two issues that might need to be addressed first:
>
> 1. Compatibility
>
> For instance, security_vm_enough_memory_mm() determines whether to
> set cap_sys_admin by checking if the hook vm_enough_memory returns
> a positive number. If we were to change the hook vm_enough_memory
> to return 0 to indicate the need for cap_sys_admin, then for the
> LSM BPF program currently returning 0, the interpretation of its
> return value would be reversed after the modification.

This is not an issue. bpf lsm progs are no different from other lsm-s.
If the meaning of return value or arguments to lsm hook change
all lsm-s need to adjust as well. Regardless of whether they are
written as in-kernel lsm-s, bpf-lsm, or out-of-tree lsm-s.

> 2. Expressing multiple non-error states using 0/-ERRNO
>
> IIUC, although 0/-ERRNO can be used to express different errors,
> only 0 can be used for non-error state. If there are multiple
> non-error states, they cannot be distinguished. For example,
> security_inode_need_killpriv() returns < 0 on error, 0 if
> security_inode_killpriv() doesn't need to be called, and > 0
> if security_inode_killpriv() does need to be called.

This looks like a problem indeed. Converting all hooks to 0/-errno
doesn't look practical.
Paul Moore June 9, 2024, 6:17 p.m. UTC | #7
On Sun, Jun 9, 2024 at 1:39 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 6/8/2024 6:54 AM, Alexei Starovoitov wrote:
> > On Sat, Jun 8, 2024 at 1:04 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >> On 6/7/2024 5:53 AM, Paul Moore wrote:
> >>> On Thu, Apr 11, 2024 at 8:24 AM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >>>> From: Xu Kuohai <xukuohai@huawei.com>
> >>>>
> >>>> Add macro LSM_RET_INT to annotate lsm hook return integer type and the
> >>>> default return value, and the expected return range.
> >>>>
> >>>> The LSM_RET_INT is declared as:
> >>>>
> >>>> LSM_RET_INT(defval, min, max)
> >>>>
> >>>> where
> >>>>
> >>>> - defval is the default return value
> >>>>
> >>>> - min and max indicate the expected return range is [min, max]
> >>>>
> >>>> The return value range for each lsm hook is taken from the description
> >>>> in security/security.c.
> >>>>
> >>>> The expanded result of LSM_RET_INT is not changed, and the compiled
> >>>> product is not changed.
> >>>>
> >>>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> >>>> ---
> >>>>   include/linux/lsm_hook_defs.h | 591 +++++++++++++++++-----------------
> >>>>   include/linux/lsm_hooks.h     |   6 -
> >>>>   kernel/bpf/bpf_lsm.c          |  10 +
> >>>>   security/security.c           |   1 +
> >>>>   4 files changed, 313 insertions(+), 295 deletions(-)
> >>> ...
> >>>
> >>>> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> >>>> index 334e00efbde4..708f515ffbf3 100644
> >>>> --- a/include/linux/lsm_hook_defs.h
> >>>> +++ b/include/linux/lsm_hook_defs.h
> >>>> @@ -18,435 +18,448 @@
> >>>>    * The macro LSM_HOOK is used to define the data structures required by
> >>>>    * the LSM framework using the pattern:
> >>>>    *
> >>>> - *     LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
> >>>> + *     LSM_HOOK(<return_type>, <return_description>, <hook_name>, args...)
> >>>>    *
> >>>>    * struct security_hook_heads {
> >>>> - *   #define LSM_HOOK(RET, DEFAULT, NAME, ...) struct hlist_head NAME;
> >>>> + *   #define LSM_HOOK(RET, RETVAL_DESC, NAME, ...) struct hlist_head NAME;
> >>>>    *   #include <linux/lsm_hook_defs.h>
> >>>>    *   #undef LSM_HOOK
> >>>>    * };
> >>>>    */
> >>>> -LSM_HOOK(int, 0, binder_set_context_mgr, const struct cred *mgr)
> >>>> -LSM_HOOK(int, 0, binder_transaction, const struct cred *from,
> >>>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_set_context_mgr, const struct cred *mgr)
> >>>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transaction, const struct cred *from,
> >>>>           const struct cred *to)
> >>>> -LSM_HOOK(int, 0, binder_transfer_binder, const struct cred *from,
> >>>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_binder, const struct cred *from,
> >>>>           const struct cred *to)
> >>>> -LSM_HOOK(int, 0, binder_transfer_file, const struct cred *from,
> >>>> +LSM_HOOK(int, LSM_RET_INT(0, -MAX_ERRNO, 0), binder_transfer_file, const struct cred *from,
> >>>>           const struct cred *to, const struct file *file)
> >>> I'm not overly excited about injecting these additional return value
> >>> range annotations into the LSM hook definitions, especially since the
> >>> vast majority of the hooks "returns 0 on success, negative values on
> >>> error".  I'd rather see some effort put into looking at the
> >>> feasibility of converting some (all?) of the LSM hook return value
> >>> exceptions into the more conventional 0/-ERRNO format.  Unfortunately,
> >>> I haven't had the time to look into that myself, but if you wanted to
> >>> do that I think it would be a good thing.
> >>>
> >> I agree that keeping all hooks return a consistent range of 0/-ERRNO
> >> is more elegant than adding return value range annotations. However, there
> >> are two issues that might need to be addressed first:
> >>
> >> 1. Compatibility
> >>
> >> For instance, security_vm_enough_memory_mm() determines whether to
> >> set cap_sys_admin by checking if the hook vm_enough_memory returns
> >> a positive number. If we were to change the hook vm_enough_memory
> >> to return 0 to indicate the need for cap_sys_admin, then for the
> >> LSM BPF program currently returning 0, the interpretation of its
> >> return value would be reversed after the modification.
> >
> > This is not an issue. bpf lsm progs are no different from other lsm-s.
> > If the meaning of return value or arguments to lsm hook change
> > all lsm-s need to adjust as well. Regardless of whether they are
> > written as in-kernel lsm-s, bpf-lsm, or out-of-tree lsm-s.

Yes, the are no guarantees around compatibility in kernel/LSM
interface from one kernel release to the next.  If we need to change a
LSM hook, we can change a LSM hook; the important part is that when we
change the LSM hook we must make sure to update all of the in-tree
LSMs which make use of that hook.

> >> 2. Expressing multiple non-error states using 0/-ERRNO
> >>
> >> IIUC, although 0/-ERRNO can be used to express different errors,
> >> only 0 can be used for non-error state. If there are multiple
> >> non-error states, they cannot be distinguished. For example,
> >> security_inode_need_killpriv() returns < 0 on error, 0 if
> >> security_inode_killpriv() doesn't need to be called, and > 0
> >> if security_inode_killpriv() does need to be called.
> > This looks like a problem indeed.
>
> Hang on. There aren't really three states here. security_inode_killpriv()
> is called only on the security_inode_need_killpriv() > 0 case. I'm not
> looking at the code this instant, but adjusting the return to something
> like -ENOSYS (OK, maybe not a great choice, but you get the idea) instead
> of 0 in the don't call case and switching the positive value to 0 should
> work just fine.
>
> We're working on getting the LSM interfaces to be more consistent. This
> particular pair of hooks is an example of why we need to do that.

Yes, exactly.  Aside from the issues with BPF verification, we've seen
problems in the past with LSM hooks that differ from the usual "0 on
success, negative values on failure" pattern.  I'm not saying it is
possible to convert all of the hooks to fit this model, but even if we
can only adjust one or two I think that is still a win.

As far as security_inode_need_killpriv()/security_inode_killpriv() is
concerned, one possibility would be to shift the ATTR_KILL_PRIV
set/mask operation into the LSM hook, something like this:

[WARNING: completely untested, likely broken, yadda yadda]

/**
 * ...
 * Returns: Return 0 on success, negative values on failure.  @attrs
may be updated
 *          on success.
 */
int security_inode_need_killpriv(*dentry, attrs)
{
  int rc;
  rc = call_int_hook(inode_killpriv, dentry);
  if (rc < 0)
    return rc;
  if (rc > 0)
    attrs |= ATTR_KILL_PRIV;
  else if (rc == 0)
    attrs &= ~ATTR_KILL_PRIV;
  return 0;
}

Yes, that doesn't fix the problem for the individual LSMs, but it does
make the hook a bit more consistent from the rest of the kernel.
Paul Moore June 11, 2024, 8:06 p.m. UTC | #8
On Mon, Jun 10, 2024 at 10:25 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> Alright, I'll give it a try. Perhaps in the end, there will be a few
> hooks that cannot be converted. If that's the case, it seems we can
> just provide exceptions for the return value explanations for these
> not unconverted hooks, maybe on the BPF side only, thus avoiding the
> need to annotate return values for all LSM hooks.

Thanks.  Yes, while I don't think we will be able to normalize all of
the hooks to 0/-ERRNO, my guess is that we can reduce the exceptions
to a manageable count.