diff mbox series

[v2,4/4] bpf,lsm: Allow editing capabilities in BPF-LSM hooks

Message ID 20240609104355.442002-5-jcalmels@3xx0.net
State New
Headers show
Series Introduce user namespace capabilities | expand

Commit Message

Jonathan Calmels June 9, 2024, 10:43 a.m. UTC
This patch allows modifying the various capabilities of the struct cred
in BPF-LSM hooks. More specifically, the userns_create hook called
prior to creating a new user namespace.

With the introduction of userns capabilities, this effectively provides
a simple way for LSMs to control the capabilities granted to a user
namespace and all its descendants.

Update the selftests accordingly by dropping CAP_SYS_ADMIN in
namespaces and checking the resulting task's bounding set.

Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
---
 include/linux/lsm_hook_defs.h                 |  2 +-
 include/linux/security.h                      |  4 +-
 kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
 security/apparmor/lsm.c                       |  2 +-
 security/security.c                           |  6 +-
 security/selinux/hooks.c                      |  2 +-
 .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
 .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
 8 files changed, 76 insertions(+), 14 deletions(-)

Comments

Jonathan Calmels June 11, 2024, 8:09 a.m. UTC | #1
On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> >
> > This patch allows modifying the various capabilities of the struct cred
> > in BPF-LSM hooks. More specifically, the userns_create hook called
> > prior to creating a new user namespace.
> >
> > With the introduction of userns capabilities, this effectively provides
> > a simple way for LSMs to control the capabilities granted to a user
> > namespace and all its descendants.
> >
> > Update the selftests accordingly by dropping CAP_SYS_ADMIN in
> > namespaces and checking the resulting task's bounding set.
> >
> > Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
> > ---
> >  include/linux/lsm_hook_defs.h                 |  2 +-
> >  include/linux/security.h                      |  4 +-
> >  kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
> >  security/apparmor/lsm.c                       |  2 +-
> >  security/security.c                           |  6 +-
> >  security/selinux/hooks.c                      |  2 +-
> >  .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
> >  .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
> >  8 files changed, 76 insertions(+), 14 deletions(-)
> 
> I'm not sure we want to go down the path of a LSM modifying the POSIX
> capabilities of a task, other than the capabilities/commoncap LSM.  It
> sets a bad precedent and could further complicate issues around LSM
> ordering.

Well unless I'm misunderstanding, this does allow modifying the
capabilities/commoncap LSM through BTF. The reason for allowing
`userns_create` to be modified is that it is functionally very similar
to `cred_prepare` in that it operates with new creds (but specific to
user namespaces because of reasons detailed in [1]). 

There were some concerns in previous threads that the userns caps by
themselves wouldn't be granular enough, hence the LSM integration.
Ubuntu for example, currently has to resort to a hardcoded profile
transition to achieve this [2].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7cd4c5c2101cb092db00f61f69d24380cf7a0ee8
[2] https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?id=43a6c29532f517179fea8c94949d657d71f4fc13
John Johansen June 11, 2024, 10:31 a.m. UTC | #2
On 6/11/24 01:09, Jonathan Calmels wrote:
> On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
>> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
>>>
>>> This patch allows modifying the various capabilities of the struct cred
>>> in BPF-LSM hooks. More specifically, the userns_create hook called
>>> prior to creating a new user namespace.
>>>
>>> With the introduction of userns capabilities, this effectively provides
>>> a simple way for LSMs to control the capabilities granted to a user
>>> namespace and all its descendants.
>>>
>>> Update the selftests accordingly by dropping CAP_SYS_ADMIN in
>>> namespaces and checking the resulting task's bounding set.
>>>
>>> Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
>>> ---
>>>   include/linux/lsm_hook_defs.h                 |  2 +-
>>>   include/linux/security.h                      |  4 +-
>>>   kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
>>>   security/apparmor/lsm.c                       |  2 +-
>>>   security/security.c                           |  6 +-
>>>   security/selinux/hooks.c                      |  2 +-
>>>   .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
>>>   .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
>>>   8 files changed, 76 insertions(+), 14 deletions(-)
>>
>> I'm not sure we want to go down the path of a LSM modifying the POSIX
>> capabilities of a task, other than the capabilities/commoncap LSM.  It
>> sets a bad precedent and could further complicate issues around LSM
>> ordering.
> 
> Well unless I'm misunderstanding, this does allow modifying the
> capabilities/commoncap LSM through BTF. The reason for allowing
> `userns_create` to be modified is that it is functionally very similar
> to `cred_prepare` in that it operates with new creds (but specific to
> user namespaces because of reasons detailed in [1]).
> 
yes

> There were some concerns in previous threads that the userns caps by
> themselves wouldn't be granular enough, hence the LSM integration.

> Ubuntu for example, currently has to resort to a hardcoded profile
> transition to achieve this [2].
> 

The hard coded profile transition, is because the more generic solution
as part of policy just wasn't ready. The hard coding will go away before
it is upstreamed.

But yes, updating the cred really is necessary for the flexibility needed
whether it is modifying the POSIX capabilities of the task or the LSM
modifying its own security blob.

I do share some of Paul's concerns about the LSM modifying the POSIX
capabilities of the task, but also thing the LSM here needs to be
able to modify its own blob.

I have a very similar patch I was planning on posting once the
work to fix the hard coded profile transition is done.


> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7cd4c5c2101cb092db00f61f69d24380cf7a0ee8
> [2] https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?id=43a6c29532f517179fea8c94949d657d71f4fc13
Paul Moore June 11, 2024, 7:01 p.m. UTC | #3
On Tue, Jun 11, 2024 at 6:32 AM John Johansen
<john.johansen@canonical.com> wrote:
>
> On 6/11/24 01:09, Jonathan Calmels wrote:
> > On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
> >> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> >>>
> >>> This patch allows modifying the various capabilities of the struct cred
> >>> in BPF-LSM hooks. More specifically, the userns_create hook called
> >>> prior to creating a new user namespace.
> >>>
> >>> With the introduction of userns capabilities, this effectively provides
> >>> a simple way for LSMs to control the capabilities granted to a user
> >>> namespace and all its descendants.
> >>>
> >>> Update the selftests accordingly by dropping CAP_SYS_ADMIN in
> >>> namespaces and checking the resulting task's bounding set.
> >>>
> >>> Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
> >>> ---
> >>>   include/linux/lsm_hook_defs.h                 |  2 +-
> >>>   include/linux/security.h                      |  4 +-
> >>>   kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
> >>>   security/apparmor/lsm.c                       |  2 +-
> >>>   security/security.c                           |  6 +-
> >>>   security/selinux/hooks.c                      |  2 +-
> >>>   .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
> >>>   .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
> >>>   8 files changed, 76 insertions(+), 14 deletions(-)
> >>
> >> I'm not sure we want to go down the path of a LSM modifying the POSIX
> >> capabilities of a task, other than the capabilities/commoncap LSM.  It
> >> sets a bad precedent and could further complicate issues around LSM
> >> ordering.
> >
> > Well unless I'm misunderstanding, this does allow modifying the
> > capabilities/commoncap LSM through BTF. The reason for allowing
> > `userns_create` to be modified is that it is functionally very similar
> > to `cred_prepare` in that it operates with new creds (but specific to
> > user namespaces because of reasons detailed in [1]).
>
> yes
>
> > There were some concerns in previous threads that the userns caps by
> > themselves wouldn't be granular enough, hence the LSM integration.
>
> > Ubuntu for example, currently has to resort to a hardcoded profile
> > transition to achieve this [2].
> >
>
> The hard coded profile transition, is because the more generic solution
> as part of policy just wasn't ready. The hard coding will go away before
> it is upstreamed.
>
> But yes, updating the cred really is necessary for the flexibility needed
> whether it is modifying the POSIX capabilities of the task or the LSM
> modifying its own security blob.
>
> I do share some of Paul's concerns about the LSM modifying the POSIX
> capabilities of the task, but also thing the LSM here needs to be
> able to modify its own blob.

To be clear, this isn't about a generic LSM needing to update its own
blob (LSM state), it is about the BPF LSM updating the capability
sets.  While we obviously must support a LSM updating its own state,
I'm currently of the opinion that allowing one LSM to update the state
of another LSM is only going to lead to problems.  We wouldn't want to
allow Smack to update AppArmor state, and from my current perspective
allowing the BPF LSM to update the capability state is no different.

It's also important to keep in mind that if we allow one LSM to do
something, we need to allow all LSMs to do something.  If we allow
multiple LSMs to manipulate the capability sets, how do we reconcile
differences in the desired capability state?  Does that resolution
change depending on what LSMs are enabled at build time?  Enabled at
boot?  Similarly, what about custom LSM ordering?

What about those LSMs that use a task's capabilities as an input to an
access control decision?  If those LSMs allow an access based on a
given capability set only to have a LSM later in the ordering modify
that capability set to something which would have resulted in an
access denial, do we risk a security regression?

Our current approach to handling multiple LSMs is that each LSM is
limited to modifying its own state, and I'm pretty confident that we
stick to this model if we have any hope of preserving the sanity of
the LSM layer as a whole.  If you want to modify the capability set
you need to do so within the confines of the capability LSM and/or
modify the other related kernel subsystems (which I'm guessing will
likely necessitate a change in the LSMs, but that avenue is very
unclear if such an option even exists).
Jonathan Calmels June 11, 2024, 10:20 p.m. UTC | #4
On Tue, Jun 11, 2024 at 03:01:01PM GMT, Paul Moore wrote:
> On Tue, Jun 11, 2024 at 6:32 AM John Johansen
> <john.johansen@canonical.com> wrote:
> >
> > On 6/11/24 01:09, Jonathan Calmels wrote:
> > > On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
> > >> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > >>>
> > >>> This patch allows modifying the various capabilities of the struct cred
> > >>> in BPF-LSM hooks. More specifically, the userns_create hook called
> > >>> prior to creating a new user namespace.
> > >>>
> > >>> With the introduction of userns capabilities, this effectively provides
> > >>> a simple way for LSMs to control the capabilities granted to a user
> > >>> namespace and all its descendants.
> > >>>
> > >>> Update the selftests accordingly by dropping CAP_SYS_ADMIN in
> > >>> namespaces and checking the resulting task's bounding set.
> > >>>
> > >>> Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
> > >>> ---
> > >>>   include/linux/lsm_hook_defs.h                 |  2 +-
> > >>>   include/linux/security.h                      |  4 +-
> > >>>   kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
> > >>>   security/apparmor/lsm.c                       |  2 +-
> > >>>   security/security.c                           |  6 +-
> > >>>   security/selinux/hooks.c                      |  2 +-
> > >>>   .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
> > >>>   .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
> > >>>   8 files changed, 76 insertions(+), 14 deletions(-)
> > >>
> > >> I'm not sure we want to go down the path of a LSM modifying the POSIX
> > >> capabilities of a task, other than the capabilities/commoncap LSM.  It
> > >> sets a bad precedent and could further complicate issues around LSM
> > >> ordering.
> > >
> > > Well unless I'm misunderstanding, this does allow modifying the
> > > capabilities/commoncap LSM through BTF. The reason for allowing
> > > `userns_create` to be modified is that it is functionally very similar
> > > to `cred_prepare` in that it operates with new creds (but specific to
> > > user namespaces because of reasons detailed in [1]).
> >
> > yes
> >
> > > There were some concerns in previous threads that the userns caps by
> > > themselves wouldn't be granular enough, hence the LSM integration.
> >
> > > Ubuntu for example, currently has to resort to a hardcoded profile
> > > transition to achieve this [2].
> > >
> >
> > The hard coded profile transition, is because the more generic solution
> > as part of policy just wasn't ready. The hard coding will go away before
> > it is upstreamed.
> >
> > But yes, updating the cred really is necessary for the flexibility needed
> > whether it is modifying the POSIX capabilities of the task or the LSM
> > modifying its own security blob.
> >
> > I do share some of Paul's concerns about the LSM modifying the POSIX
> > capabilities of the task, but also thing the LSM here needs to be
> > able to modify its own blob.
> 
> To be clear, this isn't about a generic LSM needing to update its own
> blob (LSM state), it is about the BPF LSM updating the capability
> sets.  While we obviously must support a LSM updating its own state,
> I'm currently of the opinion that allowing one LSM to update the state
> of another LSM is only going to lead to problems.  We wouldn't want to
> allow Smack to update AppArmor state, and from my current perspective
> allowing the BPF LSM to update the capability state is no different.
> 
> It's also important to keep in mind that if we allow one LSM to do
> something, we need to allow all LSMs to do something.  If we allow
> multiple LSMs to manipulate the capability sets, how do we reconcile
> differences in the desired capability state?  Does that resolution
> change depending on what LSMs are enabled at build time?  Enabled at
> boot?  Similarly, what about custom LSM ordering?
> 
> What about those LSMs that use a task's capabilities as an input to an
> access control decision?  If those LSMs allow an access based on a
> given capability set only to have a LSM later in the ordering modify
> that capability set to something which would have resulted in an
> access denial, do we risk a security regression?

I understand the concerns, what I fail to understand however, is how is
it any different from say the `cred_prepare` hook today?

> Our current approach to handling multiple LSMs is that each LSM is
> limited to modifying its own state, and I'm pretty confident that we
> stick to this model if we have any hope of preserving the sanity of
> the LSM layer as a whole.  If you want to modify the capability set
> you need to do so within the confines of the capability LSM and/or
> modify the other related kernel subsystems (which I'm guessing will
> likely necessitate a change in the LSMs, but that avenue is very
> unclear if such an option even exists).

What do you mean by "within the confines of the capability LSM" here?

Arguably, if we do want fine-grained userns policies, we need LSMs to
influence the userns capset at some point. Regardless of how or where we
do this, it will always be subject to some sort of ordering. We could
come up with some rules to limit surprises (e.g. caps can only be
dropped, only possible through some hooks, etc), but at the end of the
day, something needs to have the final word when it comes to deciding
what the creds should be.
Paul Moore June 11, 2024, 10:38 p.m. UTC | #5
On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> On Tue, Jun 11, 2024 at 03:01:01PM GMT, Paul Moore wrote:
> > On Tue, Jun 11, 2024 at 6:32 AM John Johansen
> > <john.johansen@canonical.com> wrote:
> > >
> > > On 6/11/24 01:09, Jonathan Calmels wrote:
> > > > On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
> > > >> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > >>>
> > > >>> This patch allows modifying the various capabilities of the struct cred
> > > >>> in BPF-LSM hooks. More specifically, the userns_create hook called
> > > >>> prior to creating a new user namespace.
> > > >>>
> > > >>> With the introduction of userns capabilities, this effectively provides
> > > >>> a simple way for LSMs to control the capabilities granted to a user
> > > >>> namespace and all its descendants.
> > > >>>
> > > >>> Update the selftests accordingly by dropping CAP_SYS_ADMIN in
> > > >>> namespaces and checking the resulting task's bounding set.
> > > >>>
> > > >>> Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
> > > >>> ---
> > > >>>   include/linux/lsm_hook_defs.h                 |  2 +-
> > > >>>   include/linux/security.h                      |  4 +-
> > > >>>   kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
> > > >>>   security/apparmor/lsm.c                       |  2 +-
> > > >>>   security/security.c                           |  6 +-
> > > >>>   security/selinux/hooks.c                      |  2 +-
> > > >>>   .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
> > > >>>   .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
> > > >>>   8 files changed, 76 insertions(+), 14 deletions(-)
> > > >>
> > > >> I'm not sure we want to go down the path of a LSM modifying the POSIX
> > > >> capabilities of a task, other than the capabilities/commoncap LSM.  It
> > > >> sets a bad precedent and could further complicate issues around LSM
> > > >> ordering.
> > > >
> > > > Well unless I'm misunderstanding, this does allow modifying the
> > > > capabilities/commoncap LSM through BTF. The reason for allowing
> > > > `userns_create` to be modified is that it is functionally very similar
> > > > to `cred_prepare` in that it operates with new creds (but specific to
> > > > user namespaces because of reasons detailed in [1]).
> > >
> > > yes
> > >
> > > > There were some concerns in previous threads that the userns caps by
> > > > themselves wouldn't be granular enough, hence the LSM integration.
> > >
> > > > Ubuntu for example, currently has to resort to a hardcoded profile
> > > > transition to achieve this [2].
> > > >
> > >
> > > The hard coded profile transition, is because the more generic solution
> > > as part of policy just wasn't ready. The hard coding will go away before
> > > it is upstreamed.
> > >
> > > But yes, updating the cred really is necessary for the flexibility needed
> > > whether it is modifying the POSIX capabilities of the task or the LSM
> > > modifying its own security blob.
> > >
> > > I do share some of Paul's concerns about the LSM modifying the POSIX
> > > capabilities of the task, but also thing the LSM here needs to be
> > > able to modify its own blob.
> >
> > To be clear, this isn't about a generic LSM needing to update its own
> > blob (LSM state), it is about the BPF LSM updating the capability
> > sets.  While we obviously must support a LSM updating its own state,
> > I'm currently of the opinion that allowing one LSM to update the state
> > of another LSM is only going to lead to problems.  We wouldn't want to
> > allow Smack to update AppArmor state, and from my current perspective
> > allowing the BPF LSM to update the capability state is no different.
> >
> > It's also important to keep in mind that if we allow one LSM to do
> > something, we need to allow all LSMs to do something.  If we allow
> > multiple LSMs to manipulate the capability sets, how do we reconcile
> > differences in the desired capability state?  Does that resolution
> > change depending on what LSMs are enabled at build time?  Enabled at
> > boot?  Similarly, what about custom LSM ordering?
> >
> > What about those LSMs that use a task's capabilities as an input to an
> > access control decision?  If those LSMs allow an access based on a
> > given capability set only to have a LSM later in the ordering modify
> > that capability set to something which would have resulted in an
> > access denial, do we risk a security regression?
>
> I understand the concerns, what I fail to understand however, is how is
> it any different from say the `cred_prepare` hook today?

The existing cred_prepare hooks only operate on their own small
portion of the cred::security blob.  What you are proposing would be
the BPF LSM operating on the capability sets that it does not "own"
(they belong to the capability LSM).

If you see that as a minor difference, please understand that if you
skip past that you have all the issues I mentioned in my previous
message to deal with.

> > Our current approach to handling multiple LSMs is that each LSM is
> > limited to modifying its own state, and I'm pretty confident that we
> > stick to this model if we have any hope of preserving the sanity of
> > the LSM layer as a whole.  If you want to modify the capability set
> > you need to do so within the confines of the capability LSM and/or
> > modify the other related kernel subsystems (which I'm guessing will
> > likely necessitate a change in the LSMs, but that avenue is very
> > unclear if such an option even exists).
>
> What do you mean by "within the confines of the capability LSM" here?

Basically security/commoncap.c.  One could make a lot of arguments
about if it is, or isn't, a LSM, but commoncap.c registers LSM hooks
which is pretty much the definition of a LSM from an implementation
point of view.

> Arguably, if we do want fine-grained userns policies, we need LSMs to
> influence the userns capset at some point.

One could always use, or develop, a LSM that offers additional
controls around exercising capabilities.  There are currently four
in-tree LSMs, including the capabilities LSM, which supply a
security_capable() hook that is used by the capability-based access
controls in the kernel; all of these hook implementations work
together within the LSM framework and provide an additional level of
control/granularity beyond the existing capabilities.
Jonathan Calmels June 12, 2024, 8:20 a.m. UTC | #6
On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > On Tue, Jun 11, 2024 at 03:01:01PM GMT, Paul Moore wrote:
> > > On Tue, Jun 11, 2024 at 6:32 AM John Johansen
> > > <john.johansen@canonical.com> wrote:
> > > >
> > > > On 6/11/24 01:09, Jonathan Calmels wrote:
> > > > > On Sun, Jun 09, 2024 at 08:18:48PM GMT, Paul Moore wrote:
> > > > >> On Sun, Jun 9, 2024 at 6:40 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > > >>>
> > > > >>> This patch allows modifying the various capabilities of the struct cred
> > > > >>> in BPF-LSM hooks. More specifically, the userns_create hook called
> > > > >>> prior to creating a new user namespace.
> > > > >>>
> > > > >>> With the introduction of userns capabilities, this effectively provides
> > > > >>> a simple way for LSMs to control the capabilities granted to a user
> > > > >>> namespace and all its descendants.
> > > > >>>
> > > > >>> Update the selftests accordingly by dropping CAP_SYS_ADMIN in
> > > > >>> namespaces and checking the resulting task's bounding set.
> > > > >>>
> > > > >>> Signed-off-by: Jonathan Calmels <jcalmels@3xx0.net>
> > > > >>> ---
> > > > >>>   include/linux/lsm_hook_defs.h                 |  2 +-
> > > > >>>   include/linux/security.h                      |  4 +-
> > > > >>>   kernel/bpf/bpf_lsm.c                          | 55 +++++++++++++++++++
> > > > >>>   security/apparmor/lsm.c                       |  2 +-
> > > > >>>   security/security.c                           |  6 +-
> > > > >>>   security/selinux/hooks.c                      |  2 +-
> > > > >>>   .../selftests/bpf/prog_tests/deny_namespace.c | 12 ++--
> > > > >>>   .../selftests/bpf/progs/test_deny_namespace.c |  7 ++-
> > > > >>>   8 files changed, 76 insertions(+), 14 deletions(-)
> > > > >>
> > > > >> I'm not sure we want to go down the path of a LSM modifying the POSIX
> > > > >> capabilities of a task, other than the capabilities/commoncap LSM.  It
> > > > >> sets a bad precedent and could further complicate issues around LSM
> > > > >> ordering.
> > > > >
> > > > > Well unless I'm misunderstanding, this does allow modifying the
> > > > > capabilities/commoncap LSM through BTF. The reason for allowing
> > > > > `userns_create` to be modified is that it is functionally very similar
> > > > > to `cred_prepare` in that it operates with new creds (but specific to
> > > > > user namespaces because of reasons detailed in [1]).
> > > >
> > > > yes
> > > >
> > > > > There were some concerns in previous threads that the userns caps by
> > > > > themselves wouldn't be granular enough, hence the LSM integration.
> > > >
> > > > > Ubuntu for example, currently has to resort to a hardcoded profile
> > > > > transition to achieve this [2].
> > > > >
> > > >
> > > > The hard coded profile transition, is because the more generic solution
> > > > as part of policy just wasn't ready. The hard coding will go away before
> > > > it is upstreamed.
> > > >
> > > > But yes, updating the cred really is necessary for the flexibility needed
> > > > whether it is modifying the POSIX capabilities of the task or the LSM
> > > > modifying its own security blob.
> > > >
> > > > I do share some of Paul's concerns about the LSM modifying the POSIX
> > > > capabilities of the task, but also thing the LSM here needs to be
> > > > able to modify its own blob.
> > >
> > > To be clear, this isn't about a generic LSM needing to update its own
> > > blob (LSM state), it is about the BPF LSM updating the capability
> > > sets.  While we obviously must support a LSM updating its own state,
> > > I'm currently of the opinion that allowing one LSM to update the state
> > > of another LSM is only going to lead to problems.  We wouldn't want to
> > > allow Smack to update AppArmor state, and from my current perspective
> > > allowing the BPF LSM to update the capability state is no different.
> > >
> > > It's also important to keep in mind that if we allow one LSM to do
> > > something, we need to allow all LSMs to do something.  If we allow
> > > multiple LSMs to manipulate the capability sets, how do we reconcile
> > > differences in the desired capability state?  Does that resolution
> > > change depending on what LSMs are enabled at build time?  Enabled at
> > > boot?  Similarly, what about custom LSM ordering?
> > >
> > > What about those LSMs that use a task's capabilities as an input to an
> > > access control decision?  If those LSMs allow an access based on a
> > > given capability set only to have a LSM later in the ordering modify
> > > that capability set to something which would have resulted in an
> > > access denial, do we risk a security regression?
> >
> > I understand the concerns, what I fail to understand however, is how is
> > it any different from say the `cred_prepare` hook today?
> 
> The existing cred_prepare hooks only operate on their own small
> portion of the cred::security blob.  What you are proposing would be
> the BPF LSM operating on the capability sets that it does not "own"
> (they belong to the capability LSM).
> 
> If you see that as a minor difference, please understand that if you
> skip past that you have all the issues I mentioned in my previous
> message to deal with.
> 
> > > Our current approach to handling multiple LSMs is that each LSM is
> > > limited to modifying its own state, and I'm pretty confident that we
> > > stick to this model if we have any hope of preserving the sanity of
> > > the LSM layer as a whole.  If you want to modify the capability set
> > > you need to do so within the confines of the capability LSM and/or
> > > modify the other related kernel subsystems (which I'm guessing will
> > > likely necessitate a change in the LSMs, but that avenue is very
> > > unclear if such an option even exists).
> >
> > What do you mean by "within the confines of the capability LSM" here?
> 
> Basically security/commoncap.c.  One could make a lot of arguments
> about if it is, or isn't, a LSM, but commoncap.c registers LSM hooks
> which is pretty much the definition of a LSM from an implementation
> point of view.

Yes, hence the proposal to give it more fine-grained controls than
what's currently available. But to your point, unlike the others,
its own state (i.e. capsets) is shared, so this gets questionable.

> > Arguably, if we do want fine-grained userns policies, we need LSMs to
> > influence the userns capset at some point.
> 
> One could always use, or develop, a LSM that offers additional
> controls around exercising capabilities.  There are currently four
> in-tree LSMs, including the capabilities LSM, which supply a
> security_capable() hook that is used by the capability-based access
> controls in the kernel; all of these hook implementations work
> together within the LSM framework and provide an additional level of
> control/granularity beyond the existing capabilities.

Right, but the idea was to have a simple and easy way to reuse/trigger
as much of the commoncap one as possible from BPF. If we're saying we
need to reimplement and/or use a whole new framework, then there is
little value.

TBH, I don't feel strongly about this, which is why it is absent from
v1. However, as John pointed out, we should at least be able to modify
the blob if we want flexible userns caps policies down the road.
Paul Moore June 12, 2024, 5:29 p.m. UTC | #7
On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> > On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:

...

> > > Arguably, if we do want fine-grained userns policies, we need LSMs to
> > > influence the userns capset at some point.
> >
> > One could always use, or develop, a LSM that offers additional
> > controls around exercising capabilities.  There are currently four
> > in-tree LSMs, including the capabilities LSM, which supply a
> > security_capable() hook that is used by the capability-based access
> > controls in the kernel; all of these hook implementations work
> > together within the LSM framework and provide an additional level of
> > control/granularity beyond the existing capabilities.
>
> Right, but the idea was to have a simple and easy way to reuse/trigger
> as much of the commoncap one as possible from BPF. If we're saying we
> need to reimplement and/or use a whole new framework, then there is
> little value.

I can appreciate how allowing direct manipulation of capability bits
from a BPF LSM looks attractive, but my hope is that our discussion
here revealed that as you look deeper into making it work there are a
number of pitfalls which prevent this from being a safe option for
generalized systems.

> TBH, I don't feel strongly about this, which is why it is absent from
> v1. However, as John pointed out, we should at least be able to modify
> the blob if we want flexible userns caps policies down the road.

As discussed in this thread, there are existing ways to provide fine
grained control over exercising capabilities that can be safely used
within the LSM framework.  I don't want to speak to what John is
envisioning, but he should be aware of these mechanisms, and if I
recall he did voice a level of concern about the same worries I
mentioned.

I'm happy to discuss ways in which we can adjust the LSM hooks/layer
to support different approaches to capability controls, but one LSM
directly manipulating the state of another is going to be a no vote
from me.
John Johansen June 13, 2024, 3:54 a.m. UTC | #8
On 6/12/24 10:29, Paul Moore wrote:
> On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
>> On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
>>> On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> 
> ...
> 
>>>> Arguably, if we do want fine-grained userns policies, we need LSMs to
>>>> influence the userns capset at some point.
>>>
>>> One could always use, or develop, a LSM that offers additional
>>> controls around exercising capabilities.  There are currently four
>>> in-tree LSMs, including the capabilities LSM, which supply a
>>> security_capable() hook that is used by the capability-based access
>>> controls in the kernel; all of these hook implementations work
>>> together within the LSM framework and provide an additional level of
>>> control/granularity beyond the existing capabilities.
>>
>> Right, but the idea was to have a simple and easy way to reuse/trigger
>> as much of the commoncap one as possible from BPF. If we're saying we
>> need to reimplement and/or use a whole new framework, then there is
>> little value.
> 
> I can appreciate how allowing direct manipulation of capability bits
> from a BPF LSM looks attractive, but my hope is that our discussion
> here revealed that as you look deeper into making it work there are a
> number of pitfalls which prevent this from being a safe option for
> generalized systems.
> 
>> TBH, I don't feel strongly about this, which is why it is absent from
>> v1. However, as John pointed out, we should at least be able to modify
>> the blob if we want flexible userns caps policies down the road.
> 
> As discussed in this thread, there are existing ways to provide fine
> grained control over exercising capabilities that can be safely used
> within the LSM framework.  I don't want to speak to what John is
> envisioning, but he should be aware of these mechanisms, and if I
> recall he did voice a level of concern about the same worries I
> mentioned.
> 

sorry, I should have been more clear. I envision LSMs being able to
update their own state in the userns hook.

Basically the portion of the patch that removes const from the
userns hook.

An LSM updating the capset is worrysome for all the reasons you
pointed out, and I think a few more. I haven't had a chance to really
look at v2 yet, so I didn't want to speak directly on the bpf part of
the patch without first giving a good once over.

> I'm happy to discuss ways in which we can adjust the LSM hooks/layer
> to support different approaches to capability controls, but one LSM
> directly manipulating the state of another is going to be a no vote
> from me.
> 
I might not be as hard no as Paul here, I am always willing to listen
to arguments, but it would have to be a really good argument to
modify the capset, when there are multiple LSMs in play on a system.
Jonathan Calmels June 13, 2024, 8:50 a.m. UTC | #9
On Wed, Jun 12, 2024 at 08:54:28PM GMT, John Johansen wrote:
> On 6/12/24 10:29, Paul Moore wrote:
> > On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> > > > On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > 
> > ...
> > 
> > > > > Arguably, if we do want fine-grained userns policies, we need LSMs to
> > > > > influence the userns capset at some point.
> > > > 
> > > > One could always use, or develop, a LSM that offers additional
> > > > controls around exercising capabilities.  There are currently four
> > > > in-tree LSMs, including the capabilities LSM, which supply a
> > > > security_capable() hook that is used by the capability-based access
> > > > controls in the kernel; all of these hook implementations work
> > > > together within the LSM framework and provide an additional level of
> > > > control/granularity beyond the existing capabilities.
> > > 
> > > Right, but the idea was to have a simple and easy way to reuse/trigger
> > > as much of the commoncap one as possible from BPF. If we're saying we
> > > need to reimplement and/or use a whole new framework, then there is
> > > little value.
> > 
> > I can appreciate how allowing direct manipulation of capability bits
> > from a BPF LSM looks attractive, but my hope is that our discussion
> > here revealed that as you look deeper into making it work there are a
> > number of pitfalls which prevent this from being a safe option for
> > generalized systems.
> > 
> > > TBH, I don't feel strongly about this, which is why it is absent from
> > > v1. However, as John pointed out, we should at least be able to modify
> > > the blob if we want flexible userns caps policies down the road.
> > 
> > As discussed in this thread, there are existing ways to provide fine
> > grained control over exercising capabilities that can be safely used
> > within the LSM framework.  I don't want to speak to what John is
> > envisioning, but he should be aware of these mechanisms, and if I
> > recall he did voice a level of concern about the same worries I
> > mentioned.
> > 
> 
> sorry, I should have been more clear. I envision LSMs being able to
> update their own state in the userns hook.
> 
> Basically the portion of the patch that removes const from the
> userns hook.

Yes, pretty sure we'll need this regardless.

> An LSM updating the capset is worrysome for all the reasons you
> pointed out, and I think a few more. I haven't had a chance to really
> look at v2 yet, so I didn't want to speak directly on the bpf part of
> the patch without first giving a good once over.
> 
> > I'm happy to discuss ways in which we can adjust the LSM hooks/layer
> > to support different approaches to capability controls, but one LSM
> > directly manipulating the state of another is going to be a no vote
> > from me.
> > 
> I might not be as hard no as Paul here, I am always willing to listen
> to arguments, but it would have to be a really good argument to
> modify the capset, when there are multiple LSMs in play on a system.

The way I see it, it's more about enhancing the capability LSM with BPF
hooks and have it modify its own state dynamically, not so much
crosstalk between two distinct LSM frameworks (say one where the BPF
LSM implements a lot of things like capable()).

In this context and with enough safeguards (say we only allow dropping
caps) this could be a net positive. Sure, ordering could come into play
in very specific scenarios, but at this point I would expect the
admin/LSM author to be conscious about it.

If we think there is no way we can come up with something that's safe
enough, and that the risks outweigh the benefits, fine by me, we can
drop this patch from the series.
Dr. Greg June 13, 2024, 10:45 a.m. UTC | #10
On Wed, Jun 12, 2024 at 08:54:28PM -0700, John Johansen wrote:

Good morning, I hope the day is going well for everyone.

> On 6/12/24 10:29, Paul Moore wrote:
> >On Wed, Jun 12, 2024 at 4:15???AM Jonathan Calmels <jcalmels@3xx0.net> 
> >wrote:
> >>On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> >>>On Tue, Jun 11, 2024 at 6:15???PM Jonathan Calmels <jcalmels@3xx0.net> 
> >>>wrote:
> >
> >...
> >
> >>>>Arguably, if we do want fine-grained userns policies, we need LSMs to
> >>>>influence the userns capset at some point.
> >>>
> >>>One could always use, or develop, a LSM that offers additional
> >>>controls around exercising capabilities.  There are currently four
> >>>in-tree LSMs, including the capabilities LSM, which supply a
> >>>security_capable() hook that is used by the capability-based access
> >>>controls in the kernel; all of these hook implementations work
> >>>together within the LSM framework and provide an additional level of
> >>>control/granularity beyond the existing capabilities.
> >>
> >>Right, but the idea was to have a simple and easy way to reuse/trigger
> >>as much of the commoncap one as possible from BPF. If we're saying we
> >>need to reimplement and/or use a whole new framework, then there is
> >>little value.
> >
> >I can appreciate how allowing direct manipulation of capability bits
> >from a BPF LSM looks attractive, but my hope is that our discussion
> >here revealed that as you look deeper into making it work there are a
> >number of pitfalls which prevent this from being a safe option for
> >generalized systems.
> >
> >>TBH, I don't feel strongly about this, which is why it is absent from
> >>v1. However, as John pointed out, we should at least be able to modify
> >>the blob if we want flexible userns caps policies down the road.
> >
> >As discussed in this thread, there are existing ways to provide fine
> >grained control over exercising capabilities that can be safely used
> >within the LSM framework.  I don't want to speak to what John is
> >envisioning, but he should be aware of these mechanisms, and if I
> >recall he did voice a level of concern about the same worries I
> >mentioned.
> >
> 
> sorry, I should have been more clear. I envision LSMs being able to
> update their own state in the userns hook.
> 
> Basically the portion of the patch that removes const from the
> userns hook.
> 
> An LSM updating the capset is worrysome for all the reasons you
> pointed out, and I think a few more. I haven't had a chance to really
> look at v2 yet, so I didn't want to speak directly on the bpf part of
> the patch without first giving a good once over.
> 
> >I'm happy to discuss ways in which we can adjust the LSM hooks/layer
> >to support different approaches to capability controls, but one LSM
> >directly manipulating the state of another is going to be a no vote
> >from me.
> >
> I might not be as hard no as Paul here, I am always willing to listen
> to arguments, but it would have to be a really good argument to
> modify the capset, when there are multiple LSMs in play on a system.

Putting my pragmatic operations hat on, it isn't just the impact on
multiple LSM's.

The security vendors, CrowdStrike's Falcon comes immediately to mind,
are installing BPF hooks as part of their agent systems.

Given that the issue of signing BPF programs is still an open
question, allowing the ability of a BPF program to modify the security
capabilities of a process opens the door to supply chain attacks that
would seem unbounded in scope.

On the other side of the fence, installing a BPF program is a
privileged operation.  If a decision is made to allow that kind of
privilege on a system the argument can be made that you get to keep
both pieces.

Of course that needs to be paired against the fact that system's
administrators are not given any choice as to the wisdom of that type
of permission being afforded to security applications.

Best wishes for a productive remainder of the week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project
Paul Moore June 13, 2024, 8:43 p.m. UTC | #11
On Wed, Jun 12, 2024 at 11:54 PM John Johansen
<john.johansen@canonical.com> wrote:
> On 6/12/24 10:29, Paul Moore wrote:
> > On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> >> On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> >>> On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> >
> > ...
> >
> >>>> Arguably, if we do want fine-grained userns policies, we need LSMs to
> >>>> influence the userns capset at some point.
> >>>
> >>> One could always use, or develop, a LSM that offers additional
> >>> controls around exercising capabilities.  There are currently four
> >>> in-tree LSMs, including the capabilities LSM, which supply a
> >>> security_capable() hook that is used by the capability-based access
> >>> controls in the kernel; all of these hook implementations work
> >>> together within the LSM framework and provide an additional level of
> >>> control/granularity beyond the existing capabilities.
> >>
> >> Right, but the idea was to have a simple and easy way to reuse/trigger
> >> as much of the commoncap one as possible from BPF. If we're saying we
> >> need to reimplement and/or use a whole new framework, then there is
> >> little value.
> >
> > I can appreciate how allowing direct manipulation of capability bits
> > from a BPF LSM looks attractive, but my hope is that our discussion
> > here revealed that as you look deeper into making it work there are a
> > number of pitfalls which prevent this from being a safe option for
> > generalized systems.
> >
> >> TBH, I don't feel strongly about this, which is why it is absent from
> >> v1. However, as John pointed out, we should at least be able to modify
> >> the blob if we want flexible userns caps policies down the road.
> >
> > As discussed in this thread, there are existing ways to provide fine
> > grained control over exercising capabilities that can be safely used
> > within the LSM framework.  I don't want to speak to what John is
> > envisioning, but he should be aware of these mechanisms, and if I
> > recall he did voice a level of concern about the same worries I
> > mentioned.
> >
>
> sorry, I should have been more clear. I envision LSMs being able to
> update their own state in the userns hook.

Ah, okay, yes, that seems reasonable; although like any other change,
until we have an in-tree user we should just leave it as-is.
Paul Moore June 13, 2024, 8:55 p.m. UTC | #12
On Thu, Jun 13, 2024 at 4:45 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> On Wed, Jun 12, 2024 at 08:54:28PM GMT, John Johansen wrote:
> > On 6/12/24 10:29, Paul Moore wrote:
> > > On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > > On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> > > > > On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > >
> > > ...
> > >
> > > > > > Arguably, if we do want fine-grained userns policies, we need LSMs to
> > > > > > influence the userns capset at some point.
> > > > >
> > > > > One could always use, or develop, a LSM that offers additional
> > > > > controls around exercising capabilities.  There are currently four
> > > > > in-tree LSMs, including the capabilities LSM, which supply a
> > > > > security_capable() hook that is used by the capability-based access
> > > > > controls in the kernel; all of these hook implementations work
> > > > > together within the LSM framework and provide an additional level of
> > > > > control/granularity beyond the existing capabilities.
> > > >
> > > > Right, but the idea was to have a simple and easy way to reuse/trigger
> > > > as much of the commoncap one as possible from BPF. If we're saying we
> > > > need to reimplement and/or use a whole new framework, then there is
> > > > little value.
> > >
> > > I can appreciate how allowing direct manipulation of capability bits
> > > from a BPF LSM looks attractive, but my hope is that our discussion
> > > here revealed that as you look deeper into making it work there are a
> > > number of pitfalls which prevent this from being a safe option for
> > > generalized systems.
> > >
> > > > TBH, I don't feel strongly about this, which is why it is absent from
> > > > v1. However, as John pointed out, we should at least be able to modify
> > > > the blob if we want flexible userns caps policies down the road.
> > >
> > > As discussed in this thread, there are existing ways to provide fine
> > > grained control over exercising capabilities that can be safely used
> > > within the LSM framework.  I don't want to speak to what John is
> > > envisioning, but he should be aware of these mechanisms, and if I
> > > recall he did voice a level of concern about the same worries I
> > > mentioned.
> > >
> >
> > sorry, I should have been more clear. I envision LSMs being able to
> > update their own state in the userns hook.
> >
> > Basically the portion of the patch that removes const from the
> > userns hook.
>
> Yes, pretty sure we'll need this regardless.
>
> > An LSM updating the capset is worrysome for all the reasons you
> > pointed out, and I think a few more. I haven't had a chance to really
> > look at v2 yet, so I didn't want to speak directly on the bpf part of
> > the patch without first giving a good once over.


> > > I'm happy to discuss ways in which we can adjust the LSM hooks/layer
> > > to support different approaches to capability controls, but one LSM
> > > directly manipulating the state of another is going to be a no vote
> > > from me.
> >
> > I might not be as hard no as Paul here, I am always willing to listen
> > to arguments, but it would have to be a really good argument to
> > modify the capset, when there are multiple LSMs in play on a system.
>
> The way I see it, it's more about enhancing the capability LSM with BPF
> hooks and have it modify its own state dynamically, not so much
> crosstalk between two distinct LSM frameworks (say one where the BPF
> LSM implements a lot of things like capable()).

As I mentioned previously, if you want to do something with the
capability sets you simply need to do it within the confines of
security/commoncap.c.  If you're really set on the "MUST BE BPF!" way
of life, and you can convince Serge (capabilities maintainer) that it
would be a good idea, you could propose a dedicated BPF hook within
the capabilities LSM.  I'm not sure how wise that would be, but it
would resolve a lot of the LSM ordering/stacking issues that we've
discussed.

> If we think there is no way we can come up with something that's safe
> enough, and that the risks outweigh the benefits, fine by me, we can
> drop this patch from the series.

To be clear, this patch is not acceptable at this point in time.  With
the understanding that I haven't looked that closely at the rest of
the patchset, it looks fairly well contained to the capabilities code
which means it is largely up to Serge, not me.

I will mention that you should update the audit code to recognize the
new capability set, look at kernel/auditsc.c for more information.
Serge E. Hallyn June 15, 2024, 3:20 p.m. UTC | #13
On Thu, Jun 13, 2024 at 01:50:29AM -0700, Jonathan Calmels wrote:
> On Wed, Jun 12, 2024 at 08:54:28PM GMT, John Johansen wrote:
> > On 6/12/24 10:29, Paul Moore wrote:
> > > On Wed, Jun 12, 2024 at 4:15 AM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > > On Tue, Jun 11, 2024 at 06:38:31PM GMT, Paul Moore wrote:
> > > > > On Tue, Jun 11, 2024 at 6:15 PM Jonathan Calmels <jcalmels@3xx0.net> wrote:
> > > 
> > > ...
> > > 
> > > > > > Arguably, if we do want fine-grained userns policies, we need LSMs to
> > > > > > influence the userns capset at some point.
> > > > > 
> > > > > One could always use, or develop, a LSM that offers additional
> > > > > controls around exercising capabilities.  There are currently four
> > > > > in-tree LSMs, including the capabilities LSM, which supply a
> > > > > security_capable() hook that is used by the capability-based access
> > > > > controls in the kernel; all of these hook implementations work
> > > > > together within the LSM framework and provide an additional level of
> > > > > control/granularity beyond the existing capabilities.
> > > > 
> > > > Right, but the idea was to have a simple and easy way to reuse/trigger
> > > > as much of the commoncap one as possible from BPF. If we're saying we
> > > > need to reimplement and/or use a whole new framework, then there is
> > > > little value.
> > > 
> > > I can appreciate how allowing direct manipulation of capability bits
> > > from a BPF LSM looks attractive, but my hope is that our discussion
> > > here revealed that as you look deeper into making it work there are a
> > > number of pitfalls which prevent this from being a safe option for
> > > generalized systems.
> > > 
> > > > TBH, I don't feel strongly about this, which is why it is absent from
> > > > v1. However, as John pointed out, we should at least be able to modify
> > > > the blob if we want flexible userns caps policies down the road.
> > > 
> > > As discussed in this thread, there are existing ways to provide fine
> > > grained control over exercising capabilities that can be safely used
> > > within the LSM framework.  I don't want to speak to what John is
> > > envisioning, but he should be aware of these mechanisms, and if I
> > > recall he did voice a level of concern about the same worries I
> > > mentioned.
> > > 
> > 
> > sorry, I should have been more clear. I envision LSMs being able to
> > update their own state in the userns hook.
> > 
> > Basically the portion of the patch that removes const from the
> > userns hook.
> 
> Yes, pretty sure we'll need this regardless.
> 
> > An LSM updating the capset is worrysome for all the reasons you
> > pointed out, and I think a few more. I haven't had a chance to really
> > look at v2 yet, so I didn't want to speak directly on the bpf part of
> > the patch without first giving a good once over.
> > 
> > > I'm happy to discuss ways in which we can adjust the LSM hooks/layer
> > > to support different approaches to capability controls, but one LSM
> > > directly manipulating the state of another is going to be a no vote
> > > from me.
> > > 
> > I might not be as hard no as Paul here, I am always willing to listen
> > to arguments, but it would have to be a really good argument to
> > modify the capset, when there are multiple LSMs in play on a system.
> 
> The way I see it, it's more about enhancing the capability LSM with BPF
> hooks and have it modify its own state dynamically, not so much
> crosstalk between two distinct LSM frameworks (say one where the BPF
> LSM implements a lot of things like capable()).
> 
> In this context and with enough safeguards (say we only allow dropping
> caps) this could be a net positive. Sure, ordering could come into play
> in very specific scenarios, but at this point I would expect the
> admin/LSM author to be conscious about it.
> 
> If we think there is no way we can come up with something that's safe
> enough, and that the risks outweigh the benefits, fine by me, we can
> drop this patch from the series.

I think pursuing patches 1-3 now, and punting on 4 until later, would
be great.
diff mbox series

Patch

diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index f804b76cde44..58d6d8f2511f 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -250,7 +250,7 @@  LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
 	 unsigned long arg3, unsigned long arg4, unsigned long arg5)
 LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
 	 struct inode *inode)
-LSM_HOOK(int, 0, userns_create, const struct cred *cred)
+LSM_HOOK(int, 0, userns_create, struct cred *cred)
 LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
 LSM_HOOK(void, LSM_RET_VOID, ipc_getsecid, struct kern_ipc_perm *ipcp,
 	 u32 *secid)
diff --git a/include/linux/security.h b/include/linux/security.h
index 21cf70346b33..ffb1b0dd2aef 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -465,7 +465,7 @@  int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
 int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 			unsigned long arg4, unsigned long arg5);
 void security_task_to_inode(struct task_struct *p, struct inode *inode);
-int security_create_user_ns(const struct cred *cred);
+int security_create_user_ns(struct cred *cred);
 int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
 void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid);
 int security_msg_msg_alloc(struct msg_msg *msg);
@@ -1294,7 +1294,7 @@  static inline int security_task_prctl(int option, unsigned long arg2,
 static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
 { }
 
-static inline int security_create_user_ns(const struct cred *cred)
+static inline int security_create_user_ns(struct cred *cred)
 {
 	return 0;
 }
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 68240c3c6e7d..6edba93ff883 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -382,10 +382,65 @@  bool bpf_lsm_is_trusted(const struct bpf_prog *prog)
 	return !btf_id_set_contains(&untrusted_lsm_hooks, prog->aux->attach_btf_id);
 }
 
+static int bpf_lsm_btf_struct_access(struct bpf_verifier_log *log,
+				     const struct bpf_reg_state *reg,
+				     int off, int size)
+{
+	const struct btf_type *cred;
+	const struct btf_type *t;
+	s32 type_id;
+	size_t end;
+
+	type_id = btf_find_by_name_kind(reg->btf, "cred", BTF_KIND_STRUCT);
+	if (type_id < 0)
+		return -EINVAL;
+
+	t = btf_type_by_id(reg->btf, reg->btf_id);
+	cred = btf_type_by_id(reg->btf, type_id);
+	if (t != cred) {
+		bpf_log(log, "only read is supported\n");
+		return -EACCES;
+	}
+
+	switch (off) {
+	case offsetof(struct cred, cap_inheritable):
+		end = offsetofend(struct cred, cap_inheritable);
+		break;
+	case offsetof(struct cred, cap_permitted):
+		end = offsetofend(struct cred, cap_permitted);
+		break;
+	case offsetof(struct cred, cap_effective):
+		end = offsetofend(struct cred, cap_effective);
+		break;
+	case offsetof(struct cred, cap_bset):
+		end = offsetofend(struct cred, cap_bset);
+		break;
+	case offsetof(struct cred, cap_ambient):
+		end = offsetofend(struct cred, cap_ambient);
+		break;
+	case offsetof(struct cred, cap_userns):
+		end = offsetofend(struct cred, cap_userns);
+		break;
+	default:
+		bpf_log(log, "no write support to cred at off %d\n", off);
+		return -EACCES;
+	}
+
+	if (off + size > end) {
+		bpf_log(log,
+			"write access at off %d with size %d beyond the member of cred ended at %zu\n",
+			off, size, end);
+		return -EACCES;
+	}
+
+	return 0;
+}
+
 const struct bpf_prog_ops lsm_prog_ops = {
 };
 
 const struct bpf_verifier_ops lsm_verifier_ops = {
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
+	.btf_struct_access = bpf_lsm_btf_struct_access,
 };
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 6239777090c4..310c9fa3d4b4 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -1036,7 +1036,7 @@  static int apparmor_task_kill(struct task_struct *target, struct kernel_siginfo
 	return error;
 }
 
-static int apparmor_userns_create(const struct cred *cred)
+static int apparmor_userns_create(struct cred *cred)
 {
 	struct aa_label *label;
 	struct aa_profile *profile;
diff --git a/security/security.c b/security/security.c
index e5da848c50b9..83cf2025c58e 100644
--- a/security/security.c
+++ b/security/security.c
@@ -3558,14 +3558,14 @@  void security_task_to_inode(struct task_struct *p, struct inode *inode)
 }
 
 /**
- * security_create_user_ns() - Check if creating a new userns is allowed
+ * security_create_user_ns() - Review permissions prior to userns creation
  * @cred: prepared creds
  *
- * Check permission prior to creating a new user namespace.
+ * Check and/or modify permissions prior to creating a new user namespace.
  *
  * Return: Returns 0 if successful, otherwise < 0 error code.
  */
-int security_create_user_ns(const struct cred *cred)
+int security_create_user_ns(struct cred *cred)
 {
 	return call_int_hook(userns_create, cred);
 }
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 7eed331e90f0..28deb9510d8e 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4263,7 +4263,7 @@  static void selinux_task_to_inode(struct task_struct *p,
 	spin_unlock(&isec->lock);
 }
 
-static int selinux_userns_create(const struct cred *cred)
+static int selinux_userns_create(struct cred *cred)
 {
 	u32 sid = current_sid();
 
diff --git a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
index 1bc6241b755b..1500578f9a30 100644
--- a/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
+++ b/tools/testing/selftests/bpf/prog_tests/deny_namespace.c
@@ -5,6 +5,8 @@ 
 #include <sched.h>
 #include "cap_helpers.h"
 #include <stdio.h>
+#include <stdbool.h>
+#include <sys/prctl.h>
 
 static int wait_for_pid(pid_t pid)
 {
@@ -29,7 +31,7 @@  static int wait_for_pid(pid_t pid)
  * positive return value -> userns creation failed
  * 0                     -> userns creation succeeded
  */
-static int create_user_ns(void)
+static int create_user_ns(bool bpf)
 {
 	pid_t pid;
 
@@ -40,6 +42,8 @@  static int create_user_ns(void)
 	if (pid == 0) {
 		if (unshare(CLONE_NEWUSER))
 			_exit(EXIT_FAILURE);
+		if (bpf && prctl(PR_CAPBSET_READ, CAP_SYS_ADMIN))
+			_exit(EXIT_FAILURE);
 		_exit(EXIT_SUCCESS);
 	}
 
@@ -53,11 +57,11 @@  static void test_userns_create_bpf(void)
 
 	cap_enable_effective(cap_mask, &old_caps);
 
-	ASSERT_OK(create_user_ns(), "priv new user ns");
+	ASSERT_OK(create_user_ns(true), "priv new user ns");
 
 	cap_disable_effective(cap_mask, &old_caps);
 
-	ASSERT_EQ(create_user_ns(), EPERM, "unpriv new user ns");
+	ASSERT_EQ(create_user_ns(true), EPERM, "unpriv new user ns");
 
 	if (cap_mask & old_caps)
 		cap_enable_effective(cap_mask, NULL);
@@ -70,7 +74,7 @@  static void test_unpriv_userns_create_no_bpf(void)
 
 	cap_disable_effective(cap_mask, &old_caps);
 
-	ASSERT_OK(create_user_ns(), "no-bpf unpriv new user ns");
+	ASSERT_OK(create_user_ns(false), "no-bpf unpriv new user ns");
 
 	if (cap_mask & old_caps)
 		cap_enable_effective(cap_mask, NULL);
diff --git a/tools/testing/selftests/bpf/progs/test_deny_namespace.c b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
index e96b901a733c..051906f80f4c 100644
--- a/tools/testing/selftests/bpf/progs/test_deny_namespace.c
+++ b/tools/testing/selftests/bpf/progs/test_deny_namespace.c
@@ -9,12 +9,13 @@  typedef struct { unsigned long long val; } kernel_cap_t;
 
 struct cred {
 	kernel_cap_t cap_effective;
+	kernel_cap_t cap_userns;
 } __attribute__((preserve_access_index));
 
 char _license[] SEC("license") = "GPL";
 
 SEC("lsm.s/userns_create")
-int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
+int BPF_PROG(test_userns_create, struct cred *cred, int ret)
 {
 	kernel_cap_t caps = cred->cap_effective;
 	__u64 cap_mask = 1ULL << CAP_SYS_ADMIN;
@@ -23,8 +24,10 @@  int BPF_PROG(test_userns_create, const struct cred *cred, int ret)
 		return 0;
 
 	ret = -EPERM;
-	if (caps.val & cap_mask)
+	if (caps.val & cap_mask) {
+		cred->cap_userns.val &= ~cap_mask;
 		return 0;
+	}
 
 	return -EPERM;
 }