[v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE

Message ID	20240911142452.4110190-1-max.kellermann@ionos.com
State	New
Headers	show Received: from mail-wm1-f41.google.com (mail-wm1-f41.google.com [209.85.128.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1CCA18F6C for <ceph-devel@vger.kernel.org>; Wed, 11 Sep 2024 14:25:04 +0000 (UTC) From: Max Kellermann <max.kellermann@ionos.com> To: xiubli@redhat.com, idryomov@gmail.com, ceph-devel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Max Kellermann <max.kellermann@ionos.com> Subject: [PATCH v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE Date: Wed, 11 Sep 2024 16:24:51 +0200 Message-ID: <20240911142452.4110190-1-max.kellermann@ionos.com> In-Reply-To: <4b6aec51-dc23-4e49-86c5-0496823dfa3c@redhat.com> References: <4b6aec51-dc23-4e49-86c5-0496823dfa3c@redhat.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE \| expand [v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE

Max Kellermann Sept. 11, 2024, 2:24 p.m. UTC

CAP_SYS_RESOURCE allows users to "override disk quota limits".  Most
filesystems have a CAP_SYS_RESOURCE check in all quota check code
paths, but Ceph currently does not.  This patch implements the
feature.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
v1 -> v2: moved capable check to check_quota_exceeded()
---
 fs/ceph/quota.c | 3 +++
 1 file changed, 3 insertions(+)

Patrick Donnelly Sept. 11, 2024, 3:01 p.m. UTC | #1

On Wed, Sep 11, 2024 at 10:25 AM Max Kellermann
<max.kellermann@ionos.com> wrote:
>
> CAP_SYS_RESOURCE allows users to "override disk quota limits".  Most
> filesystems have a CAP_SYS_RESOURCE check in all quota check code
> paths, but Ceph currently does not.  This patch implements the
> feature.

Just because the client cooperatively maintains the quota limits with
the MDS does not mean it can override the quota in a distributed
system.

NAK

Max Kellermann Sept. 11, 2024, 3:23 p.m. UTC | #2

On Wed, Sep 11, 2024 at 5:03 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
> Just because the client cooperatively maintains the quota limits with
> the MDS does not mean it can override the quota in a distributed
> system.

I thought Ceph's quotas were implemented only on the client, just like
file permissions. Is that not correct? Is there an additional
server-side quota check? In my tests, I never saw one; it looked like
I could write arbitrary amounts of data with my patch.

Max

Patrick Donnelly Sept. 11, 2024, 6:03 p.m. UTC | #3

On Wed, Sep 11, 2024 at 11:23 AM Max Kellermann
<max.kellermann@ionos.com> wrote:
>
> On Wed, Sep 11, 2024 at 5:03 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
> > Just because the client cooperatively maintains the quota limits with
> > the MDS does not mean it can override the quota in a distributed
> > system.
>
> I thought Ceph's quotas were implemented only on the client, just like
> file permissions. Is that not correct? Is there an additional
> server-side quota check? In my tests, I never saw one; it looked like
> I could write arbitrary amounts of data with my patch.

CephFS has many components that are cooperatively maintained by the
MDS **and** the clients; i.e. the clients are trusted to follow the
protocols and restrictions in the file system. For example,
capabilities grant a client read/write permissions on an inode but a
client could easily just open any file and write to it at will. There
is no barrier preventing that misbehavior.

Having root on a client does not extend to arbitrary superuser
permissions on the distributed file system. Down that path lies chaos
and inconsistency.

Max Kellermann Sept. 11, 2024, 7:21 p.m. UTC | #4

On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
> CephFS has many components that are cooperatively maintained by the
> MDS **and** the clients; i.e. the clients are trusted to follow the
> protocols and restrictions in the file system. For example,
> capabilities grant a client read/write permissions on an inode but a
> client could easily just open any file and write to it at will. There
> is no barrier preventing that misbehavior.

To me, that sounds like you confirm my assumption on how Ceph works -
both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I
can write arbitrary files, and just as well CAP_SYS_RESOURCE should
allow me to exceed quotas - that's how both capabilities are
documented.

> Having root on a client does not extend to arbitrary superuser
> permissions on the distributed file system. Down that path lies chaos
> and inconsistency.

Fine for me - I'll keep my patch in our kernel fork (because we need
the feature), together with the other Ceph patches that were rejected.

Max

Patrick Donnelly Sept. 11, 2024, 8:14 p.m. UTC | #5

On Wed, Sep 11, 2024 at 3:21 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
> > CephFS has many components that are cooperatively maintained by the
> > MDS **and** the clients; i.e. the clients are trusted to follow the
> > protocols and restrictions in the file system. For example,
> > capabilities grant a client read/write permissions on an inode but a
> > client could easily just open any file and write to it at will. There
> > is no barrier preventing that misbehavior.
>
> To me, that sounds like you confirm my assumption on how Ceph works -
> both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I
> can write arbitrary files, and just as well CAP_SYS_RESOURCE should
> allow me to exceed quotas - that's how both capabilities are
> documented.
>
> > Having root on a client does not extend to arbitrary superuser
> > permissions on the distributed file system. Down that path lies chaos
> > and inconsistency.
>
> Fine for me - I'll keep my patch in our kernel fork (because we need
> the feature), together with the other Ceph patches that were rejected.

If you want to upstream this, the appropriate change would go in
ceph.git as a new cephx capability (not cephfs capability) for the
"mds" auth cap that would allow a client with root (or
CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a
patch (and the corresponding userspace / kernel client changes).

Ilya Dryomov Sept. 11, 2024, 9:03 p.m. UTC | #6

On Wed, Sep 11, 2024 at 9:21 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
> > CephFS has many components that are cooperatively maintained by the
> > MDS **and** the clients; i.e. the clients are trusted to follow the
> > protocols and restrictions in the file system. For example,
> > capabilities grant a client read/write permissions on an inode but a
> > client could easily just open any file and write to it at will. There
> > is no barrier preventing that misbehavior.
>
> To me, that sounds like you confirm my assumption on how Ceph works -
> both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I
> can write arbitrary files, and just as well CAP_SYS_RESOURCE should
> allow me to exceed quotas - that's how both capabilities are
> documented.

Hi Max,

I think Patrick is actually saying the reverse: having root or other
_local_ user privileges on a particular node shouldn't subvert the
CephFS caps system because there might be many other users involved.
If you have a CephFS client that hasn't been tampered with, coming in
with CAP_DAC_OVERRIDE wouldn't allow you to write to an arbitrary file
directly or even buffer data for that file in memory unless the MDS
grants the cap/permission.

However a rigged CephFS client could absolutely do that.  It could
ignore those parts of the MDS protocol or bypass the MDS entirely and
interact only with OSDs.  The only thing that could stand in the way of
a client like that is CephX, which is where I believe the suggestion to
implement the quota override as a new CephX cap is coming from.  If
a particular user is to be allowed to go loose, the system needs to
have a record of that.

Thanks,

                Ilya

Xiubo Li Sept. 12, 2024, 2 a.m. UTC | #7

On 9/12/24 04:14, Patrick Donnelly wrote:
> On Wed, Sep 11, 2024 at 3:21 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>> On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote:
>>> CephFS has many components that are cooperatively maintained by the
>>> MDS **and** the clients; i.e. the clients are trusted to follow the
>>> protocols and restrictions in the file system. For example,
>>> capabilities grant a client read/write permissions on an inode but a
>>> client could easily just open any file and write to it at will. There
>>> is no barrier preventing that misbehavior.
>> To me, that sounds like you confirm my assumption on how Ceph works -
>> both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I
>> can write arbitrary files, and just as well CAP_SYS_RESOURCE should
>> allow me to exceed quotas - that's how both capabilities are
>> documented.
>>
>>> Having root on a client does not extend to arbitrary superuser
>>> permissions on the distributed file system. Down that path lies chaos
>>> and inconsistency.
>> Fine for me - I'll keep my patch in our kernel fork (because we need
>> the feature), together with the other Ceph patches that were rejected.
> If you want to upstream this, the appropriate change would go in
> ceph.git as a new cephx capability (not cephfs capability) for the
> "mds" auth cap that would allow a client with root (or
> CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a
> patch (and the corresponding userspace / kernel client changes).
>
Yeah, Patrick is correct. This really will by pass the protocols and 
restrictions in cephfs and introduces inconsistency. By adding a new 
cephx caps we can broadcast this to all the users or clients.

Thanks

Max Kellermann Sept. 12, 2024, 6:36 a.m. UTC | #8

On Thu, Sep 12, 2024 at 4:00 AM Xiubo Li <xiubli@redhat.com> wrote:
> > If you want to upstream this, the appropriate change would go in
> > ceph.git as a new cephx capability (not cephfs capability) for the
> > "mds" auth cap that would allow a client with root (or
> > CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a
> > patch (and the corresponding userspace / kernel client changes).
> >
> Yeah, Patrick is correct. This really will by pass the protocols and
> restrictions in cephfs and introduces inconsistency. By adding a new
> cephx caps we can broadcast this to all the users or clients.

I don't get any of this, please help me understand.

1. What protocol restrictions are being bypassed?
All the (user) documentation I found on docs.deph.com and on the
RedHat website about Ceph quotas is very unspecific, but it mentions
that clients can write over quota for up to 10 seconds - so writing
over quota, while not desirable, is something that can happen in
normal operation and the resulting state must be well-defined.
I found no protocol documentation at all about cephfs quotas. Where
can I find it?

2. What inconsistency is being introduced? Do you mean that the actual
stored bytes are bigger than the quota limit? If yes, then how is this
different from setting a quota limit on a directory that is already
above this new limit?

3. What is the worst that could happen if I ignore your advice and
just  keep my patch in our kernel fork? (i.e. without introducing any
cephx caps)

Max

[v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE

Commit Message

Comments

Patch