Message ID | 20240911142452.4110190-1-max.kellermann@ionos.com |
---|---|
State | New |
Headers | show |
Series | [v2] fs/ceph/quota: ignore quota with CAP_SYS_RESOURCE | expand |
On Wed, Sep 11, 2024 at 10:25 AM Max Kellermann <max.kellermann@ionos.com> wrote: > > CAP_SYS_RESOURCE allows users to "override disk quota limits". Most > filesystems have a CAP_SYS_RESOURCE check in all quota check code > paths, but Ceph currently does not. This patch implements the > feature. Just because the client cooperatively maintains the quota limits with the MDS does not mean it can override the quota in a distributed system. NAK
On Wed, Sep 11, 2024 at 5:03 PM Patrick Donnelly <pdonnell@redhat.com> wrote: > Just because the client cooperatively maintains the quota limits with > the MDS does not mean it can override the quota in a distributed > system. I thought Ceph's quotas were implemented only on the client, just like file permissions. Is that not correct? Is there an additional server-side quota check? In my tests, I never saw one; it looked like I could write arbitrary amounts of data with my patch. Max
On Wed, Sep 11, 2024 at 11:23 AM Max Kellermann <max.kellermann@ionos.com> wrote: > > On Wed, Sep 11, 2024 at 5:03 PM Patrick Donnelly <pdonnell@redhat.com> wrote: > > Just because the client cooperatively maintains the quota limits with > > the MDS does not mean it can override the quota in a distributed > > system. > > I thought Ceph's quotas were implemented only on the client, just like > file permissions. Is that not correct? Is there an additional > server-side quota check? In my tests, I never saw one; it looked like > I could write arbitrary amounts of data with my patch. CephFS has many components that are cooperatively maintained by the MDS **and** the clients; i.e. the clients are trusted to follow the protocols and restrictions in the file system. For example, capabilities grant a client read/write permissions on an inode but a client could easily just open any file and write to it at will. There is no barrier preventing that misbehavior. Having root on a client does not extend to arbitrary superuser permissions on the distributed file system. Down that path lies chaos and inconsistency.
On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote: > CephFS has many components that are cooperatively maintained by the > MDS **and** the clients; i.e. the clients are trusted to follow the > protocols and restrictions in the file system. For example, > capabilities grant a client read/write permissions on an inode but a > client could easily just open any file and write to it at will. There > is no barrier preventing that misbehavior. To me, that sounds like you confirm my assumption on how Ceph works - both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I can write arbitrary files, and just as well CAP_SYS_RESOURCE should allow me to exceed quotas - that's how both capabilities are documented. > Having root on a client does not extend to arbitrary superuser > permissions on the distributed file system. Down that path lies chaos > and inconsistency. Fine for me - I'll keep my patch in our kernel fork (because we need the feature), together with the other Ceph patches that were rejected. Max
On Wed, Sep 11, 2024 at 3:21 PM Max Kellermann <max.kellermann@ionos.com> wrote: > > On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote: > > CephFS has many components that are cooperatively maintained by the > > MDS **and** the clients; i.e. the clients are trusted to follow the > > protocols and restrictions in the file system. For example, > > capabilities grant a client read/write permissions on an inode but a > > client could easily just open any file and write to it at will. There > > is no barrier preventing that misbehavior. > > To me, that sounds like you confirm my assumption on how Ceph works - > both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I > can write arbitrary files, and just as well CAP_SYS_RESOURCE should > allow me to exceed quotas - that's how both capabilities are > documented. > > > Having root on a client does not extend to arbitrary superuser > > permissions on the distributed file system. Down that path lies chaos > > and inconsistency. > > Fine for me - I'll keep my patch in our kernel fork (because we need > the feature), together with the other Ceph patches that were rejected. If you want to upstream this, the appropriate change would go in ceph.git as a new cephx capability (not cephfs capability) for the "mds" auth cap that would allow a client with root (or CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a patch (and the corresponding userspace / kernel client changes).
On Wed, Sep 11, 2024 at 9:21 PM Max Kellermann <max.kellermann@ionos.com> wrote: > > On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote: > > CephFS has many components that are cooperatively maintained by the > > MDS **and** the clients; i.e. the clients are trusted to follow the > > protocols and restrictions in the file system. For example, > > capabilities grant a client read/write permissions on an inode but a > > client could easily just open any file and write to it at will. There > > is no barrier preventing that misbehavior. > > To me, that sounds like you confirm my assumption on how Ceph works - > both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I > can write arbitrary files, and just as well CAP_SYS_RESOURCE should > allow me to exceed quotas - that's how both capabilities are > documented. Hi Max, I think Patrick is actually saying the reverse: having root or other _local_ user privileges on a particular node shouldn't subvert the CephFS caps system because there might be many other users involved. If you have a CephFS client that hasn't been tampered with, coming in with CAP_DAC_OVERRIDE wouldn't allow you to write to an arbitrary file directly or even buffer data for that file in memory unless the MDS grants the cap/permission. However a rigged CephFS client could absolutely do that. It could ignore those parts of the MDS protocol or bypass the MDS entirely and interact only with OSDs. The only thing that could stand in the way of a client like that is CephX, which is where I believe the suggestion to implement the quota override as a new CephX cap is coming from. If a particular user is to be allowed to go loose, the system needs to have a record of that. Thanks, Ilya
On 9/12/24 04:14, Patrick Donnelly wrote: > On Wed, Sep 11, 2024 at 3:21 PM Max Kellermann <max.kellermann@ionos.com> wrote: >> On Wed, Sep 11, 2024 at 8:04 PM Patrick Donnelly <pdonnell@redhat.com> wrote: >>> CephFS has many components that are cooperatively maintained by the >>> MDS **and** the clients; i.e. the clients are trusted to follow the >>> protocols and restrictions in the file system. For example, >>> capabilities grant a client read/write permissions on an inode but a >>> client could easily just open any file and write to it at will. There >>> is no barrier preventing that misbehavior. >> To me, that sounds like you confirm my assumption on how Ceph works - >> both file permissions and quotas. As a superuser (CAP_DAC_OVERRIDE), I >> can write arbitrary files, and just as well CAP_SYS_RESOURCE should >> allow me to exceed quotas - that's how both capabilities are >> documented. >> >>> Having root on a client does not extend to arbitrary superuser >>> permissions on the distributed file system. Down that path lies chaos >>> and inconsistency. >> Fine for me - I'll keep my patch in our kernel fork (because we need >> the feature), together with the other Ceph patches that were rejected. > If you want to upstream this, the appropriate change would go in > ceph.git as a new cephx capability (not cephfs capability) for the > "mds" auth cap that would allow a client with root (or > CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a > patch (and the corresponding userspace / kernel client changes). > Yeah, Patrick is correct. This really will by pass the protocols and restrictions in cephfs and introduces inconsistency. By adding a new cephx caps we can broadcast this to all the users or clients. Thanks
On Thu, Sep 12, 2024 at 4:00 AM Xiubo Li <xiubli@redhat.com> wrote: > > If you want to upstream this, the appropriate change would go in > > ceph.git as a new cephx capability (not cephfs capability) for the > > "mds" auth cap that would allow a client with root (or > > CAP_SYS_RESOURCE) to bypass quotas. I would support merging such a > > patch (and the corresponding userspace / kernel client changes). > > > Yeah, Patrick is correct. This really will by pass the protocols and > restrictions in cephfs and introduces inconsistency. By adding a new > cephx caps we can broadcast this to all the users or clients. I don't get any of this, please help me understand. 1. What protocol restrictions are being bypassed? All the (user) documentation I found on docs.deph.com and on the RedHat website about Ceph quotas is very unspecific, but it mentions that clients can write over quota for up to 10 seconds - so writing over quota, while not desirable, is something that can happen in normal operation and the resulting state must be well-defined. I found no protocol documentation at all about cephfs quotas. Where can I find it? 2. What inconsistency is being introduced? Do you mean that the actual stored bytes are bigger than the quota limit? If yes, then how is this different from setting a quota limit on a directory that is already above this new limit? 3. What is the worst that could happen if I ignore your advice and just keep my patch in our kernel fork? (i.e. without introducing any cephx caps) Max
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index 06ee397e0c3a..c428dc8e8f23 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -334,6 +334,9 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op, u64 max, rvalue; bool exceeded = false; + if (capable(CAP_SYS_RESOURCE)) + return false; + if (ceph_snap(inode) != CEPH_NOSNAP) return false;
CAP_SYS_RESOURCE allows users to "override disk quota limits". Most filesystems have a CAP_SYS_RESOURCE check in all quota check code paths, but Ceph currently does not. This patch implements the feature. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> --- v1 -> v2: moved capable check to check_quota_exceeded() --- fs/ceph/quota.c | 3 +++ 1 file changed, 3 insertions(+)