mbox series

[RFC,0/2] Add support for snapshot names encryption

Message ID 20220310172616.16212-1-lhenriques@suse.de
Headers show
Series Add support for snapshot names encryption | expand

Message

Luis Henriques March 10, 2022, 5:26 p.m. UTC
Hi!

So, I've changed this code back into and RFC as I'm not sure yet if this
is it's final form.  I think the 2 patches in this series should probably
be squashed into a single patch.  I decided to keep them separate as the
1st one is simple (it's the same patch I had already sent), and the 2nd
patch adds a lot more complexity to the whole thing.

So, I've looked at Xiubo initial patch for handling snapshots long names.
It was complex, of course, and it required extra MDS changes.  I *think*
my approach is slightly simpler, but I'm not entirely sure yet that I'm
handling every case.

In order to test this code the following PRs are required:

  mds: add protection from clients without fscrypt support #45073
  mds: use the whole string as the snapshot long name #45192
  mds: support alternate names for snapshots #45224
  mds: limit the snapshot names to 240 characters #45312

Comments are welcome, I'm still testing these patches and I do expect to
find that something is still missing.  And I do expect to find bugs.
These strings parsing scares me a lot, but I couldn't see a simpler
approach.

Luís Henriques (2):
  ceph: add support for encrypted snapshot names
  ceph: add support for handling encrypted snapshot names in subtree

 fs/ceph/crypto.c | 146 +++++++++++++++++++++++++++++++++++++++++------
 fs/ceph/crypto.h |   9 ++-
 fs/ceph/dir.c    |   9 +++
 fs/ceph/inode.c  |  13 +++++
 4 files changed, 156 insertions(+), 21 deletions(-)

Comments

Luis Henriques March 10, 2022, 5:34 p.m. UTC | #1
Luís Henriques <lhenriques@suse.de> writes:

> Hi!
>
> So, I've changed this code back into and RFC as I'm not sure yet if this
> is it's final form.  I think the 2 patches in this series should probably
> be squashed into a single patch.  I decided to keep them separate as the
> 1st one is simple (it's the same patch I had already sent), and the 2nd
> patch adds a lot more complexity to the whole thing.
>
> So, I've looked at Xiubo initial patch for handling snapshots long names.
> It was complex, of course, and it required extra MDS changes.  I *think*
> my approach is slightly simpler, but I'm not entirely sure yet that I'm
> handling every case.
>
> In order to test this code the following PRs are required:
>
>   mds: add protection from clients without fscrypt support #45073
>   mds: use the whole string as the snapshot long name #45192
>   mds: support alternate names for snapshots #45224
>   mds: limit the snapshot names to 240 characters #45312
>
> Comments are welcome, I'm still testing these patches and I do expect to
> find that something is still missing.  And I do expect to find bugs.
> These strings parsing scares me a lot, but I couldn't see a simpler
> approach.

Again, I forgot to mention in the cover-letter that handling
base64-encoded snapshots that start with '_' is still missing.  That's
next on my list.

Cheers,
Xiubo Li March 12, 2022, 8:30 a.m. UTC | #2
On 3/11/22 1:26 AM, Luís Henriques wrote:
> Since filenames in encrypted directories are already encrypted and shown
> as a base64-encoded string when the directory is locked, snapshot names
> should show a similar behaviour.
>
> Signed-off-by: Luís Henriques <lhenriques@suse.de>
> ---
>   fs/ceph/dir.c   |  9 +++++++++
>   fs/ceph/inode.c | 13 +++++++++++++
>   2 files changed, 22 insertions(+)
>
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 6df2a91af236..123e3b9c8161 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -1075,6 +1075,15 @@ static int ceph_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
>   		op = CEPH_MDS_OP_MKSNAP;
>   		dout("mksnap dir %p snap '%pd' dn %p\n", dir,
>   		     dentry, dentry);
> +		/*
> +		 * Encrypted snapshots require d_revalidate to force a
> +		 * LOOKUPSNAP to cleanup dcache
> +		 */
> +		if (IS_ENCRYPTED(dir)) {
> +			spin_lock(&dentry->d_lock);
> +			dentry->d_flags |= DCACHE_NOKEY_NAME;

I think this is not correct fix of this issue.

Actually this dentry's name is a KEY NAME, which is human readable name.

DCACHE_NOKEY_NAME means the base64_encoded names. This usually will be 
set when filling a new dentry if the directory is locked. If the 
directory is unlocked the directory inode will be set with the key.

The root cause should be the snapshot's inode doesn't correctly set the 
encrypt stuff when you are reading from it.

NOTE: when you are 'ls -l .snap/snapXXX' the snapXXX dentry name is 
correct, it's just corrupted for the file or directory names under snapXXX/.


> +			spin_unlock(&dentry->d_lock);
> +		}
>   	} else if (ceph_snap(dir) == CEPH_NOSNAP) {
>   		dout("mkdir dir %p dn %p mode 0%ho\n", dir, dentry, mode);
>   		op = CEPH_MDS_OP_MKDIR;
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index b573a0f33450..81d3d554d261 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -182,6 +182,19 @@ struct inode *ceph_get_snapdir(struct inode *parent)
>   	ci->i_rbytes = 0;
>   	ci->i_btime = ceph_inode(parent)->i_btime;
>   
> +	/* if encrypted, just borrow fscrypt_auth from parent */
> +	if (IS_ENCRYPTED(parent)) {
> +		struct ceph_inode_info *pci = ceph_inode(parent);
> +
> +		ci->fscrypt_auth = kmemdup(pci->fscrypt_auth,
> +					   pci->fscrypt_auth_len,
> +					   GFP_KERNEL);
> +		if (ci->fscrypt_auth) {
> +			inode->i_flags |= S_ENCRYPTED;
> +			ci->fscrypt_auth_len = pci->fscrypt_auth_len;
> +		} else
> +			dout("Failed to alloc memory for fscrypt_auth in snapdir\n");
> +	}

Here I think Jeff has already commented it in your last version, it 
should fail by returning NULL ?

- Xiubo

>   	if (inode->i_state & I_NEW) {
>   		inode->i_op = &ceph_snapdir_iops;
>   		inode->i_fop = &ceph_snapdir_fops;
>
Xiubo Li March 14, 2022, 2:45 a.m. UTC | #3
On 3/12/22 4:30 PM, Xiubo Li wrote:
>
> On 3/11/22 1:26 AM, Luís Henriques wrote:
>> Since filenames in encrypted directories are already encrypted and shown
>> as a base64-encoded string when the directory is locked, snapshot names
>> should show a similar behaviour.
>>
>> Signed-off-by: Luís Henriques <lhenriques@suse.de>
>> ---
>>   fs/ceph/dir.c   |  9 +++++++++
>>   fs/ceph/inode.c | 13 +++++++++++++
>>   2 files changed, 22 insertions(+)
>>
>> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
>> index 6df2a91af236..123e3b9c8161 100644
>> --- a/fs/ceph/dir.c
>> +++ b/fs/ceph/dir.c
>> @@ -1075,6 +1075,15 @@ static int ceph_mkdir(struct user_namespace 
>> *mnt_userns, struct inode *dir,
>>           op = CEPH_MDS_OP_MKSNAP;
>>           dout("mksnap dir %p snap '%pd' dn %p\n", dir,
>>                dentry, dentry);
>> +        /*
>> +         * Encrypted snapshots require d_revalidate to force a
>> +         * LOOKUPSNAP to cleanup dcache
>> +         */
>> +        if (IS_ENCRYPTED(dir)) {
>> +            spin_lock(&dentry->d_lock);
>> +            dentry->d_flags |= DCACHE_NOKEY_NAME;
>
> I think this is not correct fix of this issue.
>
> Actually this dentry's name is a KEY NAME, which is human readable name.
>
> DCACHE_NOKEY_NAME means the base64_encoded names. This usually will be 
> set when filling a new dentry if the directory is locked. If the 
> directory is unlocked the directory inode will be set with the key.
>
> The root cause should be the snapshot's inode doesn't correctly set 
> the encrypt stuff when you are reading from it.
>
> NOTE: when you are 'ls -l .snap/snapXXX' the snapXXX dentry name is 
> correct, it's just corrupted for the file or directory names under 
> snapXXX/.
>
When mksnap in ceph_mkdir() before sending the request out it will 
create a new inode for the snapshot dentry and then will fill the 
ci->fscrypt_auth from .snap's inode, please see 
ceph_mkdir()->ceph_new_inode().

And in the mksnap request reply it will try to fill the ci->fscrypt_auth 
again but failed because it was already filled. This time the auth info 
is from .snap's parent dir from MDS side. In this patch in theory they 
should be the same, but I am still not sure why when decrypting the 
dentry names in snapXXX will fail.

I just guess it possibly will depend on the inode number from the 
related inode or something else. Before the request reply it seems the 
inode isn't set the inode number ?

- Xiubo

>
>> + spin_unlock(&dentry->d_lock);
>> +        }
>>       } else if (ceph_snap(dir) == CEPH_NOSNAP) {
>>           dout("mkdir dir %p dn %p mode 0%ho\n", dir, dentry, mode);
>>           op = CEPH_MDS_OP_MKDIR;
>> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
>> index b573a0f33450..81d3d554d261 100644
>> --- a/fs/ceph/inode.c
>> +++ b/fs/ceph/inode.c
>> @@ -182,6 +182,19 @@ struct inode *ceph_get_snapdir(struct inode 
>> *parent)
>>       ci->i_rbytes = 0;
>>       ci->i_btime = ceph_inode(parent)->i_btime;
>>   +    /* if encrypted, just borrow fscrypt_auth from parent */
>> +    if (IS_ENCRYPTED(parent)) {
>> +        struct ceph_inode_info *pci = ceph_inode(parent);
>> +
>> +        ci->fscrypt_auth = kmemdup(pci->fscrypt_auth,
>> +                       pci->fscrypt_auth_len,
>> +                       GFP_KERNEL);
>> +        if (ci->fscrypt_auth) {
>> +            inode->i_flags |= S_ENCRYPTED;
>> +            ci->fscrypt_auth_len = pci->fscrypt_auth_len;
>> +        } else
>> +            dout("Failed to alloc memory for fscrypt_auth in 
>> snapdir\n");
>> +    }
>
> Here I think Jeff has already commented it in your last version, it 
> should fail by returning NULL ?
>
> - Xiubo
>
>>       if (inode->i_state & I_NEW) {
>>           inode->i_op = &ceph_snapdir_iops;
>>           inode->i_fop = &ceph_snapdir_fops;
>>
Luis Henriques March 14, 2022, 6:32 p.m. UTC | #4
Xiubo Li <xiubli@redhat.com> writes:

> On 3/14/22 10:45 AM, Xiubo Li wrote:
>>
>> On 3/12/22 4:30 PM, Xiubo Li wrote:
>>>
>>> On 3/11/22 1:26 AM, Luís Henriques wrote:
>>>> Since filenames in encrypted directories are already encrypted and shown
>>>> as a base64-encoded string when the directory is locked, snapshot names
>>>> should show a similar behaviour.
>>>>
>>>> Signed-off-by: Luís Henriques <lhenriques@suse.de>
>>>> ---
>>>>   fs/ceph/dir.c   |  9 +++++++++
>>>>   fs/ceph/inode.c | 13 +++++++++++++
>>>>   2 files changed, 22 insertions(+)
>>>>
>>>> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
>>>> index 6df2a91af236..123e3b9c8161 100644
>>>> --- a/fs/ceph/dir.c
>>>> +++ b/fs/ceph/dir.c
>>>> @@ -1075,6 +1075,15 @@ static int ceph_mkdir(struct user_namespace
>>>> *mnt_userns, struct inode *dir,
>>>>           op = CEPH_MDS_OP_MKSNAP;
>>>>           dout("mksnap dir %p snap '%pd' dn %p\n", dir,
>>>>                dentry, dentry);
>>>> +        /*
>>>> +         * Encrypted snapshots require d_revalidate to force a
>>>> +         * LOOKUPSNAP to cleanup dcache
>>>> +         */
>>>> +        if (IS_ENCRYPTED(dir)) {
>>>> +            spin_lock(&dentry->d_lock);
>>>> +            dentry->d_flags |= DCACHE_NOKEY_NAME;
>>>
>>> I think this is not correct fix of this issue.
>>>
>>> Actually this dentry's name is a KEY NAME, which is human readable name.
>>>
>>> DCACHE_NOKEY_NAME means the base64_encoded names. This usually will be set
>>> when filling a new dentry if the directory is locked. If the directory is
>>> unlocked the directory inode will be set with the key.
>>>
>>> The root cause should be the snapshot's inode doesn't correctly set the
>>> encrypt stuff when you are reading from it.
>>>
>>> NOTE: when you are 'ls -l .snap/snapXXX' the snapXXX dentry name is correct,
>>> it's just corrupted for the file or directory names under snapXXX/.
>>>
>> When mksnap in ceph_mkdir() before sending the request out it will create a
>> new inode for the snapshot dentry and then will fill the ci->fscrypt_auth from
>> .snap's inode, please see ceph_mkdir()->ceph_new_inode().
>>
>> And in the mksnap request reply it will try to fill the ci->fscrypt_auth again
>> but failed because it was already filled. This time the auth info is from
>> .snap's parent dir from MDS side. In this patch in theory they should be the
>> same, but I am still not sure why when decrypting the dentry names in snapXXX
>> will fail.
>>
>> I just guess it possibly will depend on the inode number from the related
>> inode or something else. Before the request reply it seems the inode isn't set
>> the inode number ?
>>
> It should be the ci_nonce's problem.

OK, you were right.  However, I don't see a simple way around it.  And I
don't think that adding a fscrypt new interface to copy an existent nonce
makes sense.

So, here's another possible option: instead of setting the
DCACHE_NOKEY_NAME flag, we could simply do d_invalidate(dentry) before
leaving ceph_mkdir (if we're creating an encrypted snapshot, of course).
Would this be acceptable?

Cheers,