diff mbox series

[1/2] memfd: Fix COW issue on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings

Message ID 20191107195355.80608-1-joel@joelfernandes.org
State New
Headers show
Series [1/2] memfd: Fix COW issue on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings | expand

Commit Message

Joel Fernandes (Google) Nov. 7, 2019, 7:53 p.m. UTC
From: Nicolas Geoffray <ngeoffray@google.com>


F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:
A private mapping created after the memfd file that gets sealed with
F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning
children and parent share the same memory, even though the mapping is
private.

The reason for this is due to the code below:

static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
{
        struct shmem_inode_info *info = SHMEM_I(file_inode(file));

        if (info->seals & F_SEAL_FUTURE_WRITE) {
                /*
                 * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
                 * "future write" seal active.
                 */
                if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE))
                        return -EPERM;

                /*
                 * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED
                 * read-only mapping, take care to not allow mprotect to revert
                 * protections.
                 */
                vma->vm_flags &= ~(VM_MAYWRITE);
        }
        ...
}

And for the mm to know if a mapping is copy-on-write:
static inline bool is_cow_mapping(vm_flags_t flags)
{
        return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

The patch fixes the issue by making the mprotect revert protection
happen only for shared mappings. For private mappings, using mprotect
will have no effect on the seal behavior.

Cc: kernel-team@android.com
Signed-off-by: Nicolas Geoffray <ngeoffray@google.com>

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>


---
Google bug: 143833776

 mm/shmem.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

-- 
2.24.0.rc1.363.gb1bccd3e3d-goog

Comments

Andrew Morton Nov. 8, 2019, 1 a.m. UTC | #1
On Thu,  7 Nov 2019 14:53:54 -0500 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:

> A private mapping created after the memfd file that gets sealed with

> F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning

> children and parent share the same memory, even though the mapping is

> private.


That sounds fairly serious.  Should this be backported into -stable kernels?
Joel Fernandes (Google) Nov. 8, 2019, 2:06 a.m. UTC | #2
On Thu, Nov 07, 2019 at 05:00:23PM -0800, Andrew Morton wrote:
> On Thu,  7 Nov 2019 14:53:54 -0500 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> 

> > F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:

> > A private mapping created after the memfd file that gets sealed with

> > F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning

> > children and parent share the same memory, even though the mapping is

> > private.

> 

> That sounds fairly serious.  Should this be backported into -stable kernels?


Yes, it should be. The F_SEAL_FUTURE_WRITE feature was introduced in v5.1 so
v5.3.x stable kernels would need a backport. I can submit a backport tomorrow
unless we are Ok with stable automatically picking it up (I believe the
stable folks "auto select" fixes which should detect this is a fix since I
have said it is a fix in the subject line).

thanks,

 - Joel
Andrew Morton Nov. 8, 2019, 3:25 a.m. UTC | #3
On Thu, 7 Nov 2019 21:06:14 -0500 Joel Fernandes <joel@joelfernandes.org> wrote:

> On Thu, Nov 07, 2019 at 05:00:23PM -0800, Andrew Morton wrote:

> > On Thu,  7 Nov 2019 14:53:54 -0500 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> > 

> > > F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:

> > > A private mapping created after the memfd file that gets sealed with

> > > F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning

> > > children and parent share the same memory, even though the mapping is

> > > private.

> > 

> > That sounds fairly serious.  Should this be backported into -stable kernels?

> 

> Yes, it should be.


I added

Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
Cc: <stable@vger.kernel.org>

> The F_SEAL_FUTURE_WRITE feature was introduced in v5.1 so

> v5.3.x stable kernels would need a backport. I can submit a backport tomorrow

> unless we are Ok with stable automatically picking it up (I believe the

> stable folks "auto select" fixes which should detect this is a fix since I

> have said it is a fix in the subject line).


The Cc:stable tag should trigger the appropriate actions, assisted by
the Fixes:.  I doubt if "fix" in the Subject has much effect.
Christoph Hellwig Nov. 8, 2019, 6:33 a.m. UTC | #4
> -		 * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED

> -		 * read-only mapping, take care to not allow mprotect to revert

> -		 * protections.

> +		 * Since an F_SEAL_FUTURE_WRITE sealed memfd can be mapped as

> +		 * MAP_SHARED and read-only, take care to not allow mprotect to

> +		 * revert protections on such mappings. Do this only for shared

> +		 * mappings. For private mappings, don't need to mask VM_MAYWRITE


This adds an > 80 char line.
Greg Kroah-Hartman Nov. 8, 2019, 6:37 a.m. UTC | #5
On Thu, Nov 07, 2019 at 09:06:14PM -0500, Joel Fernandes wrote:
> On Thu, Nov 07, 2019 at 05:00:23PM -0800, Andrew Morton wrote:

> > On Thu,  7 Nov 2019 14:53:54 -0500 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> > 

> > > F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:

> > > A private mapping created after the memfd file that gets sealed with

> > > F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning

> > > children and parent share the same memory, even though the mapping is

> > > private.

> > 

> > That sounds fairly serious.  Should this be backported into -stable kernels?

> 

> Yes, it should be. The F_SEAL_FUTURE_WRITE feature was introduced in v5.1 so

> v5.3.x stable kernels would need a backport. I can submit a backport tomorrow

> unless we are Ok with stable automatically picking it up (I believe the

> stable folks "auto select" fixes which should detect this is a fix since I

> have said it is a fix in the subject line).


Never rely on "auto select" to pick up a patch for stable if you already
know it should go to stable.  Just mark it as such, or tell stable@vger
after the fact.

thanks,

greg k-h
Joel Fernandes (Google) Nov. 8, 2019, 3:34 p.m. UTC | #6
On Fri, Nov 08, 2019 at 07:37:15AM +0100, Greg KH wrote:
> On Thu, Nov 07, 2019 at 09:06:14PM -0500, Joel Fernandes wrote:

> > On Thu, Nov 07, 2019 at 05:00:23PM -0800, Andrew Morton wrote:

> > > On Thu,  7 Nov 2019 14:53:54 -0500 "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> > > 

> > > > F_SEAL_FUTURE_WRITE has unexpected behavior when used with MAP_PRIVATE:

> > > > A private mapping created after the memfd file that gets sealed with

> > > > F_SEAL_FUTURE_WRITE loses the copy-on-write at fork behavior, meaning

> > > > children and parent share the same memory, even though the mapping is

> > > > private.

> > > 

> > > That sounds fairly serious.  Should this be backported into -stable kernels?

> > 

> > Yes, it should be. The F_SEAL_FUTURE_WRITE feature was introduced in v5.1 so

> > v5.3.x stable kernels would need a backport. I can submit a backport tomorrow

> > unless we are Ok with stable automatically picking it up (I believe the

> > stable folks "auto select" fixes which should detect this is a fix since I

> > have said it is a fix in the subject line).

> 

> Never rely on "auto select" to pick up a patch for stable if you already

> know it should go to stable.  Just mark it as such, or tell stable@vger

> after the fact.


Sure, agreed.

Thanks Andrew for adding the tags!

thanks,

 - Joel
Joel Fernandes (Google) Nov. 8, 2019, 3:35 p.m. UTC | #7
On Thu, Nov 07, 2019 at 10:33:08PM -0800, Christoph Hellwig wrote:
> > -		 * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED

> > -		 * read-only mapping, take care to not allow mprotect to revert

> > -		 * protections.

> > +		 * Since an F_SEAL_FUTURE_WRITE sealed memfd can be mapped as

> > +		 * MAP_SHARED and read-only, take care to not allow mprotect to

> > +		 * revert protections on such mappings. Do this only for shared

> > +		 * mappings. For private mappings, don't need to mask VM_MAYWRITE

> 

> This adds an > 80 char line.


Oh, true. Sorry. Andrew I hate to ask you but since you took the patch
already, could you just the comment for the character limit in the one
you applied?

thanks,

 - Joel
diff mbox series

Patch

diff --git a/mm/shmem.c b/mm/shmem.c
index 447fd575587c..6ac5e867ef13 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2214,11 +2214,14 @@  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 			return -EPERM;
 
 		/*
-		 * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED
-		 * read-only mapping, take care to not allow mprotect to revert
-		 * protections.
+		 * Since an F_SEAL_FUTURE_WRITE sealed memfd can be mapped as
+		 * MAP_SHARED and read-only, take care to not allow mprotect to
+		 * revert protections on such mappings. Do this only for shared
+		 * mappings. For private mappings, don't need to mask VM_MAYWRITE
+		 * as we still want them to be COW-writable.
 		 */
-		vma->vm_flags &= ~(VM_MAYWRITE);
+		if (vma->vm_flags & VM_SHARED)
+			vma->vm_flags &= ~(VM_MAYWRITE);
 	}
 
 	file_accessed(file);