
ceph: fix test for whether we can skip read when writing beyond EOF

Message ID 20210625175951.90347-1-jlayton@kernel.org
State Superseded
Series ceph: fix test for whether we can skip read when writing beyond EOF

Commit Message

Jeff Layton June 25, 2021, 5:59 p.m. UTC
commit 827a746f405d upstream.

It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.

Add a new helper function that corrects and clarifies the logic of
when we can skip reads, and have it only zero out the part of the page
that won't have data copied in for the write.

Cc: <stable@vger.kernel.org> # v5.10+
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Fixes: 1cc1699070bd ("ceph: fold ceph_update_writeable_page into ceph_write_begin")
Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---
 fs/ceph/addr.c | 54 ++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 13 deletions(-)

This bug was originally in ceph, and then got replicated in the new
netfs helper code in v5.13. This patch is a backport of the netfs patch
for ceph. It should be applied to 5.10.y - 5.12.y.
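
The difference between the old and new tests can be sketched in userspace. This is only an illustrative model, not kernel code: MODEL_PAGE_SIZE, old_can_skip(), new_can_skip() and model_self_check() are hypothetical names, a 4 KiB page is assumed, and the zeroing done by zero_user_segments() is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096LL

/* Model of the test removed by the patch: skips the read whenever pos >= i_size. */
bool old_can_skip(long long pos, size_t len, long long i_size)
{
	long long offset = pos % MODEL_PAGE_SIZE;

	return (offset == 0 && len == (size_t)MODEL_PAGE_SIZE) ||
	       pos >= i_size ||
	       (offset == 0 && pos + (long long)len >= i_size);
}

/* Model of the test added by the patch: compares the start of the page with i_size. */
bool new_can_skip(long long pos, size_t len, long long i_size)
{
	long long offset = pos % MODEL_PAGE_SIZE;

	/* Full page write */
	if (offset == 0 && len >= (size_t)MODEL_PAGE_SIZE)
		return true;

	/* The whole page lies at or beyond EOF */
	if (pos - offset >= i_size)
		return true;

	/* Write covers the page from its start to EOF or beyond */
	if (offset == 0 && pos + (long long)len >= i_size)
		return true;

	return false;
}

/* A 100-byte file, with 10 bytes written at offset 200 of the same page. */
void model_self_check(void)
{
	/* Old test skips the read, so bytes 0..99 would be zeroed and lost. */
	assert(old_can_skip(200, 10, 100));
	/* New test requires the read, because the page still holds file data. */
	assert(!new_can_skip(200, 10, 100));
	/* A write into a page entirely beyond EOF can still skip the read. */
	assert(new_can_skip(MODEL_PAGE_SIZE + 200, 10, 100));
}
```

Calling model_self_check() from a small main() exercises the case described in the commit message: the write position is beyond EOF, but the head of the page still contains data that has to be read in first.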

Comments

Greg KH June 27, 2021, 2:22 p.m. UTC | #1
On Fri, Jun 25, 2021 at 01:59:51PM -0400, Jeff Layton wrote:
> commit 827a746f405d upstream.


No it is not :(

Please fix this up and resend it with the correct git id.

thanks,

greg k-h
Jeff Layton June 27, 2021, 3:41 p.m. UTC | #2
On Sun, 2021-06-27 at 16:22 +0200, Greg KH wrote:
> On Fri, Jun 25, 2021 at 01:59:51PM -0400, Jeff Layton wrote:
> > commit 827a746f405d upstream.
>
> No it is not :(
>
> Please fix this up and resend it with the correct git id.
>
> thanks,

Are you sure?

    $ git log --oneline origin/master -- fs/netfs
    827a746f405d (tag: netfs-fixes-20210621, dhowells/afs-fixes) netfs: fix test for whether we can skip read when writing beyond EOF

"origin" is Linus' tree. I'm not sure what I'm doing wrong otherwise.
-- 
Jeff Layton <jlayton@kernel.org>
Greg KH June 27, 2021, 4:02 p.m. UTC | #3
On Sun, Jun 27, 2021 at 11:41:45AM -0400, Jeff Layton wrote:
> On Sun, 2021-06-27 at 16:22 +0200, Greg KH wrote:
> > On Fri, Jun 25, 2021 at 01:59:51PM -0400, Jeff Layton wrote:
> > > commit 827a746f405d upstream.
> >
> > No it is not :(
> >
> > Please fix this up and resend it with the correct git id.
> >
> > thanks,
> >
>
> Are you sure?
>
>     $ git log --oneline origin/master -- fs/netfs
>     827a746f405d (tag: netfs-fixes-20210621, dhowells/afs-fixes) netfs: fix test for whether we can skip read when writing beyond EOF
>
> "origin" is Linus' tree. I'm not sure what I'm doing wrong otherwise.


Commit 827a746f405d ("netfs: fix test for whether we can skip read when
writing beyond EOF") is just that, yes.

That does not match with the subject line here, or the patch itself.

So I do not understand what you are trying to do here...

thanks,

greg k-h
Matthew Wilcox June 27, 2021, 5:22 p.m. UTC | #4
On Sun, Jun 27, 2021 at 06:02:32PM +0200, Greg KH wrote:
> On Sun, Jun 27, 2021 at 11:41:45AM -0400, Jeff Layton wrote:
> > On Sun, 2021-06-27 at 16:22 +0200, Greg KH wrote:
> > > On Fri, Jun 25, 2021 at 01:59:51PM -0400, Jeff Layton wrote:
> > > > commit 827a746f405d upstream.
> > >
> > > No it is not :(
> > >
> > > Please fix this up and resend it with the correct git id.
> > >
> > > thanks,
> > >
> >
> > Are you sure?
> >
> >     $ git log --oneline origin/master -- fs/netfs
> >     827a746f405d (tag: netfs-fixes-20210621, dhowells/afs-fixes) netfs: fix test for whether we can skip read when writing beyond EOF
> >
> > "origin" is Linus' tree. I'm not sure what I'm doing wrong otherwise.
>
> Commit 827a746f405d ("netfs: fix test for whether we can skip read when
> writing beyond EOF") is just that, yes.
>
> That does not match with the subject line here, or the patch itself.
>
> So I do not understand what you are trying to do here...


That was in the original message:

> This bug was originally in ceph, and then got replicated in the new
> netfs helper code in v5.13. This patch is a backport of the netfs patch
> for ceph. It should be applied to 5.10.y - 5.12.y.


ie the code got moved from ceph to netfs upstream, and this is a
backport from netfs to ceph.

Patch

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 26e66436f005..c000fe338f7e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1302,6 +1302,45 @@  ceph_find_incompatible(struct page *page)
 	return NULL;
 }
 
+/**
+ * skip_page_read - prep a page for writing without reading first
+ * @page: page being prepared
+ * @pos: starting position for the write
+ * @len: length of write
+ *
+ * In some cases, write_begin doesn't need to read at all:
+ * - full page write
+ * - file is currently zero-length
+ * - write that lies in a page that is completely beyond EOF
+ * - write that covers the page from start to EOF or beyond it
+ *
+ * If any of these criteria are met, then zero out the unwritten parts
+ * of the page and return true. Otherwise, return false.
+ */
+static bool skip_page_read(struct page *page, loff_t pos, size_t len)
+{
+	struct inode *inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	size_t offset = offset_in_page(pos);
+
+	/* Full page write */
+	if (offset == 0 && len >= PAGE_SIZE)
+		return true;
+
+	/* pos beyond last page in the file */
+	if (pos - offset >= i_size)
+		goto zero_out;
+
+	/* write that covers the whole page from start to EOF or beyond it */
+	if (offset == 0 && (pos + len) >= i_size)
+		goto zero_out;
+
+	return false;
+zero_out:
+	zero_user_segments(page, 0, offset, offset + len, PAGE_SIZE);
+	return true;
+}
+
 /*
  * We are only allowed to write into/dirty the page if the page is
  * clean, or already dirty within the same snap context.
@@ -1315,7 +1354,6 @@  static int ceph_write_begin(struct file *file, struct address_space *mapping,
 	struct ceph_snap_context *snapc;
 	struct page *page = NULL;
 	pgoff_t index = pos >> PAGE_SHIFT;
-	int pos_in_page = pos & ~PAGE_MASK;
 	int r = 0;
 
 	dout("write_begin file %p inode %p page %p %d~%d\n", file, inode, page, (int)pos, (int)len);
@@ -1350,19 +1388,9 @@  static int ceph_write_begin(struct file *file, struct address_space *mapping,
 			break;
 		}
 
-		/*
-		 * In some cases we don't need to read at all:
-		 * - full page write
-		 * - write that lies completely beyond EOF
-		 * - write that covers the the page from start to EOF or beyond it
-		 */
-		if ((pos_in_page == 0 && len == PAGE_SIZE) ||
-		    (pos >= i_size_read(inode)) ||
-		    (pos_in_page == 0 && (pos + len) >= i_size_read(inode))) {
-			zero_user_segments(page, 0, pos_in_page,
-					   pos_in_page + len, PAGE_SIZE);
+		/* No need to read in some cases */
+		if (skip_page_read(page, pos, len))
 			break;
-		}
 
 		/*
 		 * We need to read it. If we get back -EINPROGRESS, then the page was