Message ID | 20200917151050.5363-1-willy@infradead.org |
---|---|
Headers | show |
Series | Allow readpage to return a locked page | expand |
On Thu, Sep 17, 2020 at 04:10:38PM +0100, Matthew Wilcox (Oracle) wrote: > +++ b/mm/filemap.c > @@ -2254,8 +2254,10 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb, > * PG_error will be set again if readpage fails. > */ > ClearPageError(page); > - /* Start the actual read. The read will unlock the page. */ > + /* Start the actual read. The read may unlock the page. */ > error = mapping->a_ops->readpage(filp, page); > + if (error == AOP_UPDATED_PAGE) > + goto page_ok; > > if (unlikely(error)) { > if (error == AOP_TRUNCATED_PAGE) { If anybody wants to actually test this, this hunk is wrong. +++ b/mm/filemap.c @@ -2256,8 +2256,11 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb, ClearPageError(page); /* Start the actual read. The read may unlock the page. */ error = mapping->a_ops->readpage(filp, page); - if (error == AOP_UPDATED_PAGE) + if (error == AOP_UPDATED_PAGE) { + unlock_page(page); + error = 0; goto page_ok; + } if (unlikely(error)) { if (error == AOP_TRUNCATED_PAGE) { > @@ -2619,7 +2621,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) > */ > if (unlikely(!PageUptodate(page))) > goto page_not_uptodate; > - > +page_ok: > /* > * We've made it this far and we had to drop our mmap_lock, now is the > * time to return to the upper layer and have it re-find the vma and > @@ -2654,6 +2656,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) > ClearPageError(page); > fpin = maybe_unlock_mmap_for_io(vmf, fpin); > error = mapping->a_ops->readpage(file, page); > + if (error == AOP_UPDATED_PAGE) > + goto page_ok; > if (!error) { > wait_on_page_locked(page); > if (!PageUptodate(page)) > @@ -2867,6 +2871,10 @@ static struct page *do_read_cache_page(struct address_space *mapping, > err = filler(data, page); > else > err = mapping->a_ops->readpage(data, page); > + if (err == AOP_UPDATED_PAGE) { > + unlock_page(page); > + goto out; > + } > > if (err < 0) { > put_page(page); > -- > 2.28.0 >
Matthew Wilcox (Oracle) wrote on Thu, Sep 17, 2020: > diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c > index cce9ace651a2..506ca0ba2ec7 100644 > --- a/fs/9p/vfs_addr.c > +++ b/fs/9p/vfs_addr.c > @@ -280,6 +280,10 @@ static int v9fs_write_begin(struct file *filp, struct address_space *mapping, > goto out; > > retval = v9fs_fid_readpage(v9inode->writeback_fid, page); > + if (retval == AOP_UPDATED_PAGE) { > + retval = 0; > + goto out; > + } FWIW this is a change of behaviour; for some reason the code used to loop back to grab_cache_page_write_begin() and bail out on PageUptodate() I suppose; some sort of race check? The whole pattern is a bit weird to me and 9p has no guarantee on concurrent writes to a file with cache enabled (except that it will corrupt something), so this part is fine with me. What I'm curious about is the page used to be both unlocked and put, but now isn't either and the return value hasn't changed for the caller to make a difference on write_begin / I don't see any code change in the vfs to handle that. What did I miss? (FWIW at least cifs in the series has the same pattern change; didn't check all of them) Thanks, -- Dominique
On Fri, Sep 18, 2020 at 07:59:19AM +0200, Dominique Martinet wrote: > Matthew Wilcox (Oracle) wrote on Thu, Sep 17, 2020: > > diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c > > index cce9ace651a2..506ca0ba2ec7 100644 > > --- a/fs/9p/vfs_addr.c > > +++ b/fs/9p/vfs_addr.c > > @@ -280,6 +280,10 @@ static int v9fs_write_begin(struct file *filp, struct address_space *mapping, > > goto out; > > > > retval = v9fs_fid_readpage(v9inode->writeback_fid, page); > > + if (retval == AOP_UPDATED_PAGE) { > > + retval = 0; > > + goto out; > > + } > > FWIW this is a change of behaviour; for some reason the code used to > loop back to grab_cache_page_write_begin() and bail out on > PageUptodate() I suppose; some sort of race check? > The whole pattern is a bit weird to me and 9p has no guarantee on > concurrent writes to a file with cache enabled (except that it will > corrupt something), so this part is fine with me. > > What I'm curious about is the page used to be both unlocked and put, but > now isn't either and the return value hasn't changed for the caller to > make a difference on write_begin / I don't see any code change in the > vfs to handle that. > What did I miss? The page cache is kind of subtle. The grab_cache_page_write_begin() will return a Locked page with an increased refcount. If it's Uptodate, that's exactly what we want, and we return it. If we have to read the page, readpage used to unlock the page before returning, and rather than re-lock it, we would drop the reference to the page and look it up again. It's possible that after dropping the lock on that page that the page was replaced in the page cache and so we'd get a different page. Anyway, now (unless fscache is involved), v9fs_fid_readpage will return the page without unlocking it. So we don't need to do the dance of dropping the lock, putting the refcount and looking the page back up again. We can just return the page. The VFS doesn't need a special return code because nothing has changed from the VFS's point of view -- it asked you to get a page and you got the page.