From patchwork Fri Nov 20 15:30:03 2020
X-Patchwork-Submitter: Jeff Layton <jlayton@kernel.org>
X-Patchwork-Id: 329194
From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: linux-cachefs@redhat.com, idryomov@redhat.com, dhowells@redhat.com
Subject: [PATCH 2/5] ceph: convert readpage to fscache read helper
Date: Fri, 20 Nov 2020 10:30:03 -0500
Message-Id: <20201120153006.304296-3-jlayton@kernel.org>
In-Reply-To: <20201120153006.304296-1-jlayton@kernel.org>
References: <20201120153006.304296-1-jlayton@kernel.org>

Have the ceph Kconfig select NETFS_SUPPORT. Add a new netfs ops
structure and convert ceph_readpage to use the new netfs_readpage
helper.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
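Note for reviewers new to the netfs helper library: the read helper drives
everything through the ops table added below. It splits a request into
subrequests, clamps each one via ->clamp_length(), and starts I/O via
->issue_op(); the filesystem's one hard obligation is to report each
subrequest's result with netfs_subreq_terminated(), as finish_netfs_read()
does. A minimal hypothetical wiring (invented myfs_* names, not ceph code)
looks like this:

/* Hypothetical sketch of the netfs_readpage() contract; not ceph code. */
static void myfs_issue_op(struct netfs_read_subrequest *subreq)
{
	/*
	 * Start I/O for the byte range subreq->start .. subreq->start +
	 * subreq->len here, then (possibly from a completion callback)
	 * report the outcome.  A short read plus NETFS_SREQ_CLEAR_TAIL
	 * tells the helper to zero the remainder of the range.
	 */
	netfs_subreq_terminated(subreq, -EIO);	/* placeholder result */
}

static const struct netfs_read_request_ops myfs_netfs_ops = {
	.issue_op = myfs_issue_op,
};

static int myfs_readpage(struct file *file, struct page *page)
{
	/* The helper unlocks the page and marks it uptodate as subreqs finish. */
	return netfs_readpage(file, page, &myfs_netfs_ops, NULL);
}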
 fs/ceph/Kconfig |   1 +
 fs/ceph/addr.c  | 167 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 158 insertions(+), 10 deletions(-)

diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
index e955a38be3c8..77ad452337ee 100644
--- a/fs/ceph/Kconfig
+++ b/fs/ceph/Kconfig
@@ -6,6 +6,7 @@ config CEPH_FS
 	select LIBCRC32C
 	select CRYPTO_AES
 	select CRYPTO
+	select NETFS_SUPPORT
 	default n
 	help
 	  Choose Y or M here to include support for mounting the
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 9e657089d56e..9dd9bb3f4696 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -12,6 +12,7 @@
 #include <linux/signal.h>
 #include <linux/iversion.h>
 #include <linux/ktime.h>
+#include <linux/netfs.h>
 
 #include "super.h"
 #include "mds_client.h"
@@ -185,6 +186,162 @@ static int ceph_releasepage(struct page *page, gfp_t gfp_flags)
 	return 1;
 }
 
+static bool ceph_netfs_clamp_length(struct netfs_read_subrequest *subreq)
+{
+	struct inode *inode = subreq->rreq->mapping->host;
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	u64 objno, objoff;
+	u32 xlen;
+
+	/* Truncate the extent at the end of the current object */
+	ceph_calc_file_object_mapping(&ci->i_layout, subreq->start, subreq->len,
+				      &objno, &objoff, &xlen);
+	subreq->len = xlen;
+	return true;
+}
+
+static void finish_netfs_read(struct ceph_osd_request *req)
+{
+	struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode);
+	struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0);
+	struct netfs_read_subrequest *subreq = req->r_priv;
+	int num_pages;
+	int err = req->r_result;
+
+	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
+				 req->r_end_latency, err);
+
+	dout("%s: result %d subreq->len=%zu i_size=%lld\n", __func__, req->r_result,
+	     subreq->len, i_size_read(req->r_inode));
+
+	/* no object means success but no data */
+	if (err == -ENOENT)
+		err = 0;
+	else if (err == -EBLOCKLISTED)
+		fsc->blocklisted = true;
+
+	if (err >= 0 && err < subreq->len)
+		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
+
+	netfs_subreq_terminated(subreq, err);
+
+	num_pages = calc_pages_for(osd_data->alignment, osd_data->length);
+	ceph_put_page_vector(osd_data->pages, num_pages, false);
+	iput(req->r_inode);
+}
+
+static void ceph_netfs_issue_op(struct netfs_read_subrequest *subreq)
+{
+	struct netfs_read_request *rreq = subreq->rreq;
+	struct inode *inode = rreq->mapping->host;
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_osd_request *req = NULL;
+	struct ceph_vino vino = ceph_vino(inode);
+	struct iov_iter iter;
+	struct page **pages;
+	size_t page_off;
+	int err = 0;
+	u64 len = subreq->len;
+
+	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, subreq->start, &len,
+			0, 1, CEPH_OSD_OP_READ,
+			CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica,
+			NULL, ci->i_truncate_seq, ci->i_truncate_size, false);
+	if (IS_ERR(req)) {
+		err = PTR_ERR(req);
+		req = NULL;
+		goto out;
+	}
+
+	dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len);
+	iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, subreq->start, len);
+	err = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off);
+	if (err < 0) {
+		dout("%s: iov_iter_get_pages_alloc returned %d\n", __func__, err);
+		goto out;
+	}
+	len = err;
+
+	/* should always give us a page-aligned read */
+	WARN_ON_ONCE(page_off);
+
+	osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0, false, false);
+	req->r_callback = finish_netfs_read;
+	req->r_priv = subreq;
+	req->r_inode = inode;
+	ihold(inode);
+
+	err = ceph_osdc_start_request(req->r_osdc, req, false);
+	if (err)
+		iput(inode);
+out:
+	if (req)
+		ceph_osdc_put_request(req);
+	if (err)
+		netfs_subreq_terminated(subreq, err);
+	dout("%s: result %d\n", __func__, err);
+}
+
+static void ceph_init_rreq(struct netfs_read_request *rreq, struct file *file)
+{
+	struct ceph_inode_info *ci = ceph_inode(rreq->inode);
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
+
+	if (cookie)
+		rreq->cookie_debug_id = cookie->debug_id;
+}
+
+static bool ceph_is_cache_enabled(struct inode *inode)
+{
+	return fscache_cookie_enabled(ceph_fscache_cookie(ceph_inode(inode)));
+}
+
+static int ceph_begin_cache_operation(struct netfs_read_request *rreq)
+{
+	struct ceph_inode_info *ci = ceph_inode(rreq->inode);
+
+	return fscache_begin_operation(ceph_fscache_cookie(ci), &rreq->cache_resources,
+				       FSCACHE_WANT_PARAMS);
+}
+
+const struct netfs_read_request_ops ceph_readpage_netfs_ops = {
+	.init_rreq = ceph_init_rreq,
+	.is_cache_enabled = ceph_is_cache_enabled,
+	.begin_cache_operation = ceph_begin_cache_operation,
+	.issue_op = ceph_netfs_issue_op,
+	.clamp_length = ceph_netfs_clamp_length,
+};
+
+/* read a single page, without unlocking it. */
+static int ceph_readpage(struct file *filp, struct page *page)
+{
+	struct inode *inode = file_inode(filp);
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_vino vino = ceph_vino(inode);
+	u64 off = page_offset(page);
+	u64 len = PAGE_SIZE;
+
+	if (ci->i_inline_version != CEPH_INLINE_NONE) {
+		/*
+		 * Uptodate inline data should have been added
+		 * into page cache while getting Fcr caps.
+		 */
+		if (off == 0) {
+			unlock_page(page);
+			return -EINVAL;
+		}
+		zero_user_segment(page, 0, PAGE_SIZE);
+		SetPageUptodate(page);
+		unlock_page(page);
+		return 0;
+	}
+
+	dout("readpage ino %llx.%llx file %p off %llu len %llu page %p index %lu\n",
+	     vino.ino, vino.snap, filp, off, len, page, page->index);
+
+	return netfs_readpage(filp, page, &ceph_readpage_netfs_ops, NULL);
+}
+
 /* read a single page, without unlocking it. */
 static int ceph_do_readpage(struct file *filp, struct page *page)
 {
@@ -255,16 +412,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 	return err < 0 ? err : 0;
 }
 
-static int ceph_readpage(struct file *filp, struct page *page)
-{
-	int r = ceph_do_readpage(filp, page);
-	if (r != -EINPROGRESS)
-		unlock_page(page);
-	else
-		r = 0;
-	return r;
-}
-
 /*
  * Finish an async read(ahead) op.
  */
From patchwork Fri Nov 20 15:30:05 2020
X-Patchwork-Submitter: Jeff Layton <jlayton@kernel.org>
X-Patchwork-Id: 329193
From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: linux-cachefs@redhat.com, idryomov@redhat.com, dhowells@redhat.com
Subject: [PATCH 4/5] ceph: convert ceph_readpages to ceph_readahead
Date: Fri, 20 Nov 2020 10:30:05 -0500
Message-Id: <20201120153006.304296-5-jlayton@kernel.org>
In-Reply-To: <20201120153006.304296-1-jlayton@kernel.org>
References: <20201120153006.304296-1-jlayton@kernel.org>

Convert ceph_readpages to ceph_readahead and make it use
netfs_readahead. With this we can rip out a lot of the old
readpage/readpages infrastructure.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
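Note: unlike ->readpages, the new ->readahead entry point gets a
readahead_control describing the window and returns void; netfs_readahead()
takes ownership of the pages, so there is no page list for the caller to
drain on error. A minimal hypothetical shape (invented myfs_* names, not
the ceph code below):

/* Hypothetical ->readahead wiring; myfs_* names are invented. */
static void myfs_readahead(struct readahead_control *ractl)
{
	pr_debug("readahead file=%p index=%lu nr=%u\n",
		 ractl->file, readahead_index(ractl), readahead_count(ractl));

	/*
	 * netfs_readahead() consumes the pages described by ractl: it
	 * unlocks or frees them itself, which is why there is no error
	 * return and no cleanup path here.
	 */
	netfs_readahead(ractl, &myfs_netfs_ops, NULL);
}

static const struct address_space_operations myfs_aops = {
	.readahead = myfs_readahead,
};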
 fs/ceph/addr.c | 230 ++++++++-----------------------------------------
 1 file changed, 34 insertions(+), 196 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index dfd9fa05e0e1..ea17ee7d7218 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -342,215 +342,53 @@ static int ceph_readpage(struct file *filp, struct page *page)
 	return netfs_readpage(filp, page, &ceph_readpage_netfs_ops, NULL);
 }
 
-/*
- * Finish an async read(ahead) op.
- */
-static void finish_read(struct ceph_osd_request *req)
-{
-	struct inode *inode = req->r_inode;
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_osd_data *osd_data;
-	int rc = req->r_result <= 0 ? req->r_result : 0;
-	int bytes = req->r_result >= 0 ? req->r_result : 0;
-	int num_pages;
-	int i;
-
-	dout("finish_read %p req %p rc %d bytes %d\n", inode, req, rc, bytes);
-	if (rc == -EBLOCKLISTED)
-		ceph_inode_to_client(inode)->blocklisted = true;
-
-	/* unlock all pages, zeroing any data we didn't read */
-	osd_data = osd_req_op_extent_osd_data(req, 0);
-	BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_PAGES);
-	num_pages = calc_pages_for((u64)osd_data->alignment,
-				   (u64)osd_data->length);
-	for (i = 0; i < num_pages; i++) {
-		struct page *page = osd_data->pages[i];
-
-		if (rc < 0 && rc != -ENOENT)
-			goto unlock;
-		if (bytes < (int)PAGE_SIZE) {
-			/* zero (remainder of) page */
-			int s = bytes < 0 ? 0 : bytes;
-			zero_user_segment(page, s, PAGE_SIZE);
-		}
-		dout("finish_read %p uptodate %p idx %lu\n", inode, page,
-		     page->index);
-		flush_dcache_page(page);
-		SetPageUptodate(page);
-unlock:
-		unlock_page(page);
-		put_page(page);
-		bytes -= PAGE_SIZE;
-	}
-
-	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
-				 req->r_end_latency, rc);
-
-	kfree(osd_data->pages);
-}
-
-/*
- * start an async read(ahead) operation. return nr_pages we submitted
- * a read for on success, or negative error code.
- */
-static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx,
-		      struct list_head *page_list, int max)
+static void ceph_readahead_cleanup(struct address_space *mapping, void *priv)
 {
-	struct ceph_osd_client *osdc =
-		&ceph_inode_to_client(inode)->client->osdc;
+	struct inode *inode = mapping->host;
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct page *page = lru_to_page(page_list);
-	struct ceph_vino vino;
-	struct ceph_osd_request *req;
-	u64 off;
-	u64 len;
-	int i;
-	struct page **pages;
-	pgoff_t next_index;
-	int nr_pages = 0;
-	int got = 0;
-	int ret = 0;
+	int got = (int)(uintptr_t)priv;
 
-	if (!rw_ctx) {
-		/* caller of readpages does not hold buffer and read caps
-		 * (fadvise, madvise and readahead cases) */
-		int want = CEPH_CAP_FILE_CACHE;
-		ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want,
-					true, &got);
-		if (ret < 0) {
-			dout("start_read %p, error getting cap\n", inode);
-		} else if (!(got & want)) {
-			dout("start_read %p, no cache cap\n", inode);
-			ret = 0;
-		}
-		if (ret <= 0) {
-			if (got)
-				ceph_put_cap_refs(ci, got);
-			while (!list_empty(page_list)) {
-				page = lru_to_page(page_list);
-				list_del(&page->lru);
-				put_page(page);
-			}
-			return ret;
-		}
-	}
-
-	off = (u64) page_offset(page);
-
-	/* count pages */
-	next_index = page->index;
-	list_for_each_entry_reverse(page, page_list, lru) {
-		if (page->index != next_index)
-			break;
-		nr_pages++;
-		next_index++;
-		if (max && nr_pages == max)
-			break;
-	}
-	len = nr_pages << PAGE_SHIFT;
-	dout("start_read %p nr_pages %d is %lld~%lld\n", inode, nr_pages,
-	     off, len);
-	vino = ceph_vino(inode);
-	req = ceph_osdc_new_request(osdc, &ci->i_layout, vino, off, &len,
-				    0, 1, CEPH_OSD_OP_READ,
-				    CEPH_OSD_FLAG_READ, NULL,
-				    ci->i_truncate_seq, ci->i_truncate_size,
-				    false);
-	if (IS_ERR(req)) {
-		ret = PTR_ERR(req);
-		goto out;
-	}
-
-	/* build page vector */
-	nr_pages = calc_pages_for(0, len);
-	pages = kmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
-	if (!pages) {
-		ret = -ENOMEM;
-		goto out_put;
-	}
-	for (i = 0; i < nr_pages; ++i) {
-		page = list_entry(page_list->prev, struct page, lru);
-		BUG_ON(PageLocked(page));
-		list_del(&page->lru);
-
-		dout("start_read %p adding %p idx %lu\n", inode, page,
-		     page->index);
-		if (add_to_page_cache_lru(page, &inode->i_data, page->index,
-					  GFP_KERNEL)) {
-			put_page(page);
-			dout("start_read %p add_to_page_cache failed %p\n",
-			     inode, page);
-			nr_pages = i;
-			if (nr_pages > 0) {
-				len = nr_pages << PAGE_SHIFT;
-				osd_req_op_extent_update(req, 0, len);
-				break;
-			}
-			goto out_pages;
-		}
-		pages[i] = page;
-	}
-	osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0, false, false);
-	req->r_callback = finish_read;
-	req->r_inode = inode;
-
-	dout("start_read %p starting %p %lld~%lld\n", inode, req, off, len);
-	ret = ceph_osdc_start_request(osdc, req, false);
-	if (ret < 0)
-		goto out_pages;
-	ceph_osdc_put_request(req);
-
-	/* After adding locked pages to page cache, the inode holds cache cap.
-	 * So we can drop our cap refs. */
 	if (got)
 		ceph_put_cap_refs(ci, got);
-
-	return nr_pages;
-
-out_pages:
-	for (i = 0; i < nr_pages; ++i) {
-		unlock_page(pages[i]);
-	}
-	ceph_put_page_vector(pages, nr_pages, false);
-out_put:
-	ceph_osdc_put_request(req);
-out:
-	if (got)
-		ceph_put_cap_refs(ci, got);
-	return ret;
 }
 
+const struct netfs_read_request_ops ceph_readahead_netfs_ops = {
+	.init_rreq = ceph_init_rreq,
+	.is_cache_enabled = ceph_is_cache_enabled,
+	.begin_cache_operation = ceph_begin_cache_operation,
+	.issue_op = ceph_netfs_issue_op,
+	.clamp_length = ceph_netfs_clamp_length,
+	.cleanup = ceph_readahead_cleanup,
+};
 
-
-/*
- * Read multiple pages.  Leave pages we don't read + unlock in page_list;
- * the caller (VM) cleans them up.
- */
-static int ceph_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *page_list, unsigned nr_pages)
+static void ceph_readahead(struct readahead_control *ractl)
 {
-	struct inode *inode = file_inode(file);
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_file_info *fi = file->private_data;
+	struct inode *inode = file_inode(ractl->file);
+	struct ceph_file_info *fi = ractl->file->private_data;
 	struct ceph_rw_context *rw_ctx;
-	int rc = 0;
-	int max = 0;
+	int got = 0;
+	int ret = 0;
 
 	if (ceph_inode(inode)->i_inline_version != CEPH_INLINE_NONE)
-		return -EINVAL;
+		return;
 
 	rw_ctx = ceph_find_rw_context(fi);
-	max = fsc->mount_options->rsize >> PAGE_SHIFT;
-	dout("readpages %p file %p ctx %p nr_pages %d max %d\n",
-	     inode, file, rw_ctx, nr_pages, max);
-	while (!list_empty(page_list)) {
-		rc = start_read(inode, rw_ctx, page_list, max);
-		if (rc < 0)
-			goto out;
+	if (!rw_ctx) {
+		/*
+		 * readahead callers do not necessarily hold Fcb caps
+		 * (e.g. fadvise, madvise).
+		 */
+		int want = CEPH_CAP_FILE_CACHE;
+
+		ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got);
+		if (ret < 0)
+			dout("start_read %p, error getting cap\n", inode);
+		else if (!(got & want))
+			dout("start_read %p, no cache cap\n", inode);
+
+		if (ret <= 0)
+			return;
 	}
-out:
-	dout("readpages %p file %p ret %d\n", inode, file, rc);
-	return rc;
+
+	netfs_readahead(ractl, &ceph_readahead_netfs_ops, (void *)(uintptr_t)got);
 }
 
 struct ceph_writeback_ctl
@@ -1503,7 +1341,7 @@ static ssize_t ceph_direct_io(struct kiocb *iocb, struct iov_iter *iter)
 
 const struct address_space_operations ceph_aops = {
 	.readpage = ceph_readpage,
-	.readpages = ceph_readpages,
+	.readahead = ceph_readahead,
 	.writepage = ceph_writepage,
 	.writepages = ceph_writepages_start,
 	.write_begin = ceph_write_begin,

From patchwork Fri Nov 20 15:30:06 2020
X-Patchwork-Submitter: Jeff Layton <jlayton@kernel.org>
X-Patchwork-Id: 329192
From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: linux-cachefs@redhat.com, idryomov@redhat.com, dhowells@redhat.com
Subject: [PATCH 5/5] ceph: add fscache writeback support
Date: Fri, 20 Nov 2020 10:30:06 -0500
Message-Id: <20201120153006.304296-6-jlayton@kernel.org>
In-Reply-To: <20201120153006.304296-1-jlayton@kernel.org>
References: <20201120153006.304296-1-jlayton@kernel.org>

When updating the backing store from the pagecache (à la writepage or
writepages), write to the cache first. This allows us to keep caching
files even when they are open for write.
With this change we can now re-enable CEPH_FSCACHE in Kconfig.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
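Note: fscache_write_to_cache() below is the API from the fscache/netfs
rewrite this series sits on top of. It writes the given range from the
pagecache to the cache and invokes a completion callback, where -ENOBUFS
simply means there was no cache to write to. A condensed hypothetical use,
in which myfs_get_cookie() is an invented stand-in for the filesystem's
cookie lookup (ceph's is ceph_fscache_cookie()):

/* Sketch only; not the ceph code in the patch below. */
static void myfs_write_done(void *priv, ssize_t error)
{
	/* -ENOBUFS means "no cache available" and is not a real failure */
	if (IS_ERR_VALUE(error) && error != -ENOBUFS)
		pr_warn("fscache write failed: %zd\n", error);
}

static void myfs_write_to_cache(struct inode *inode, u64 off, u64 len)
{
	fscache_write_to_cache(myfs_get_cookie(inode), inode->i_mapping,
			       off, len, i_size_read(inode),
			       myfs_write_done, inode);
}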
 fs/ceph/Kconfig |  2 +-
 fs/ceph/addr.c  | 64 +++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
index 77ad452337ee..94df854147d3 100644
--- a/fs/ceph/Kconfig
+++ b/fs/ceph/Kconfig
@@ -21,7 +21,7 @@ config CEPH_FS

if CEPH_FS
config CEPH_FSCACHE
	bool "Enable Ceph client caching support"
-	depends on CEPH_FS=m && FSCACHE_OLD || CEPH_FS=y && FSCACHE_OLD=y
+	depends on CEPH_FS=m && FSCACHE || CEPH_FS=y && FSCACHE=y
	help
	  Choose Y here to enable persistent, read-only local caching
	  support for Ceph clients using FS-Cache
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ea17ee7d7218..3bd9a2922e4f 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -5,7 +5,6 @@
 #include <linux/fs.h>
 #include <linux/mm.h>
 #include <linux/pagemap.h>
-#include <linux/writeback.h>	/* generic_writepages */
 #include <linux/slab.h>
 #include <linux/pagevec.h>
 #include <linux/task_io_accounting_ops.h>
@@ -391,6 +390,39 @@ static void ceph_readahead(struct readahead_control *ractl)
 	netfs_readahead(ractl, &ceph_readahead_netfs_ops, (void *)(uintptr_t)got);
 }
 
+#ifdef CONFIG_CEPH_FSCACHE
+static void ceph_fscache_write_terminated(void *priv, ssize_t error)
+{
+	struct inode *inode = priv;
+
+	if (IS_ERR_VALUE(error) && error != -ENOBUFS)
+		ceph_fscache_invalidate(inode, 0);
+}
+
+static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len)
+{
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
+
+	fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode),
+			       ceph_fscache_write_terminated, inode);
+}
+
+static inline bool CephTestSetPageFsCache(struct page *page)
+{
+	return TestSetPageFsCache(page);
+}
+#else
+static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len)
+{
+}
+
+static inline bool CephTestSetPageFsCache(struct page *page)
+{
+	return false;
+}
+#endif /* CONFIG_CEPH_FSCACHE */
+
 struct ceph_writeback_ctl
 {
 	loff_t i_size;
@@ -544,16 +576,17 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 	    CONGESTION_ON_THRESH(fsc->mount_options->congestion_kb))
 		set_bdi_congested(inode_to_bdi(inode), BLK_RW_ASYNC);
 
-	set_page_writeback(page);
 	req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode),
 				    page_off, &len, 0, 1, CEPH_OSD_OP_WRITE,
 				    CEPH_OSD_FLAG_WRITE, snapc,
 				    ceph_wbc.truncate_seq, ceph_wbc.truncate_size,
 				    true);
-	if (IS_ERR(req)) {
-		redirty_page_for_writepage(wbc, page);
-		end_page_writeback(page);
+	if (IS_ERR(req))
 		return PTR_ERR(req);
-	}
+
+	set_page_writeback(page);
+	if (CephTestSetPageFsCache(page))
+		BUG();
+	ceph_fscache_write_to_cache(inode, page_off, len);
 
 	/* it may be a short write due to an object boundary */
 	WARN_ON_ONCE(len > PAGE_SIZE);
@@ -612,6 +645,9 @@ static int ceph_writepage(struct page *page, struct writeback_control *wbc)
 	struct inode *inode = page->mapping->host;
 	BUG_ON(!inode);
 	ihold(inode);
+
+	ceph_wait_on_page_fscache(page);
+
 	err = writepage_nounlock(page, wbc);
 	if (err == -ERESTARTSYS) {
 		/* direct memory reclaimer was killed by SIGKILL. return 0
@@ -856,7 +892,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 				unlock_page(page);
 				break;
 			}
-			if (PageWriteback(page)) {
+			if (PageWriteback(page) || PageFsCache(page)) {
 				if (wbc->sync_mode == WB_SYNC_NONE) {
 					dout("%p under writeback\n", page);
 					unlock_page(page);
@@ -864,6 +900,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 				}
 				dout("waiting on writeback %p\n", page);
 				wait_on_page_writeback(page);
+				ceph_wait_on_page_fscache(page);
 			}
 
 			if (!clear_page_dirty_for_io(page)) {
@@ -996,9 +1033,19 @@ static int ceph_writepages_start(struct address_space *mapping,
 		op_idx = 0;
 		for (i = 0; i < locked_pages; i++) {
 			u64 cur_offset = page_offset(pages[i]);
+			/*
+			 * Discontinuity in page range? Ceph can handle that by just passing
+			 * multiple extents in the write op.
+			 */
 			if (offset + len != cur_offset) {
+				/* If it's full, stop here */
 				if (op_idx + 1 == req->r_num_ops)
 					break;
+
+				/* Kick off an fscache write with what we have so far. */
+				ceph_fscache_write_to_cache(inode, offset, len);
+
+				/* Start a new extent */
 				osd_req_op_extent_dup_last(req, op_idx,
 							   cur_offset - offset);
 				dout("writepages got pages at %llu~%llu\n",
@@ -1015,8 +1062,11 @@ static int ceph_writepages_start(struct address_space *mapping,
 			}
 
 			set_page_writeback(pages[i]);
+			if (CephTestSetPageFsCache(pages[i]))
+				BUG();
 			len += PAGE_SIZE;
 		}
+		ceph_fscache_write_to_cache(inode, offset, len);
 
 		if (ceph_wbc.size_stable) {
 			len = min(len, ceph_wbc.i_size - offset);