From patchwork Wed May 27 09:46:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Gurtovoy X-Patchwork-Id: 218451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5AA24C433E0 for ; Wed, 27 May 2020 09:46:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 39B4520888 for ; Wed, 27 May 2020 09:46:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725939AbgE0Jqm (ORCPT ); Wed, 27 May 2020 05:46:42 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:51564 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729072AbgE0Jql (ORCPT ); Wed, 27 May 2020 05:46:41 -0400 Received: from Internal Mail-Server by MTLPINE1 (envelope-from maxg@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 May 2020 12:46:35 +0300 Received: from mtr-vdi-031.wap.labs.mlnx. (mtr-vdi-031.wap.labs.mlnx [10.209.102.136]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 04R9kYii009430; Wed, 27 May 2020 12:46:35 +0300 From: Max Gurtovoy To: jgg@mellanox.com, dledford@redhat.com, leon@kernel.org, galpress@amazon.com, dennis.dalessandro@intel.com, netdev@vger.kernel.org, sagi@grimberg.me, linux-rdma@vger.kernel.org, bvanassche@acm.org, santosh.shilimkar@oracle.com, tom@talpey.com Cc: aron.silverton@oracle.com, israelr@mellanox.com, oren@mellanox.com, shlomin@mellanox.com, vladimirk@mellanox.com, Max Gurtovoy Subject: [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers Date: Wed, 27 May 2020 12:46:26 +0300 Message-Id: <20200527094634.24240-2-maxg@mellanox.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200527094634.24240-1-maxg@mellanox.com> References: <20200527094634.24240-1-maxg@mellanox.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Gal Pressman Remove a few leftovers from FMR functionality which are no longer used. Signed-off-by: Gal Pressman Signed-off-by: Max Gurtovoy --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 -------- 1 file changed, 8 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index a4e5223..211ec84 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -702,12 +702,6 @@ struct umr_common { struct semaphore sem; }; -enum { - MLX5_FMR_INVALID, - MLX5_FMR_VALID, - MLX5_FMR_BUSY, -}; - struct mlx5_cache_ent { struct list_head head; /* sync access to the cahce entry @@ -1283,8 +1277,6 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *props); int mlx5_ib_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *props); -int mlx5_ib_init_fmr(struct mlx5_ib_dev *dev); -void mlx5_ib_cleanup_fmr(struct mlx5_ib_dev *dev); void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr, unsigned long max_page_shift, int *count, int *shift, From patchwork Wed May 27 09:46:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Gurtovoy X-Patchwork-Id: 218447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D827C433E0 for ; Wed, 27 May 2020 09:49:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B52520888 for ; Wed, 27 May 2020 09:49:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728434AbgE0Js7 (ORCPT ); Wed, 27 May 2020 05:48:59 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:58143 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727895AbgE0Js6 (ORCPT ); Wed, 27 May 2020 05:48:58 -0400 Received: from Internal Mail-Server by MTLPINE2 (envelope-from maxg@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 May 2020 12:46:35 +0300 Received: from mtr-vdi-031.wap.labs.mlnx. (mtr-vdi-031.wap.labs.mlnx [10.209.102.136]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 04R9kYij009430; Wed, 27 May 2020 12:46:35 +0300 From: Max Gurtovoy To: jgg@mellanox.com, dledford@redhat.com, leon@kernel.org, galpress@amazon.com, dennis.dalessandro@intel.com, netdev@vger.kernel.org, sagi@grimberg.me, linux-rdma@vger.kernel.org, bvanassche@acm.org, santosh.shilimkar@oracle.com, tom@talpey.com Cc: aron.silverton@oracle.com, israelr@mellanox.com, oren@mellanox.com, shlomin@mellanox.com, vladimirk@mellanox.com, Max Gurtovoy Subject: [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration Date: Wed, 27 May 2020 12:46:27 +0300 Message-Id: <20200527094634.24240-3-maxg@mellanox.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200527094634.24240-1-maxg@mellanox.com> References: <20200527094634.24240-1-maxg@mellanox.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org HCA's that are driven by mlx4 driver support FRWR method to register memory. Remove the ancient and unsafe FMR method. Signed-off-by: Max Gurtovoy --- drivers/infiniband/hw/mlx4/main.c | 10 -- drivers/infiniband/hw/mlx4/mlx4_ib.h | 16 --- drivers/infiniband/hw/mlx4/mr.c | 93 ---------------- drivers/net/ethernet/mellanox/mlx4/mr.c | 183 -------------------------------- include/linux/mlx4/device.h | 21 +--- 5 files changed, 2 insertions(+), 321 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 275722c..778dccf 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -2600,13 +2600,6 @@ static void get_fw_ver_str(struct ib_device *device, char *str) .modify_wq = mlx4_ib_modify_wq, }; -static const struct ib_device_ops mlx4_ib_dev_fmr_ops = { - .alloc_fmr = mlx4_ib_fmr_alloc, - .dealloc_fmr = mlx4_ib_fmr_dealloc, - .map_phys_fmr = mlx4_ib_map_phys_fmr, - .unmap_fmr = mlx4_ib_unmap_fmr, -}; - static const struct ib_device_ops mlx4_ib_dev_mw_ops = { .alloc_mw = mlx4_ib_alloc_mw, .dealloc_mw = mlx4_ib_dealloc_mw, @@ -2724,9 +2717,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_wq_ops); } - if (!mlx4_is_slave(ibdev->dev)) - ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_fmr_ops); - if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW || dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN) { ibdev->ib_dev.uverbs_cmd_mask |= diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index d188573..ebdeffd 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -146,11 +146,6 @@ struct mlx4_ib_mw { struct mlx4_mw mmw; }; -struct mlx4_ib_fmr { - struct ib_fmr ibfmr; - struct mlx4_fmr mfmr; -}; - #define MAX_REGS_PER_FLOW 2 struct mlx4_flow_reg_id { @@ -679,11 +674,6 @@ static inline struct mlx4_ib_mw *to_mmw(struct ib_mw *ibmw) return container_of(ibmw, struct mlx4_ib_mw, ibmw); } -static inline struct mlx4_ib_fmr *to_mfmr(struct ib_fmr *ibfmr) -{ - return container_of(ibfmr, struct mlx4_ib_fmr, ibfmr); -} - static inline struct mlx4_ib_flow *to_mflow(struct ib_flow *ibflow) { return container_of(ibflow, struct mlx4_ib_flow, ibflow); @@ -794,12 +784,6 @@ int mlx4_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, int mlx4_ib_mad_init(struct mlx4_ib_dev *dev); void mlx4_ib_mad_cleanup(struct mlx4_ib_dev *dev); -struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int mr_access_flags, - struct ib_fmr_attr *fmr_attr); -int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list, int npages, - u64 iova); -int mlx4_ib_unmap_fmr(struct list_head *fmr_list); -int mlx4_ib_fmr_dealloc(struct ib_fmr *fmr); int __mlx4_ib_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *props, int netw_view); int __mlx4_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index, diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c index b0121c9..e2fb71b 100644 --- a/drivers/infiniband/hw/mlx4/mr.c +++ b/drivers/infiniband/hw/mlx4/mr.c @@ -698,99 +698,6 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type, return ERR_PTR(err); } -struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int acc, - struct ib_fmr_attr *fmr_attr) -{ - struct mlx4_ib_dev *dev = to_mdev(pd->device); - struct mlx4_ib_fmr *fmr; - int err = -ENOMEM; - - fmr = kmalloc(sizeof *fmr, GFP_KERNEL); - if (!fmr) - return ERR_PTR(-ENOMEM); - - err = mlx4_fmr_alloc(dev->dev, to_mpd(pd)->pdn, convert_access(acc), - fmr_attr->max_pages, fmr_attr->max_maps, - fmr_attr->page_shift, &fmr->mfmr); - if (err) - goto err_free; - - err = mlx4_fmr_enable(to_mdev(pd->device)->dev, &fmr->mfmr); - if (err) - goto err_mr; - - fmr->ibfmr.rkey = fmr->ibfmr.lkey = fmr->mfmr.mr.key; - - return &fmr->ibfmr; - -err_mr: - (void) mlx4_mr_free(to_mdev(pd->device)->dev, &fmr->mfmr.mr); - -err_free: - kfree(fmr); - - return ERR_PTR(err); -} - -int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list, - int npages, u64 iova) -{ - struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr); - struct mlx4_ib_dev *dev = to_mdev(ifmr->ibfmr.device); - - return mlx4_map_phys_fmr(dev->dev, &ifmr->mfmr, page_list, npages, iova, - &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey); -} - -int mlx4_ib_unmap_fmr(struct list_head *fmr_list) -{ - struct ib_fmr *ibfmr; - int err; - struct mlx4_dev *mdev = NULL; - - list_for_each_entry(ibfmr, fmr_list, list) { - if (mdev && to_mdev(ibfmr->device)->dev != mdev) - return -EINVAL; - mdev = to_mdev(ibfmr->device)->dev; - } - - if (!mdev) - return 0; - - list_for_each_entry(ibfmr, fmr_list, list) { - struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr); - - mlx4_fmr_unmap(mdev, &ifmr->mfmr, &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey); - } - - /* - * Make sure all MPT status updates are visible before issuing - * SYNC_TPT firmware command. - */ - wmb(); - - err = mlx4_SYNC_TPT(mdev); - if (err) - pr_warn("SYNC_TPT error %d when " - "unmapping FMRs\n", err); - - return 0; -} - -int mlx4_ib_fmr_dealloc(struct ib_fmr *ibfmr) -{ - struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr); - struct mlx4_ib_dev *dev = to_mdev(ibfmr->device); - int err; - - err = mlx4_fmr_free(dev->dev, &ifmr->mfmr); - - if (!err) - kfree(ifmr); - - return err; -} - static int mlx4_set_page(struct ib_mr *ibmr, u64 addr) { struct mlx4_ib_mr *mr = to_mmr(ibmr); diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c index 1a11bc0..d2986f1 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mr.c +++ b/drivers/net/ethernet/mellanox/mlx4/mr.c @@ -966,189 +966,6 @@ void mlx4_cleanup_mr_table(struct mlx4_dev *dev) mlx4_bitmap_cleanup(&mr_table->mpt_bitmap); } -static inline int mlx4_check_fmr(struct mlx4_fmr *fmr, u64 *page_list, - int npages, u64 iova) -{ - int i, page_mask; - - if (npages > fmr->max_pages) - return -EINVAL; - - page_mask = (1 << fmr->page_shift) - 1; - - /* We are getting page lists, so va must be page aligned. */ - if (iova & page_mask) - return -EINVAL; - - /* Trust the user not to pass misaligned data in page_list */ - if (0) - for (i = 0; i < npages; ++i) { - if (page_list[i] & ~page_mask) - return -EINVAL; - } - - if (fmr->maps >= fmr->max_maps) - return -EINVAL; - - return 0; -} - -int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, - int npages, u64 iova, u32 *lkey, u32 *rkey) -{ - u32 key; - int i, err; - - err = mlx4_check_fmr(fmr, page_list, npages, iova); - if (err) - return err; - - ++fmr->maps; - - key = key_to_hw_index(fmr->mr.key); - key += dev->caps.num_mpts; - *lkey = *rkey = fmr->mr.key = hw_index_to_key(key); - - *(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW; - - /* Make sure MPT status is visible before writing MTT entries */ - wmb(); - - dma_sync_single_for_cpu(&dev->persist->pdev->dev, fmr->dma_handle, - npages * sizeof(u64), DMA_TO_DEVICE); - - for (i = 0; i < npages; ++i) - fmr->mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT); - - dma_sync_single_for_device(&dev->persist->pdev->dev, fmr->dma_handle, - npages * sizeof(u64), DMA_TO_DEVICE); - - fmr->mpt->key = cpu_to_be32(key); - fmr->mpt->lkey = cpu_to_be32(key); - fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift)); - fmr->mpt->start = cpu_to_be64(iova); - - /* Make MTT entries are visible before setting MPT status */ - wmb(); - - *(u8 *) fmr->mpt = MLX4_MPT_STATUS_HW; - - /* Make sure MPT status is visible before consumer can use FMR */ - wmb(); - - return 0; -} -EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr); - -int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages, - int max_maps, u8 page_shift, struct mlx4_fmr *fmr) -{ - struct mlx4_priv *priv = mlx4_priv(dev); - int err = -ENOMEM; - - if (max_maps > dev->caps.max_fmr_maps) - return -EINVAL; - - if (page_shift < (ffs(dev->caps.page_size_cap) - 1) || page_shift >= 32) - return -EINVAL; - - /* All MTTs must fit in the same page */ - if (max_pages * sizeof(*fmr->mtts) > PAGE_SIZE) - return -EINVAL; - - fmr->page_shift = page_shift; - fmr->max_pages = max_pages; - fmr->max_maps = max_maps; - fmr->maps = 0; - - err = mlx4_mr_alloc(dev, pd, 0, 0, access, max_pages, - page_shift, &fmr->mr); - if (err) - return err; - - fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table, - fmr->mr.mtt.offset, - &fmr->dma_handle); - - if (!fmr->mtts) { - err = -ENOMEM; - goto err_free; - } - - return 0; - -err_free: - (void) mlx4_mr_free(dev, &fmr->mr); - return err; -} -EXPORT_SYMBOL_GPL(mlx4_fmr_alloc); - -int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr) -{ - struct mlx4_priv *priv = mlx4_priv(dev); - int err; - - err = mlx4_mr_enable(dev, &fmr->mr); - if (err) - return err; - - fmr->mpt = mlx4_table_find(&priv->mr_table.dmpt_table, - key_to_hw_index(fmr->mr.key), NULL); - if (!fmr->mpt) - return -ENOMEM; - - return 0; -} -EXPORT_SYMBOL_GPL(mlx4_fmr_enable); - -void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr, - u32 *lkey, u32 *rkey) -{ - if (!fmr->maps) - return; - - /* To unmap: it is sufficient to take back ownership from HW */ - *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW; - - /* Make sure MPT status is visible */ - wmb(); - - fmr->maps = 0; -} -EXPORT_SYMBOL_GPL(mlx4_fmr_unmap); - -int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr) -{ - int ret; - - if (fmr->maps) - return -EBUSY; - if (fmr->mr.enabled == MLX4_MPT_EN_HW) { - /* In case of FMR was enabled and unmapped - * make sure to give ownership of MPT back to HW - * so HW2SW_MPT command will success. - */ - *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW; - /* Make sure MPT status is visible before changing MPT fields */ - wmb(); - fmr->mpt->length = 0; - fmr->mpt->start = 0; - /* Make sure MPT data is visible after changing MPT status */ - wmb(); - *(u8 *)fmr->mpt = MLX4_MPT_STATUS_HW; - /* make sure MPT status is visible */ - wmb(); - } - - ret = mlx4_mr_free(dev, &fmr->mr); - if (ret) - return ret; - fmr->mr.enabled = MLX4_MPT_DISABLED; - - return 0; -} -EXPORT_SYMBOL_GPL(mlx4_fmr_free); - int mlx4_SYNC_TPT(struct mlx4_dev *dev) { return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT, diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 20372de..dd53d96 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -707,17 +707,6 @@ struct mlx4_mw { int enabled; }; -struct mlx4_fmr { - struct mlx4_mr mr; - struct mlx4_mpt_entry *mpt; - __be64 *mtts; - dma_addr_t dma_handle; - int max_pages; - int max_maps; - int maps; - u8 page_shift; -}; - struct mlx4_uar { unsigned long pfn; int index; @@ -1412,14 +1401,6 @@ void mlx4_handle_eth_header_mcast_prio(struct mlx4_net_trans_rule_hw_ctrl *ctrl, int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index); void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan); -int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list, - int npages, u64 iova, u32 *lkey, u32 *rkey); -int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages, - int max_maps, u8 page_shift, struct mlx4_fmr *fmr); -int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr); -void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr, - u32 *lkey, u32 *rkey); -int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr); int mlx4_SYNC_TPT(struct mlx4_dev *dev); int mlx4_test_interrupt(struct mlx4_dev *dev, int vector); int mlx4_test_async(struct mlx4_dev *dev); @@ -1522,6 +1503,8 @@ struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv( int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port); int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port, int enable); + +struct mlx4_mpt_entry; int mlx4_mr_hw_get_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr, struct mlx4_mpt_entry ***mpt_entry); int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr, From patchwork Wed May 27 09:46:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Gurtovoy X-Patchwork-Id: 218449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4D70C433DF for ; Wed, 27 May 2020 09:46:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C78A20888 for ; Wed, 27 May 2020 09:46:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729307AbgE0Jqw (ORCPT ); Wed, 27 May 2020 05:46:52 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:51569 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729052AbgE0Jqu (ORCPT ); Wed, 27 May 2020 05:46:50 -0400 Received: from Internal Mail-Server by MTLPINE1 (envelope-from maxg@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 May 2020 12:46:35 +0300 Received: from mtr-vdi-031.wap.labs.mlnx. (mtr-vdi-031.wap.labs.mlnx [10.209.102.136]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 04R9kYik009430; Wed, 27 May 2020 12:46:35 +0300 From: Max Gurtovoy To: jgg@mellanox.com, dledford@redhat.com, leon@kernel.org, galpress@amazon.com, dennis.dalessandro@intel.com, netdev@vger.kernel.org, sagi@grimberg.me, linux-rdma@vger.kernel.org, bvanassche@acm.org, santosh.shilimkar@oracle.com, tom@talpey.com Cc: aron.silverton@oracle.com, israelr@mellanox.com, oren@mellanox.com, shlomin@mellanox.com, vladimirk@mellanox.com, Max Gurtovoy Subject: [PATCH 3/9] RDMA/rds: remove FMR support for memory registration Date: Wed, 27 May 2020 12:46:28 +0300 Message-Id: <20200527094634.24240-4-maxg@mellanox.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200527094634.24240-1-maxg@mellanox.com> References: <20200527094634.24240-1-maxg@mellanox.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Use FRWR method for memory registration by default and remove the ancient and unsafe FMR method. Signed-off-by: Max Gurtovoy --- net/rds/Makefile | 2 +- net/rds/ib.c | 14 +-- net/rds/ib.h | 1 - net/rds/ib_cm.c | 4 +- net/rds/ib_fmr.c | 269 ------------------------------------------------------ net/rds/ib_mr.h | 12 --- net/rds/ib_rdma.c | 16 +--- 7 files changed, 11 insertions(+), 307 deletions(-) delete mode 100644 net/rds/ib_fmr.c diff --git a/net/rds/Makefile b/net/rds/Makefile index e647f9d..8fdc118 100644 --- a/net/rds/Makefile +++ b/net/rds/Makefile @@ -7,7 +7,7 @@ rds-y := af_rds.o bind.o cong.o connection.o info.o message.o \ obj-$(CONFIG_RDS_RDMA) += rds_rdma.o rds_rdma-y := rdma_transport.o \ ib.o ib_cm.o ib_recv.o ib_ring.o ib_send.o ib_stats.o \ - ib_sysctl.o ib_rdma.o ib_fmr.o ib_frmr.o + ib_sysctl.o ib_rdma.o ib_frmr.o obj-$(CONFIG_RDS_TCP) += rds_tcp.o diff --git a/net/rds/ib.c b/net/rds/ib.c index a792d8a..33a65c8 100644 --- a/net/rds/ib.c +++ b/net/rds/ib.c @@ -130,12 +130,15 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev) static void rds_ib_add_one(struct ib_device *device) { struct rds_ib_device *rds_ibdev; - bool has_fr, has_fmr; /* Only handle IB (no iWARP) devices */ if (device->node_type != RDMA_NODE_IB_CA) return; + /* Device must support FRWR */ + if (!(device->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)) + return; + rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL, ibdev_to_node(device)); if (!rds_ibdev) @@ -151,11 +154,6 @@ static void rds_ib_add_one(struct ib_device *device) rds_ibdev->max_wrs = device->attrs.max_qp_wr; rds_ibdev->max_sge = min(device->attrs.max_send_sge, RDS_IB_MAX_SGE); - has_fr = (device->attrs.device_cap_flags & - IB_DEVICE_MEM_MGT_EXTENSIONS); - has_fmr = (device->ops.alloc_fmr && device->ops.dealloc_fmr && - device->ops.map_phys_fmr && device->ops.unmap_fmr); - rds_ibdev->use_fastreg = (has_fr && !has_fmr); rds_ibdev->odp_capable = !!(device->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) && @@ -217,9 +215,7 @@ static void rds_ib_add_one(struct ib_device *device) rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_mrs, rds_ibdev->max_8k_mrs); - pr_info("RDS/IB: %s: %s supported and preferred\n", - device->name, - rds_ibdev->use_fastreg ? "FRMR" : "FMR"); + pr_info("RDS/IB: %s: added\n", device->name); down_write(&rds_ib_devices_lock); list_add_tail_rcu(&rds_ibdev->list, &rds_ib_devices); diff --git a/net/rds/ib.h b/net/rds/ib.h index 0296f1f..e0d5991 100644 --- a/net/rds/ib.h +++ b/net/rds/ib.h @@ -247,7 +247,6 @@ struct rds_ib_device { struct ib_device *dev; struct ib_pd *pd; struct dma_pool *rid_hdrs_pool; /* RDS headers DMA pool */ - u8 use_fastreg:1; u8 odp_capable:1; unsigned int max_mrs; diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c index c71f432..c5c665e 100644 --- a/net/rds/ib_cm.c +++ b/net/rds/ib_cm.c @@ -526,10 +526,10 @@ static int rds_ib_setup_qp(struct rds_connection *conn) return -EOPNOTSUPP; /* The fr_queue_space is currently set to 512, to add extra space on - * completion queue and send queue. This extra space is used for FRMR + * completion queue and send queue. This extra space is used for FRWR * registration and invalidation work requests */ - fr_queue_space = (rds_ibdev->use_fastreg ? RDS_IB_DEFAULT_FR_WR : 0); + fr_queue_space = RDS_IB_DEFAULT_FR_WR; /* add the conn now so that connection establishment has the dev */ rds_ib_add_conn(rds_ibdev, conn); diff --git a/net/rds/ib_fmr.c b/net/rds/ib_fmr.c deleted file mode 100644 index 93c0437..0000000 --- a/net/rds/ib_fmr.c +++ /dev/null @@ -1,269 +0,0 @@ -/* - * Copyright (c) 2016 Oracle. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - */ - -#include "ib_mr.h" - -struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *rds_ibdev, int npages) -{ - struct rds_ib_mr_pool *pool; - struct rds_ib_mr *ibmr = NULL; - struct rds_ib_fmr *fmr; - int err = 0; - - if (npages <= RDS_MR_8K_MSG_SIZE) - pool = rds_ibdev->mr_8k_pool; - else - pool = rds_ibdev->mr_1m_pool; - - if (atomic_read(&pool->dirty_count) >= pool->max_items / 10) - queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10); - - /* Switch pools if one of the pool is reaching upper limit */ - if (atomic_read(&pool->dirty_count) >= pool->max_items * 9 / 10) { - if (pool->pool_type == RDS_IB_MR_8K_POOL) - pool = rds_ibdev->mr_1m_pool; - else - pool = rds_ibdev->mr_8k_pool; - } - - ibmr = rds_ib_try_reuse_ibmr(pool); - if (ibmr) - return ibmr; - - ibmr = kzalloc_node(sizeof(*ibmr), GFP_KERNEL, - rdsibdev_to_node(rds_ibdev)); - if (!ibmr) { - err = -ENOMEM; - goto out_no_cigar; - } - - fmr = &ibmr->u.fmr; - fmr->fmr = ib_alloc_fmr(rds_ibdev->pd, - (IB_ACCESS_LOCAL_WRITE | - IB_ACCESS_REMOTE_READ | - IB_ACCESS_REMOTE_WRITE | - IB_ACCESS_REMOTE_ATOMIC), - &pool->fmr_attr); - if (IS_ERR(fmr->fmr)) { - err = PTR_ERR(fmr->fmr); - fmr->fmr = NULL; - pr_warn("RDS/IB: %s failed (err=%d)\n", __func__, err); - goto out_no_cigar; - } - - ibmr->pool = pool; - if (pool->pool_type == RDS_IB_MR_8K_POOL) - rds_ib_stats_inc(s_ib_rdma_mr_8k_alloc); - else - rds_ib_stats_inc(s_ib_rdma_mr_1m_alloc); - - return ibmr; - -out_no_cigar: - kfree(ibmr); - atomic_dec(&pool->item_count); - - return ERR_PTR(err); -} - -static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev, - struct rds_ib_mr *ibmr, struct scatterlist *sg, - unsigned int nents) -{ - struct ib_device *dev = rds_ibdev->dev; - struct rds_ib_fmr *fmr = &ibmr->u.fmr; - struct scatterlist *scat = sg; - u64 io_addr = 0; - u64 *dma_pages; - u32 len; - int page_cnt, sg_dma_len; - int i, j; - int ret; - - sg_dma_len = ib_dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL); - if (unlikely(!sg_dma_len)) { - pr_warn("RDS/IB: %s failed!\n", __func__); - return -EBUSY; - } - - len = 0; - page_cnt = 0; - - for (i = 0; i < sg_dma_len; ++i) { - unsigned int dma_len = sg_dma_len(&scat[i]); - u64 dma_addr = sg_dma_address(&scat[i]); - - if (dma_addr & ~PAGE_MASK) { - if (i > 0) { - ib_dma_unmap_sg(dev, sg, nents, - DMA_BIDIRECTIONAL); - return -EINVAL; - } else { - ++page_cnt; - } - } - if ((dma_addr + dma_len) & ~PAGE_MASK) { - if (i < sg_dma_len - 1) { - ib_dma_unmap_sg(dev, sg, nents, - DMA_BIDIRECTIONAL); - return -EINVAL; - } else { - ++page_cnt; - } - } - - len += dma_len; - } - - page_cnt += len >> PAGE_SHIFT; - if (page_cnt > ibmr->pool->fmr_attr.max_pages) { - ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL); - return -EINVAL; - } - - dma_pages = kmalloc_array_node(sizeof(u64), page_cnt, GFP_ATOMIC, - rdsibdev_to_node(rds_ibdev)); - if (!dma_pages) { - ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL); - return -ENOMEM; - } - - page_cnt = 0; - for (i = 0; i < sg_dma_len; ++i) { - unsigned int dma_len = sg_dma_len(&scat[i]); - u64 dma_addr = sg_dma_address(&scat[i]); - - for (j = 0; j < dma_len; j += PAGE_SIZE) - dma_pages[page_cnt++] = - (dma_addr & PAGE_MASK) + j; - } - - ret = ib_map_phys_fmr(fmr->fmr, dma_pages, page_cnt, io_addr); - if (ret) { - ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL); - goto out; - } - - /* Success - we successfully remapped the MR, so we can - * safely tear down the old mapping. - */ - rds_ib_teardown_mr(ibmr); - - ibmr->sg = scat; - ibmr->sg_len = nents; - ibmr->sg_dma_len = sg_dma_len; - ibmr->remap_count++; - - if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL) - rds_ib_stats_inc(s_ib_rdma_mr_8k_used); - else - rds_ib_stats_inc(s_ib_rdma_mr_1m_used); - ret = 0; - -out: - kfree(dma_pages); - - return ret; -} - -struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *rds_ibdev, - struct scatterlist *sg, - unsigned long nents, - u32 *key) -{ - struct rds_ib_mr *ibmr = NULL; - struct rds_ib_fmr *fmr; - int ret; - - ibmr = rds_ib_alloc_fmr(rds_ibdev, nents); - if (IS_ERR(ibmr)) - return ibmr; - - ibmr->device = rds_ibdev; - fmr = &ibmr->u.fmr; - ret = rds_ib_map_fmr(rds_ibdev, ibmr, sg, nents); - if (ret == 0) - *key = fmr->fmr->rkey; - else - rds_ib_free_mr(ibmr, 0); - - return ibmr; -} - -void rds_ib_unreg_fmr(struct list_head *list, unsigned int *nfreed, - unsigned long *unpinned, unsigned int goal) -{ - struct rds_ib_mr *ibmr, *next; - struct rds_ib_fmr *fmr; - LIST_HEAD(fmr_list); - int ret = 0; - unsigned int freed = *nfreed; - - /* String all ib_mr's onto one list and hand them to ib_unmap_fmr */ - list_for_each_entry(ibmr, list, unmap_list) { - fmr = &ibmr->u.fmr; - list_add(&fmr->fmr->list, &fmr_list); - } - - ret = ib_unmap_fmr(&fmr_list); - if (ret) - pr_warn("RDS/IB: FMR invalidation failed (err=%d)\n", ret); - - /* Now we can destroy the DMA mapping and unpin any pages */ - list_for_each_entry_safe(ibmr, next, list, unmap_list) { - fmr = &ibmr->u.fmr; - *unpinned += ibmr->sg_len; - __rds_ib_teardown_mr(ibmr); - if (freed < goal || - ibmr->remap_count >= ibmr->pool->fmr_attr.max_maps) { - if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL) - rds_ib_stats_inc(s_ib_rdma_mr_8k_free); - else - rds_ib_stats_inc(s_ib_rdma_mr_1m_free); - list_del(&ibmr->unmap_list); - ib_dealloc_fmr(fmr->fmr); - kfree(ibmr); - freed++; - } - } - *nfreed = freed; -} - -void rds_ib_free_fmr_list(struct rds_ib_mr *ibmr) -{ - struct rds_ib_mr_pool *pool = ibmr->pool; - - if (ibmr->remap_count >= pool->fmr_attr.max_maps) - llist_add(&ibmr->llnode, &pool->drop_list); - else - llist_add(&ibmr->llnode, &pool->free_list); -} diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h index 0c8252d7..f365e1a 100644 --- a/net/rds/ib_mr.h +++ b/net/rds/ib_mr.h @@ -43,10 +43,6 @@ #define RDS_MR_8K_SCALE (256 / (RDS_MR_8K_MSG_SIZE + 1)) #define RDS_MR_8K_POOL_SIZE (RDS_MR_8K_SCALE * (8192 / 2)) -struct rds_ib_fmr { - struct ib_fmr *fmr; -}; - enum rds_ib_fr_state { FRMR_IS_FREE, /* mr invalidated & ready for use */ FRMR_IS_INUSE, /* mr is in use or used & can be invalidated */ @@ -84,7 +80,6 @@ struct rds_ib_mr { u8 odp:1; union { - struct rds_ib_fmr fmr; struct rds_ib_frmr frmr; struct ib_mr *mr; } u; @@ -110,7 +105,6 @@ struct rds_ib_mr_pool { unsigned long max_items_soft; unsigned long max_free_pinned; struct ib_fmr_attr fmr_attr; - bool use_fastreg; }; extern struct workqueue_struct *rds_ib_mr_wq; @@ -136,15 +130,9 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents, void __rds_ib_teardown_mr(struct rds_ib_mr *); void rds_ib_teardown_mr(struct rds_ib_mr *); -struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *, int); struct rds_ib_mr *rds_ib_reuse_mr(struct rds_ib_mr_pool *); int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *, int, struct rds_ib_mr **); -struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *, struct scatterlist *, - unsigned long, u32 *); struct rds_ib_mr *rds_ib_try_reuse_ibmr(struct rds_ib_mr_pool *); -void rds_ib_unreg_fmr(struct list_head *, unsigned int *, - unsigned long *, unsigned int); -void rds_ib_free_fmr_list(struct rds_ib_mr *); struct rds_ib_mr *rds_ib_reg_frmr(struct rds_ib_device *rds_ibdev, struct rds_ib_connection *ic, struct scatterlist *sg, diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c index b34b24e..c55d77e 100644 --- a/net/rds/ib_rdma.c +++ b/net/rds/ib_rdma.c @@ -406,10 +406,7 @@ int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool, if (list_empty(&unmap_list)) goto out; - if (pool->use_fastreg) - rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal); - else - rds_ib_unreg_fmr(&unmap_list, &nfreed, &unpinned, free_goal); + rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal); if (!list_empty(&unmap_list)) { unsigned long flags; @@ -503,10 +500,7 @@ void rds_ib_free_mr(void *trans_private, int invalidate) } /* Return it to the pool's free list */ - if (rds_ibdev->use_fastreg) - rds_ib_free_frmr_list(ibmr); - else - rds_ib_free_fmr_list(ibmr); + rds_ib_free_frmr_list(ibmr); atomic_add(ibmr->sg_len, &pool->free_pinned); atomic_inc(&pool->dirty_count); @@ -622,10 +616,7 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents, goto out; } - if (rds_ibdev->use_fastreg) - ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret); - else - ibmr = rds_ib_reg_fmr(rds_ibdev, sg, nents, key_ret); + ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret); if (IS_ERR(ibmr)) { ret = PTR_ERR(ibmr); pr_warn("RDS/IB: rds_ib_get_mr failed (errno=%d)\n", ret); @@ -681,7 +672,6 @@ struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_ibdev, pool->fmr_attr.max_maps = rds_ibdev->fmr_max_remaps; pool->fmr_attr.page_shift = PAGE_SHIFT; pool->max_items_soft = rds_ibdev->max_mrs * 3 / 4; - pool->use_fastreg = rds_ibdev->use_fastreg; return pool; } From patchwork Wed May 27 09:46:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Gurtovoy X-Patchwork-Id: 218450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AA3AC433E0 for ; Wed, 27 May 2020 09:46:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0A19D208B8 for ; Wed, 27 May 2020 09:46:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729283AbgE0Jqq (ORCPT ); Wed, 27 May 2020 05:46:46 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:58166 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729047AbgE0Jqo (ORCPT ); Wed, 27 May 2020 05:46:44 -0400 Received: from Internal Mail-Server by MTLPINE2 (envelope-from maxg@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 May 2020 12:46:36 +0300 Received: from mtr-vdi-031.wap.labs.mlnx. (mtr-vdi-031.wap.labs.mlnx [10.209.102.136]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 04R9kYin009430; Wed, 27 May 2020 12:46:35 +0300 From: Max Gurtovoy To: jgg@mellanox.com, dledford@redhat.com, leon@kernel.org, galpress@amazon.com, dennis.dalessandro@intel.com, netdev@vger.kernel.org, sagi@grimberg.me, linux-rdma@vger.kernel.org, bvanassche@acm.org, santosh.shilimkar@oracle.com, tom@talpey.com Cc: aron.silverton@oracle.com, israelr@mellanox.com, oren@mellanox.com, shlomin@mellanox.com, vladimirk@mellanox.com, Max Gurtovoy Subject: [PATCH 6/9] RDMA/iser: Remove support for FMR memory registration Date: Wed, 27 May 2020 12:46:31 +0300 Message-Id: <20200527094634.24240-7-maxg@mellanox.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200527094634.24240-1-maxg@mellanox.com> References: <20200527094634.24240-1-maxg@mellanox.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Israel Rukshin FMR is not supported on most recent RDMA devices (that use fast memory registration mechanism). Also, FMR was recently removed from NFS/RDMA ULP. Signed-off-by: Israel Rukshin Signed-off-by: Max Gurtovoy Reviewed-by: Sagi Grimberg --- drivers/infiniband/ulp/iser/iscsi_iser.h | 79 +---------- drivers/infiniband/ulp/iser/iser_initiator.c | 19 ++- drivers/infiniband/ulp/iser/iser_memory.c | 188 ++------------------------- drivers/infiniband/ulp/iser/iser_verbs.c | 126 +++--------------- 4 files changed, 40 insertions(+), 372 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 029c001..1d77c7f 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -65,7 +65,6 @@ #include #include -#include #include #define DRV_NAME "iser" @@ -313,33 +312,6 @@ struct iser_comp { }; /** - * struct iser_reg_ops - Memory registration operations - * per-device registration schemes - * - * @alloc_reg_res: Allocate registration resources - * @free_reg_res: Free registration resources - * @reg_mem: Register memory buffers - * @unreg_mem: Un-register memory buffers - * @reg_desc_get: Get a registration descriptor for pool - * @reg_desc_put: Get a registration descriptor to pool - */ -struct iser_reg_ops { - int (*alloc_reg_res)(struct ib_conn *ib_conn, - unsigned cmds_max, - unsigned int size); - void (*free_reg_res)(struct ib_conn *ib_conn); - int (*reg_mem)(struct iscsi_iser_task *iser_task, - struct iser_data_buf *mem, - struct iser_reg_resources *rsc, - struct iser_mem_reg *reg); - void (*unreg_mem)(struct iscsi_iser_task *iser_task, - enum iser_data_dir cmd_dir); - struct iser_fr_desc * (*reg_desc_get)(struct ib_conn *ib_conn); - void (*reg_desc_put)(struct ib_conn *ib_conn, - struct iser_fr_desc *desc); -}; - -/** * struct iser_device - iSER device handle * * @ib_device: RDMA device @@ -351,8 +323,6 @@ struct iser_reg_ops { * @comps_used: Number of completion contexts used, Min between online * cpus and device max completion vectors * @comps: Dinamically allocated array of completion handlers - * @reg_ops: Registration ops - * @remote_inv_sup: Remote invalidate is supported on this device */ struct iser_device { struct ib_device *ib_device; @@ -362,26 +332,18 @@ struct iser_device { int refcount; int comps_used; struct iser_comp *comps; - const struct iser_reg_ops *reg_ops; - bool remote_inv_sup; }; /** * struct iser_reg_resources - Fast registration resources * * @mr: memory region - * @fmr_pool: pool of fmrs * @sig_mr: signature memory region - * @page_vec: fast reg page list used by fmr pool * @mr_valid: is mr valid indicator */ struct iser_reg_resources { - union { - struct ib_mr *mr; - struct ib_fmr_pool *fmr_pool; - }; + struct ib_mr *mr; struct ib_mr *sig_mr; - struct iser_page_vec *page_vec; u8 mr_valid:1; }; @@ -403,7 +365,7 @@ struct iser_fr_desc { * struct iser_fr_pool - connection fast registration pool * * @list: list of fastreg descriptors - * @lock: protects fmr/fastreg pool + * @lock: protects fastreg pool * @size: size of the pool */ struct iser_fr_pool { @@ -518,12 +480,6 @@ struct iscsi_iser_task { struct iser_data_buf prot[ISER_DIRS_NUM]; }; -struct iser_page_vec { - u64 *pages; - int npages; - struct ib_mr fake_mr; -}; - /** * struct iser_global - iSER global context * @@ -548,8 +504,6 @@ struct iser_global { extern unsigned int iser_max_sectors; extern bool iser_always_reg; -int iser_assign_reg_ops(struct iser_device *device); - int iser_send_control(struct iscsi_conn *conn, struct iscsi_task *task); @@ -591,22 +545,17 @@ void iser_finalize_rdma_unaligned_sg(struct iscsi_iser_task *iser_task, struct iser_data_buf *mem, enum iser_data_dir cmd_dir); -int iser_reg_rdma_mem(struct iscsi_iser_task *task, - enum iser_data_dir dir, - bool all_imm); -void iser_unreg_rdma_mem(struct iscsi_iser_task *task, - enum iser_data_dir dir); +int iser_reg_mem_fastreg(struct iscsi_iser_task *task, + enum iser_data_dir dir, + bool all_imm); +void iser_unreg_mem_fastreg(struct iscsi_iser_task *task, + enum iser_data_dir dir); int iser_connect(struct iser_conn *iser_conn, struct sockaddr *src_addr, struct sockaddr *dst_addr, int non_blocking); -void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task, - enum iser_data_dir cmd_dir); -void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task, - enum iser_data_dir cmd_dir); - int iser_post_recvl(struct iser_conn *iser_conn); int iser_post_recvm(struct iser_conn *iser_conn, int count); int iser_post_send(struct ib_conn *ib_conn, struct iser_tx_desc *tx_desc, @@ -625,26 +574,12 @@ int iser_initialize_task_headers(struct iscsi_task *task, struct iser_tx_desc *tx_desc); int iser_alloc_rx_descriptors(struct iser_conn *iser_conn, struct iscsi_session *session); -int iser_alloc_fmr_pool(struct ib_conn *ib_conn, - unsigned cmds_max, - unsigned int size); -void iser_free_fmr_pool(struct ib_conn *ib_conn); int iser_alloc_fastreg_pool(struct ib_conn *ib_conn, unsigned cmds_max, unsigned int size); void iser_free_fastreg_pool(struct ib_conn *ib_conn); u8 iser_check_task_pi_status(struct iscsi_iser_task *iser_task, enum iser_data_dir cmd_dir, sector_t *sector); -struct iser_fr_desc * -iser_reg_desc_get_fr(struct ib_conn *ib_conn); -void -iser_reg_desc_put_fr(struct ib_conn *ib_conn, - struct iser_fr_desc *desc); -struct iser_fr_desc * -iser_reg_desc_get_fmr(struct ib_conn *ib_conn); -void -iser_reg_desc_put_fmr(struct ib_conn *ib_conn, - struct iser_fr_desc *desc); static inline struct iser_conn * to_iser_conn(struct ib_conn *ib_conn) diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c index 4a7045b..27a6f75 100644 --- a/drivers/infiniband/ulp/iser/iser_initiator.c +++ b/drivers/infiniband/ulp/iser/iser_initiator.c @@ -72,7 +72,7 @@ static int iser_prepare_read_cmd(struct iscsi_task *task) return err; } - err = iser_reg_rdma_mem(iser_task, ISER_DIR_IN, false); + err = iser_reg_mem_fastreg(iser_task, ISER_DIR_IN, false); if (err) { iser_err("Failed to set up Data-IN RDMA\n"); return err; @@ -126,8 +126,8 @@ static int iser_prepare_read_cmd(struct iscsi_task *task) return err; } - err = iser_reg_rdma_mem(iser_task, ISER_DIR_OUT, - buf_out->data_len == imm_sz); + err = iser_reg_mem_fastreg(iser_task, ISER_DIR_OUT, + buf_out->data_len == imm_sz); if (err != 0) { iser_err("Failed to register write cmd RDMA mem\n"); return err; @@ -250,8 +250,8 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn, iser_conn->qp_max_recv_dtos_mask = session->cmds_max - 1; /* cmds_max is 2^N */ iser_conn->min_posted_rx = iser_conn->qp_max_recv_dtos >> 2; - if (device->reg_ops->alloc_reg_res(ib_conn, session->scsi_cmds_max, - iser_conn->pages_per_mr)) + if (iser_alloc_fastreg_pool(ib_conn, session->scsi_cmds_max, + iser_conn->pages_per_mr)) goto create_rdma_reg_res_failed; if (iser_alloc_login_buf(iser_conn)) @@ -293,7 +293,7 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn, rx_desc_alloc_fail: iser_free_login_buf(iser_conn); alloc_login_buf_fail: - device->reg_ops->free_reg_res(ib_conn); + iser_free_fastreg_pool(ib_conn); create_rdma_reg_res_failed: iser_err("failed allocating rx descriptors / data buffers\n"); return -ENOMEM; @@ -306,8 +306,7 @@ void iser_free_rx_descriptors(struct iser_conn *iser_conn) struct ib_conn *ib_conn = &iser_conn->ib_conn; struct iser_device *device = ib_conn->device; - if (device->reg_ops->free_reg_res) - device->reg_ops->free_reg_res(ib_conn); + iser_free_fastreg_pool(ib_conn); rx_desc = iser_conn->rx_descs; for (i = 0; i < iser_conn->qp_max_recv_dtos; i++, rx_desc++) @@ -768,7 +767,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task) int prot_count = scsi_prot_sg_count(iser_task->sc); if (iser_task->dir[ISER_DIR_IN]) { - iser_unreg_rdma_mem(iser_task, ISER_DIR_IN); + iser_unreg_mem_fastreg(iser_task, ISER_DIR_IN); iser_dma_unmap_task_data(iser_task, &iser_task->data[ISER_DIR_IN], DMA_FROM_DEVICE); @@ -779,7 +778,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task) } if (iser_task->dir[ISER_DIR_OUT]) { - iser_unreg_rdma_mem(iser_task, ISER_DIR_OUT); + iser_unreg_mem_fastreg(iser_task, ISER_DIR_OUT); iser_dma_unmap_task_data(iser_task, &iser_task->data[ISER_DIR_OUT], DMA_TO_DEVICE); diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c index 999ef7c..d4e057f 100644 --- a/drivers/infiniband/ulp/iser/iser_memory.c +++ b/drivers/infiniband/ulp/iser/iser_memory.c @@ -38,62 +38,13 @@ #include #include "iscsi_iser.h" -static -int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task, - struct iser_data_buf *mem, - struct iser_reg_resources *rsc, - struct iser_mem_reg *mem_reg); -static -int iser_fast_reg_mr(struct iscsi_iser_task *iser_task, - struct iser_data_buf *mem, - struct iser_reg_resources *rsc, - struct iser_mem_reg *mem_reg); - -static const struct iser_reg_ops fastreg_ops = { - .alloc_reg_res = iser_alloc_fastreg_pool, - .free_reg_res = iser_free_fastreg_pool, - .reg_mem = iser_fast_reg_mr, - .unreg_mem = iser_unreg_mem_fastreg, - .reg_desc_get = iser_reg_desc_get_fr, - .reg_desc_put = iser_reg_desc_put_fr, -}; - -static const struct iser_reg_ops fmr_ops = { - .alloc_reg_res = iser_alloc_fmr_pool, - .free_reg_res = iser_free_fmr_pool, - .reg_mem = iser_fast_reg_fmr, - .unreg_mem = iser_unreg_mem_fmr, - .reg_desc_get = iser_reg_desc_get_fmr, - .reg_desc_put = iser_reg_desc_put_fmr, -}; void iser_reg_comp(struct ib_cq *cq, struct ib_wc *wc) { iser_err_comp(wc, "memreg"); } -int iser_assign_reg_ops(struct iser_device *device) -{ - struct ib_device *ib_dev = device->ib_device; - - /* Assign function handles - based on FMR support */ - if (ib_dev->ops.alloc_fmr && ib_dev->ops.dealloc_fmr && - ib_dev->ops.map_phys_fmr && ib_dev->ops.unmap_fmr) { - iser_info("FMR supported, using FMR for registration\n"); - device->reg_ops = &fmr_ops; - } else if (ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) { - iser_info("FastReg supported, using FastReg for registration\n"); - device->reg_ops = &fastreg_ops; - device->remote_inv_sup = iser_always_reg; - } else { - iser_err("IB device does not support FMRs nor FastRegs, can't register memory\n"); - return -1; - } - - return 0; -} - -struct iser_fr_desc * +static struct iser_fr_desc * iser_reg_desc_get_fr(struct ib_conn *ib_conn) { struct iser_fr_pool *fr_pool = &ib_conn->fr_pool; @@ -109,7 +60,7 @@ struct iser_fr_desc * return desc; } -void +static void iser_reg_desc_put_fr(struct ib_conn *ib_conn, struct iser_fr_desc *desc) { @@ -121,44 +72,6 @@ struct iser_fr_desc * spin_unlock_irqrestore(&fr_pool->lock, flags); } -struct iser_fr_desc * -iser_reg_desc_get_fmr(struct ib_conn *ib_conn) -{ - struct iser_fr_pool *fr_pool = &ib_conn->fr_pool; - - return list_first_entry(&fr_pool->list, - struct iser_fr_desc, list); -} - -void -iser_reg_desc_put_fmr(struct ib_conn *ib_conn, - struct iser_fr_desc *desc) -{ -} - -static void iser_data_buf_dump(struct iser_data_buf *data, - struct ib_device *ibdev) -{ - struct scatterlist *sg; - int i; - - for_each_sg(data->sg, sg, data->dma_nents, i) - iser_dbg("sg[%d] dma_addr:0x%lX page:0x%p " - "off:0x%x sz:0x%x dma_len:0x%x\n", - i, (unsigned long)sg_dma_address(sg), - sg_page(sg), sg->offset, sg->length, sg_dma_len(sg)); -} - -static void iser_dump_page_vec(struct iser_page_vec *page_vec) -{ - int i; - - iser_err("page vec npages %d data length %lld\n", - page_vec->npages, page_vec->fake_mr.length); - for (i = 0; i < page_vec->npages; i++) - iser_err("vec[%d]: %llx\n", i, page_vec->pages[i]); -} - int iser_dma_map_task_data(struct iscsi_iser_task *iser_task, struct iser_data_buf *data, enum iser_data_dir iser_dir, @@ -213,84 +126,9 @@ void iser_dma_unmap_task_data(struct iscsi_iser_task *iser_task, return 0; } -static int iser_set_page(struct ib_mr *mr, u64 addr) -{ - struct iser_page_vec *page_vec = - container_of(mr, struct iser_page_vec, fake_mr); - - page_vec->pages[page_vec->npages++] = addr; - - return 0; -} - -static -int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task, - struct iser_data_buf *mem, - struct iser_reg_resources *rsc, - struct iser_mem_reg *reg) -{ - struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn; - struct iser_device *device = ib_conn->device; - struct iser_page_vec *page_vec = rsc->page_vec; - struct ib_fmr_pool *fmr_pool = rsc->fmr_pool; - struct ib_pool_fmr *fmr; - int ret, plen; - - page_vec->npages = 0; - page_vec->fake_mr.page_size = SZ_4K; - plen = ib_sg_to_pages(&page_vec->fake_mr, mem->sg, - mem->dma_nents, NULL, iser_set_page); - if (unlikely(plen < mem->dma_nents)) { - iser_err("page vec too short to hold this SG\n"); - iser_data_buf_dump(mem, device->ib_device); - iser_dump_page_vec(page_vec); - return -EINVAL; - } - - fmr = ib_fmr_pool_map_phys(fmr_pool, page_vec->pages, - page_vec->npages, page_vec->pages[0]); - if (IS_ERR(fmr)) { - ret = PTR_ERR(fmr); - iser_err("ib_fmr_pool_map_phys failed: %d\n", ret); - return ret; - } - - reg->sge.lkey = fmr->fmr->lkey; - reg->rkey = fmr->fmr->rkey; - reg->sge.addr = page_vec->fake_mr.iova; - reg->sge.length = page_vec->fake_mr.length; - reg->mem_h = fmr; - - iser_dbg("fmr reg: lkey=0x%x, rkey=0x%x, addr=0x%llx," - " length=0x%x\n", reg->sge.lkey, reg->rkey, - reg->sge.addr, reg->sge.length); - - return 0; -} - -/** - * Unregister (previosuly registered using FMR) memory. - * If memory is non-FMR does nothing. - */ -void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task, - enum iser_data_dir cmd_dir) -{ - struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir]; - - if (!reg->mem_h) - return; - - iser_dbg("PHYSICAL Mem.Unregister mem_h %p\n", reg->mem_h); - - ib_fmr_pool_unmap((struct ib_pool_fmr *)reg->mem_h); - - reg->mem_h = NULL; -} - void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task, enum iser_data_dir cmd_dir) { - struct iser_device *device = iser_task->iser_conn->ib_conn.device; struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir]; struct iser_fr_desc *desc; struct ib_mr_status mr_status; @@ -312,7 +150,7 @@ void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task, ib_check_mr_status(desc->rsc.sig_mr, IB_MR_CHECK_SIG_STATUS, &mr_status); } - device->reg_ops->reg_desc_put(&iser_task->iser_conn->ib_conn, desc); + iser_reg_desc_put_fr(&iser_task->iser_conn->ib_conn, reg->mem_h); reg->mem_h = NULL; } @@ -509,15 +347,14 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task, if (use_dma_key) return iser_reg_dma(device, mem, reg); - return device->reg_ops->reg_mem(task, mem, &desc->rsc, reg); + return iser_fast_reg_mr(task, mem, &desc->rsc, reg); } -int iser_reg_rdma_mem(struct iscsi_iser_task *task, - enum iser_data_dir dir, - bool all_imm) +int iser_reg_mem_fastreg(struct iscsi_iser_task *task, + enum iser_data_dir dir, + bool all_imm) { struct ib_conn *ib_conn = &task->iser_conn->ib_conn; - struct iser_device *device = ib_conn->device; struct iser_data_buf *mem = &task->data[dir]; struct iser_mem_reg *reg = &task->rdma_reg[dir]; struct iser_fr_desc *desc = NULL; @@ -528,7 +365,7 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task, scsi_get_prot_op(task->sc) == SCSI_PROT_NORMAL; if (!use_dma_key) { - desc = device->reg_ops->reg_desc_get(ib_conn); + desc = iser_reg_desc_get_fr(ib_conn); reg->mem_h = desc; } @@ -549,15 +386,8 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task, err_reg: if (desc) - device->reg_ops->reg_desc_put(ib_conn, desc); + iser_reg_desc_put_fr(ib_conn, desc); return err; } -void iser_unreg_rdma_mem(struct iscsi_iser_task *task, - enum iser_data_dir dir) -{ - struct iser_device *device = task->iser_conn->ib_conn.device; - - device->reg_ops->unreg_mem(task, dir); -} diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c index 127887c..c1f44c4 100644 --- a/drivers/infiniband/ulp/iser/iser_verbs.c +++ b/drivers/infiniband/ulp/iser/iser_verbs.c @@ -68,11 +68,12 @@ static void iser_event_handler(struct ib_event_handler *handler, static int iser_create_device_ib_res(struct iser_device *device) { struct ib_device *ib_dev = device->ib_device; - int ret, i, max_cqe; + int i, max_cqe; - ret = iser_assign_reg_ops(device); - if (ret) - return ret; + if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)) { + iser_err("IB device does not support memory registrations\n"); + return -1; + } device->comps_used = min_t(int, num_online_cpus(), ib_dev->num_comp_vectors); @@ -147,96 +148,6 @@ static void iser_free_device_ib_res(struct iser_device *device) device->pd = NULL; } -/** - * iser_alloc_fmr_pool - Creates FMR pool and page_vector - * @ib_conn: connection RDMA resources - * @cmds_max: max number of SCSI commands for this connection - * @size: max number of pages per map request - * - * Return: 0 on success, or errno code on failure - */ -int iser_alloc_fmr_pool(struct ib_conn *ib_conn, - unsigned cmds_max, - unsigned int size) -{ - struct iser_device *device = ib_conn->device; - struct iser_fr_pool *fr_pool = &ib_conn->fr_pool; - struct iser_page_vec *page_vec; - struct iser_fr_desc *desc; - struct ib_fmr_pool *fmr_pool; - struct ib_fmr_pool_param params; - int ret; - - INIT_LIST_HEAD(&fr_pool->list); - spin_lock_init(&fr_pool->lock); - - desc = kzalloc(sizeof(*desc), GFP_KERNEL); - if (!desc) - return -ENOMEM; - - page_vec = kmalloc(sizeof(*page_vec) + (sizeof(u64) * size), - GFP_KERNEL); - if (!page_vec) { - ret = -ENOMEM; - goto err_frpl; - } - - page_vec->pages = (u64 *)(page_vec + 1); - - params.page_shift = ilog2(SZ_4K); - params.max_pages_per_fmr = size; - /* make the pool size twice the max number of SCSI commands * - * the ML is expected to queue, watermark for unmap at 50% */ - params.pool_size = cmds_max * 2; - params.dirty_watermark = cmds_max; - params.cache = 0; - params.flush_function = NULL; - params.access = (IB_ACCESS_LOCAL_WRITE | - IB_ACCESS_REMOTE_WRITE | - IB_ACCESS_REMOTE_READ); - - fmr_pool = ib_create_fmr_pool(device->pd, ¶ms); - if (IS_ERR(fmr_pool)) { - ret = PTR_ERR(fmr_pool); - iser_err("FMR allocation failed, err %d\n", ret); - goto err_fmr; - } - - desc->rsc.page_vec = page_vec; - desc->rsc.fmr_pool = fmr_pool; - list_add(&desc->list, &fr_pool->list); - - return 0; - -err_fmr: - kfree(page_vec); -err_frpl: - kfree(desc); - - return ret; -} - -/** - * iser_free_fmr_pool - releases the FMR pool and page vec - * @ib_conn: connection RDMA resources - */ -void iser_free_fmr_pool(struct ib_conn *ib_conn) -{ - struct iser_fr_pool *fr_pool = &ib_conn->fr_pool; - struct iser_fr_desc *desc; - - desc = list_first_entry(&fr_pool->list, - struct iser_fr_desc, list); - list_del(&desc->list); - - iser_info("freeing conn %p fmr pool %p\n", - ib_conn, desc->rsc.fmr_pool); - - ib_destroy_fmr_pool(desc->rsc.fmr_pool); - kfree(desc->rsc.page_vec); - kfree(desc); -} - static struct iser_fr_desc * iser_create_fastreg_desc(struct iser_device *device, struct ib_pd *pd, @@ -667,13 +578,12 @@ static void iser_connect_error(struct rdma_cm_id *cma_id) u32 max_num_sg; /* - * FRs without SG_GAPS or FMRs can only map up to a (device) page per - * entry, but if the first entry is misaligned we'll end up using two - * entries (head and tail) for a single page worth data, so one - * additional entry is required. + * FRs without SG_GAPS can only map up to a (device) page per entry, + * but if the first entry is misaligned we'll end up using two entries + * (head and tail) for a single page worth data, so one additional + * entry is required. */ - if ((attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) && - (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG)) + if (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG) reserved_mr_pages = 0; else reserved_mr_pages = 1; @@ -684,14 +594,8 @@ static void iser_connect_error(struct rdma_cm_id *cma_id) max_num_sg = attr->max_fast_reg_page_list_len; sg_tablesize = DIV_ROUND_UP(max_sectors * SECTOR_SIZE, SZ_4K); - if (attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) - sup_sg_tablesize = - min_t( - uint, ISCSI_ISER_MAX_SG_TABLESIZE, - max_num_sg - reserved_mr_pages); - else - sup_sg_tablesize = ISCSI_ISER_MAX_SG_TABLESIZE; - + sup_sg_tablesize = min_t(uint, ISCSI_ISER_MAX_SG_TABLESIZE, + max_num_sg - reserved_mr_pages); iser_conn->scsi_sg_tablesize = min(sg_tablesize, sup_sg_tablesize); iser_conn->pages_per_mr = iser_conn->scsi_sg_tablesize + reserved_mr_pages; @@ -755,7 +659,7 @@ static void iser_route_handler(struct rdma_cm_id *cma_id) struct iser_cm_hdr req_hdr; struct iser_conn *iser_conn = (struct iser_conn *)cma_id->context; struct ib_conn *ib_conn = &iser_conn->ib_conn; - struct iser_device *device = ib_conn->device; + struct ib_device *ib_dev = ib_conn->device->ib_device; if (iser_conn->state != ISER_CONN_PENDING) /* bailout */ @@ -766,14 +670,14 @@ static void iser_route_handler(struct rdma_cm_id *cma_id) goto failure; memset(&conn_param, 0, sizeof conn_param); - conn_param.responder_resources = device->ib_device->attrs.max_qp_rd_atom; + conn_param.responder_resources = ib_dev->attrs.max_qp_rd_atom; conn_param.initiator_depth = 1; conn_param.retry_count = 7; conn_param.rnr_retry_count = 6; memset(&req_hdr, 0, sizeof(req_hdr)); req_hdr.flags = ISER_ZBVA_NOT_SUP; - if (!device->remote_inv_sup) + if (!iser_always_reg) req_hdr.flags |= ISER_SEND_W_INV_NOT_SUP; conn_param.private_data = (void *)&req_hdr; conn_param.private_data_len = sizeof(struct iser_cm_hdr); From patchwork Wed May 27 09:46:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Gurtovoy X-Patchwork-Id: 218448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 636CAC433E2 for ; Wed, 27 May 2020 09:47:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C43F2088E for ; Wed, 27 May 2020 09:47:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729335AbgE0JrQ (ORCPT ); Wed, 27 May 2020 05:47:16 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:58171 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729051AbgE0JrP (ORCPT ); Wed, 27 May 2020 05:47:15 -0400 Received: from Internal Mail-Server by MTLPINE2 (envelope-from maxg@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 May 2020 12:46:36 +0300 Received: from mtr-vdi-031.wap.labs.mlnx. (mtr-vdi-031.wap.labs.mlnx [10.209.102.136]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 04R9kYip009430; Wed, 27 May 2020 12:46:36 +0300 From: Max Gurtovoy To: jgg@mellanox.com, dledford@redhat.com, leon@kernel.org, galpress@amazon.com, dennis.dalessandro@intel.com, netdev@vger.kernel.org, sagi@grimberg.me, linux-rdma@vger.kernel.org, bvanassche@acm.org, santosh.shilimkar@oracle.com, tom@talpey.com Cc: aron.silverton@oracle.com, israelr@mellanox.com, oren@mellanox.com, shlomin@mellanox.com, vladimirk@mellanox.com, Max Gurtovoy Subject: [PATCH 8/9] RDMA/core: remove FMR pool API Date: Wed, 27 May 2020 12:46:33 +0300 Message-Id: <20200527094634.24240-9-maxg@mellanox.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200527094634.24240-1-maxg@mellanox.com> References: <20200527094634.24240-1-maxg@mellanox.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This ancient and unsafe method for memory registration is no longer used by any RDMA based ULP. Remove the FMR pool API from the core driver. Signed-off-by: Max Gurtovoy --- Documentation/driver-api/infiniband.rst | 3 - drivers/infiniband/core/Makefile | 2 +- drivers/infiniband/core/fmr_pool.c | 494 -------------------------------- include/rdma/ib_fmr_pool.h | 93 ------ 4 files changed, 1 insertion(+), 591 deletions(-) delete mode 100644 drivers/infiniband/core/fmr_pool.c delete mode 100644 include/rdma/ib_fmr_pool.h diff --git a/Documentation/driver-api/infiniband.rst b/Documentation/driver-api/infiniband.rst index 1a3116f..30e142c 100644 --- a/Documentation/driver-api/infiniband.rst +++ b/Documentation/driver-api/infiniband.rst @@ -37,9 +37,6 @@ InfiniBand core interfaces .. kernel-doc:: drivers/infiniband/core/ud_header.c :export: -.. kernel-doc:: drivers/infiniband/core/fmr_pool.c - :export: - .. kernel-doc:: drivers/infiniband/core/umem.c :export: diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile index d1b14887..064cd34 100644 --- a/drivers/infiniband/core/Makefile +++ b/drivers/infiniband/core/Makefile @@ -8,7 +8,7 @@ obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y) ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \ - device.o fmr_pool.o cache.o netlink.o \ + device.o cache.o netlink.o \ roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \ multicast.o mad.o smi.o agent.o mad_rmpp.o \ nldev.o restrack.o counters.o ib_core_uverbs.o \ diff --git a/drivers/infiniband/core/fmr_pool.c b/drivers/infiniband/core/fmr_pool.c deleted file mode 100644 index e08aec4..0000000 --- a/drivers/infiniband/core/fmr_pool.c +++ /dev/null @@ -1,494 +0,0 @@ -/* - * Copyright (c) 2004 Topspin Communications. All rights reserved. - * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - */ - -#include -#include -#include -#include -#include -#include - -#include - -#include "core_priv.h" - -#define PFX "fmr_pool: " - -enum { - IB_FMR_MAX_REMAPS = 32, - - IB_FMR_HASH_BITS = 8, - IB_FMR_HASH_SIZE = 1 << IB_FMR_HASH_BITS, - IB_FMR_HASH_MASK = IB_FMR_HASH_SIZE - 1 -}; - -/* - * If an FMR is not in use, then the list member will point to either - * its pool's free_list (if the FMR can be mapped again; that is, - * remap_count < pool->max_remaps) or its pool's dirty_list (if the - * FMR needs to be unmapped before being remapped). In either of - * these cases it is a bug if the ref_count is not 0. In other words, - * if ref_count is > 0, then the list member must not be linked into - * either free_list or dirty_list. - * - * The cache_node member is used to link the FMR into a cache bucket - * (if caching is enabled). This is independent of the reference - * count of the FMR. When a valid FMR is released, its ref_count is - * decremented, and if ref_count reaches 0, the FMR is placed in - * either free_list or dirty_list as appropriate. However, it is not - * removed from the cache and may be "revived" if a call to - * ib_fmr_register_physical() occurs before the FMR is remapped. In - * this case we just increment the ref_count and remove the FMR from - * free_list/dirty_list. - * - * Before we remap an FMR from free_list, we remove it from the cache - * (to prevent another user from obtaining a stale FMR). When an FMR - * is released, we add it to the tail of the free list, so that our - * cache eviction policy is "least recently used." - * - * All manipulation of ref_count, list and cache_node is protected by - * pool_lock to maintain consistency. - */ - -struct ib_fmr_pool { - spinlock_t pool_lock; - - int pool_size; - int max_pages; - int max_remaps; - int dirty_watermark; - int dirty_len; - struct list_head free_list; - struct list_head dirty_list; - struct hlist_head *cache_bucket; - - void (*flush_function)(struct ib_fmr_pool *pool, - void * arg); - void *flush_arg; - - struct kthread_worker *worker; - struct kthread_work work; - - atomic_t req_ser; - atomic_t flush_ser; - - wait_queue_head_t force_wait; -}; - -static inline u32 ib_fmr_hash(u64 first_page) -{ - return jhash_2words((u32) first_page, (u32) (first_page >> 32), 0) & - (IB_FMR_HASH_SIZE - 1); -} - -/* Caller must hold pool_lock */ -static inline struct ib_pool_fmr *ib_fmr_cache_lookup(struct ib_fmr_pool *pool, - u64 *page_list, - int page_list_len, - u64 io_virtual_address) -{ - struct hlist_head *bucket; - struct ib_pool_fmr *fmr; - - if (!pool->cache_bucket) - return NULL; - - bucket = pool->cache_bucket + ib_fmr_hash(*page_list); - - hlist_for_each_entry(fmr, bucket, cache_node) - if (io_virtual_address == fmr->io_virtual_address && - page_list_len == fmr->page_list_len && - !memcmp(page_list, fmr->page_list, - page_list_len * sizeof *page_list)) - return fmr; - - return NULL; -} - -static void ib_fmr_batch_release(struct ib_fmr_pool *pool) -{ - int ret; - struct ib_pool_fmr *fmr; - LIST_HEAD(unmap_list); - LIST_HEAD(fmr_list); - - spin_lock_irq(&pool->pool_lock); - - list_for_each_entry(fmr, &pool->dirty_list, list) { - hlist_del_init(&fmr->cache_node); - fmr->remap_count = 0; - list_add_tail(&fmr->fmr->list, &fmr_list); - } - - list_splice_init(&pool->dirty_list, &unmap_list); - pool->dirty_len = 0; - - spin_unlock_irq(&pool->pool_lock); - - if (list_empty(&unmap_list)) { - return; - } - - ret = ib_unmap_fmr(&fmr_list); - if (ret) - pr_warn(PFX "ib_unmap_fmr returned %d\n", ret); - - spin_lock_irq(&pool->pool_lock); - list_splice(&unmap_list, &pool->free_list); - spin_unlock_irq(&pool->pool_lock); -} - -static void ib_fmr_cleanup_func(struct kthread_work *work) -{ - struct ib_fmr_pool *pool = container_of(work, struct ib_fmr_pool, work); - - ib_fmr_batch_release(pool); - atomic_inc(&pool->flush_ser); - wake_up_interruptible(&pool->force_wait); - - if (pool->flush_function) - pool->flush_function(pool, pool->flush_arg); - - if (atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) < 0) - kthread_queue_work(pool->worker, &pool->work); -} - -/** - * ib_create_fmr_pool - Create an FMR pool - * @pd:Protection domain for FMRs - * @params:FMR pool parameters - * - * Create a pool of FMRs. Return value is pointer to new pool or - * error code if creation failed. - */ -struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, - struct ib_fmr_pool_param *params) -{ - struct ib_device *device; - struct ib_fmr_pool *pool; - int i; - int ret; - int max_remaps; - - if (!params) - return ERR_PTR(-EINVAL); - - device = pd->device; - if (!device->ops.alloc_fmr || !device->ops.dealloc_fmr || - !device->ops.map_phys_fmr || !device->ops.unmap_fmr) { - dev_info(&device->dev, "Device does not support FMRs\n"); - return ERR_PTR(-ENOSYS); - } - - if (!device->attrs.max_map_per_fmr) - max_remaps = IB_FMR_MAX_REMAPS; - else - max_remaps = device->attrs.max_map_per_fmr; - - pool = kmalloc(sizeof *pool, GFP_KERNEL); - if (!pool) - return ERR_PTR(-ENOMEM); - - pool->cache_bucket = NULL; - pool->flush_function = params->flush_function; - pool->flush_arg = params->flush_arg; - - INIT_LIST_HEAD(&pool->free_list); - INIT_LIST_HEAD(&pool->dirty_list); - - if (params->cache) { - pool->cache_bucket = - kmalloc_array(IB_FMR_HASH_SIZE, - sizeof(*pool->cache_bucket), - GFP_KERNEL); - if (!pool->cache_bucket) { - ret = -ENOMEM; - goto out_free_pool; - } - - for (i = 0; i < IB_FMR_HASH_SIZE; ++i) - INIT_HLIST_HEAD(pool->cache_bucket + i); - } - - pool->pool_size = 0; - pool->max_pages = params->max_pages_per_fmr; - pool->max_remaps = max_remaps; - pool->dirty_watermark = params->dirty_watermark; - pool->dirty_len = 0; - spin_lock_init(&pool->pool_lock); - atomic_set(&pool->req_ser, 0); - atomic_set(&pool->flush_ser, 0); - init_waitqueue_head(&pool->force_wait); - - pool->worker = - kthread_create_worker(0, "ib_fmr(%s)", dev_name(&device->dev)); - if (IS_ERR(pool->worker)) { - pr_warn(PFX "couldn't start cleanup kthread worker\n"); - ret = PTR_ERR(pool->worker); - goto out_free_pool; - } - kthread_init_work(&pool->work, ib_fmr_cleanup_func); - - { - struct ib_pool_fmr *fmr; - struct ib_fmr_attr fmr_attr = { - .max_pages = params->max_pages_per_fmr, - .max_maps = pool->max_remaps, - .page_shift = params->page_shift - }; - int bytes_per_fmr = sizeof *fmr; - - if (pool->cache_bucket) - bytes_per_fmr += params->max_pages_per_fmr * sizeof (u64); - - for (i = 0; i < params->pool_size; ++i) { - fmr = kmalloc(bytes_per_fmr, GFP_KERNEL); - if (!fmr) - goto out_fail; - - fmr->pool = pool; - fmr->remap_count = 0; - fmr->ref_count = 0; - INIT_HLIST_NODE(&fmr->cache_node); - - fmr->fmr = ib_alloc_fmr(pd, params->access, &fmr_attr); - if (IS_ERR(fmr->fmr)) { - pr_warn(PFX "fmr_create failed for FMR %d\n", - i); - kfree(fmr); - goto out_fail; - } - - list_add_tail(&fmr->list, &pool->free_list); - ++pool->pool_size; - } - } - - return pool; - - out_free_pool: - kfree(pool->cache_bucket); - kfree(pool); - - return ERR_PTR(ret); - - out_fail: - ib_destroy_fmr_pool(pool); - - return ERR_PTR(-ENOMEM); -} -EXPORT_SYMBOL(ib_create_fmr_pool); - -/** - * ib_destroy_fmr_pool - Free FMR pool - * @pool:FMR pool to free - * - * Destroy an FMR pool and free all associated resources. - */ -void ib_destroy_fmr_pool(struct ib_fmr_pool *pool) -{ - struct ib_pool_fmr *fmr; - struct ib_pool_fmr *tmp; - LIST_HEAD(fmr_list); - int i; - - kthread_destroy_worker(pool->worker); - ib_fmr_batch_release(pool); - - i = 0; - list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) { - if (fmr->remap_count) { - INIT_LIST_HEAD(&fmr_list); - list_add_tail(&fmr->fmr->list, &fmr_list); - ib_unmap_fmr(&fmr_list); - } - ib_dealloc_fmr(fmr->fmr); - list_del(&fmr->list); - kfree(fmr); - ++i; - } - - if (i < pool->pool_size) - pr_warn(PFX "pool still has %d regions registered\n", - pool->pool_size - i); - - kfree(pool->cache_bucket); - kfree(pool); -} -EXPORT_SYMBOL(ib_destroy_fmr_pool); - -/** - * ib_flush_fmr_pool - Invalidate all unmapped FMRs - * @pool:FMR pool to flush - * - * Ensure that all unmapped FMRs are fully invalidated. - */ -int ib_flush_fmr_pool(struct ib_fmr_pool *pool) -{ - int serial; - struct ib_pool_fmr *fmr, *next; - - /* - * The free_list holds FMRs that may have been used - * but have not been remapped enough times to be dirty. - * Put them on the dirty list now so that the cleanup - * thread will reap them too. - */ - spin_lock_irq(&pool->pool_lock); - list_for_each_entry_safe(fmr, next, &pool->free_list, list) { - if (fmr->remap_count > 0) - list_move(&fmr->list, &pool->dirty_list); - } - spin_unlock_irq(&pool->pool_lock); - - serial = atomic_inc_return(&pool->req_ser); - kthread_queue_work(pool->worker, &pool->work); - - if (wait_event_interruptible(pool->force_wait, - atomic_read(&pool->flush_ser) - serial >= 0)) - return -EINTR; - - return 0; -} -EXPORT_SYMBOL(ib_flush_fmr_pool); - -/** - * ib_fmr_pool_map_phys - Map an FMR from an FMR pool. - * @pool_handle: FMR pool to allocate FMR from - * @page_list: List of pages to map - * @list_len: Number of pages in @page_list - * @io_virtual_address: I/O virtual address for new FMR - */ -struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle, - u64 *page_list, - int list_len, - u64 io_virtual_address) -{ - struct ib_fmr_pool *pool = pool_handle; - struct ib_pool_fmr *fmr; - unsigned long flags; - int result; - - if (list_len < 1 || list_len > pool->max_pages) - return ERR_PTR(-EINVAL); - - spin_lock_irqsave(&pool->pool_lock, flags); - fmr = ib_fmr_cache_lookup(pool, - page_list, - list_len, - io_virtual_address); - if (fmr) { - /* found in cache */ - ++fmr->ref_count; - if (fmr->ref_count == 1) { - list_del(&fmr->list); - } - - spin_unlock_irqrestore(&pool->pool_lock, flags); - - return fmr; - } - - if (list_empty(&pool->free_list)) { - spin_unlock_irqrestore(&pool->pool_lock, flags); - return ERR_PTR(-EAGAIN); - } - - fmr = list_entry(pool->free_list.next, struct ib_pool_fmr, list); - list_del(&fmr->list); - hlist_del_init(&fmr->cache_node); - spin_unlock_irqrestore(&pool->pool_lock, flags); - - result = ib_map_phys_fmr(fmr->fmr, page_list, list_len, - io_virtual_address); - - if (result) { - spin_lock_irqsave(&pool->pool_lock, flags); - list_add(&fmr->list, &pool->free_list); - spin_unlock_irqrestore(&pool->pool_lock, flags); - - pr_warn(PFX "fmr_map returns %d\n", result); - - return ERR_PTR(result); - } - - ++fmr->remap_count; - fmr->ref_count = 1; - - if (pool->cache_bucket) { - fmr->io_virtual_address = io_virtual_address; - fmr->page_list_len = list_len; - memcpy(fmr->page_list, page_list, list_len * sizeof(*page_list)); - - spin_lock_irqsave(&pool->pool_lock, flags); - hlist_add_head(&fmr->cache_node, - pool->cache_bucket + ib_fmr_hash(fmr->page_list[0])); - spin_unlock_irqrestore(&pool->pool_lock, flags); - } - - return fmr; -} -EXPORT_SYMBOL(ib_fmr_pool_map_phys); - -/** - * ib_fmr_pool_unmap - Unmap FMR - * @fmr:FMR to unmap - * - * Unmap an FMR. The FMR mapping may remain valid until the FMR is - * reused (or until ib_flush_fmr_pool() is called). - */ -void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr) -{ - struct ib_fmr_pool *pool; - unsigned long flags; - - pool = fmr->pool; - - spin_lock_irqsave(&pool->pool_lock, flags); - - --fmr->ref_count; - if (!fmr->ref_count) { - if (fmr->remap_count < pool->max_remaps) { - list_add_tail(&fmr->list, &pool->free_list); - } else { - list_add_tail(&fmr->list, &pool->dirty_list); - if (++pool->dirty_len >= pool->dirty_watermark) { - atomic_inc(&pool->req_ser); - kthread_queue_work(pool->worker, &pool->work); - } - } - } - - spin_unlock_irqrestore(&pool->pool_lock, flags); -} -EXPORT_SYMBOL(ib_fmr_pool_unmap); diff --git a/include/rdma/ib_fmr_pool.h b/include/rdma/ib_fmr_pool.h deleted file mode 100644 index 2fd9bfb..0000000 --- a/include/rdma/ib_fmr_pool.h +++ /dev/null @@ -1,93 +0,0 @@ -/* - * Copyright (c) 2004 Topspin Corporation. All rights reserved. - * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - */ - -#if !defined(IB_FMR_POOL_H) -#define IB_FMR_POOL_H - -#include - -struct ib_fmr_pool; - -/** - * struct ib_fmr_pool_param - Parameters for creating FMR pool - * @max_pages_per_fmr:Maximum number of pages per map request. - * @page_shift: Log2 of sizeof "pages" mapped by this fmr - * @access:Access flags for FMRs in pool. - * @pool_size:Number of FMRs to allocate for pool. - * @dirty_watermark:Flush is triggered when @dirty_watermark dirty - * FMRs are present. - * @flush_function:Callback called when unmapped FMRs are flushed and - * more FMRs are possibly available for mapping - * @flush_arg:Context passed to user's flush function. - * @cache:If set, FMRs may be reused after unmapping for identical map - * requests. - */ -struct ib_fmr_pool_param { - int max_pages_per_fmr; - int page_shift; - enum ib_access_flags access; - int pool_size; - int dirty_watermark; - void (*flush_function)(struct ib_fmr_pool *pool, - void *arg); - void *flush_arg; - unsigned cache:1; -}; - -struct ib_pool_fmr { - struct ib_fmr *fmr; - struct ib_fmr_pool *pool; - struct list_head list; - struct hlist_node cache_node; - int ref_count; - int remap_count; - u64 io_virtual_address; - int page_list_len; - u64 page_list[]; -}; - -struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, - struct ib_fmr_pool_param *params); - -void ib_destroy_fmr_pool(struct ib_fmr_pool *pool); - -int ib_flush_fmr_pool(struct ib_fmr_pool *pool); - -struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle, - u64 *page_list, - int list_len, - u64 io_virtual_address); - -void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr); - -#endif /* IB_FMR_POOL_H */