From patchwork Wed May 13 20:04:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kirti Wankhede X-Patchwork-Id: 282819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C3B1C433DF for ; Wed, 13 May 2020 20:39:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7EFD8205ED for ; Wed, 13 May 2020 20:39:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=nvidia.com header.i=@nvidia.com header.b="pbeRmTnj" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7EFD8205ED Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44502 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYy9f-0000Cw-9p for qemu-devel@archiver.kernel.org; Wed, 13 May 2020 16:39:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54346) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy8m-0006vg-EL for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:08 -0400 Received: from hqnvemgate25.nvidia.com ([216.228.121.64]:11646) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy8l-0006x2-C5 for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:08 -0400 Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate25.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Wed, 13 May 2020 13:36:51 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Wed, 13 May 2020 13:38:06 -0700 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Wed, 13 May 2020 13:38:06 -0700 Received: from HQMAIL107.nvidia.com (172.20.187.13) by HQMAIL111.nvidia.com (172.20.187.18) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Wed, 13 May 2020 20:38:05 +0000 Received: from kwankhede-dev.nvidia.com (172.20.13.39) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3 via Frontend Transport; Wed, 13 May 2020 20:37:59 +0000 From: Kirti Wankhede To: , Subject: [PATCH Kernel v19 2/8] vfio iommu: Remove atomicity of ref_count of pinned pages Date: Thu, 14 May 2020 01:34:33 +0530 Message-ID: <1589400279-28522-3-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> References: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1589402211; bh=uUgKJVCFdD4Zp7du3o0u/sQDLu1K2BjFd3xLP7hSugs=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=pbeRmTnjP9OA5iM39o6OLI2wl2NEtMoqg60mbTMv6oT2xFp0cjLTLvDLr1mGAlt4O BLJk+Sf8wKyw1+wjX8R77nROfLNHFF6l0zh6Ml2Q6Dfda+icsZv5K3/RZcseICzlO8 gK2BNijOx5sgF5LxBE9dkn27p8Du6ETgqllZoaOXQ+erkLCdE27qJfKx5qS3Ul6EX9 LyEIZHANRE4RJr6pdDWdTrz92SZWuK4+UnoNcG8R2zX7w/DikzQny/5UFWnqvl9O4X 33cQLNkNo1iOMac/hot4DeM2aKO0+wCJ6YmC+uMQrUFgQdCAhnDiQWNQd0Su/OyyK2 YEaXvh1QpT70Q== Received-SPF: pass client-ip=216.228.121.64; envelope-from=kwankhede@nvidia.com; helo=hqnvemgate25.nvidia.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/13 16:26:46 X-ACL-Warn: Detected OS = Windows 7 or 8 [fuzzy] X-Spam_score_int: -70 X-Spam_score: -7.1 X-Spam_bar: ------- X-Spam_report: (-7.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, kvm@vger.kernel.org, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, Kirti Wankhede , eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" vfio_pfn.ref_count is always updated while holding iommu->lock, using atomic variable is overkill. Signed-off-by: Kirti Wankhede Reviewed-by: Neo Jia Reviewed-by: Eric Auger Reviewed-by: Cornelia Huck --- drivers/vfio/vfio_iommu_type1.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index a0c60f895b24..fa735047b04d 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -112,7 +112,7 @@ struct vfio_pfn { struct rb_node node; dma_addr_t iova; /* Device address */ unsigned long pfn; /* Host pfn */ - atomic_t ref_count; + unsigned int ref_count; }; struct vfio_regions { @@ -233,7 +233,7 @@ static int vfio_add_to_pfn_list(struct vfio_dma *dma, dma_addr_t iova, vpfn->iova = iova; vpfn->pfn = pfn; - atomic_set(&vpfn->ref_count, 1); + vpfn->ref_count = 1; vfio_link_pfn(dma, vpfn); return 0; } @@ -251,7 +251,7 @@ static struct vfio_pfn *vfio_iova_get_vfio_pfn(struct vfio_dma *dma, struct vfio_pfn *vpfn = vfio_find_vpfn(dma, iova); if (vpfn) - atomic_inc(&vpfn->ref_count); + vpfn->ref_count++; return vpfn; } @@ -259,7 +259,8 @@ static int vfio_iova_put_vfio_pfn(struct vfio_dma *dma, struct vfio_pfn *vpfn) { int ret = 0; - if (atomic_dec_and_test(&vpfn->ref_count)) { + vpfn->ref_count--; + if (!vpfn->ref_count) { ret = put_pfn(vpfn->pfn, dma->prot); vfio_remove_from_pfn_list(dma, vpfn); } From patchwork Wed May 13 20:04:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kirti Wankhede X-Patchwork-Id: 282818 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.7 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AFBFC433E0 for ; Wed, 13 May 2020 20:40:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 95D0A205ED for ; Wed, 13 May 2020 20:40:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=nvidia.com header.i=@nvidia.com header.b="MribBTbr" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 95D0A205ED Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50532 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYyAe-0002vx-Ch for qemu-devel@archiver.kernel.org; Wed, 13 May 2020 16:40:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54380) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy8t-0007BA-KN for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:15 -0400 Received: from hqnvemgate25.nvidia.com ([216.228.121.64]:11677) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy8s-0006xP-5v for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:15 -0400 Received: from hqpgpgate102.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate25.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Wed, 13 May 2020 13:36:58 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate102.nvidia.com (PGP Universal service); Wed, 13 May 2020 13:38:12 -0700 X-PGP-Universal: processed; by hqpgpgate102.nvidia.com on Wed, 13 May 2020 13:38:12 -0700 Received: from HQMAIL107.nvidia.com (172.20.187.13) by HQMAIL109.nvidia.com (172.20.187.15) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Wed, 13 May 2020 20:38:12 +0000 Received: from kwankhede-dev.nvidia.com (172.20.13.39) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3 via Frontend Transport; Wed, 13 May 2020 20:38:06 +0000 From: Kirti Wankhede To: , Subject: [PATCH Kernel v19 3/8] vfio iommu: Cache pgsize_bitmap in struct vfio_iommu Date: Thu, 14 May 2020 01:34:34 +0530 Message-ID: <1589400279-28522-4-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> References: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1589402218; bh=p18Tye1V+UyJHxKzz13RY6bXcoaqy1XoKO753WoD52U=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=MribBTbrxv0kRjXi5aQTTazsokWJI5mqNTjdX9ldshlENNvvH6V+6hX0P0DzimCey vMZ27FSCI6trKgQ/VY/cEfGZo6hJeA/uFcAUcR0O29Q5GiFAJEUNLYHR3CBxyRSmP4 uCDpJ//70FhDjAictwVxSwfsTkqHjTPvzb74GBbcgShgT7mHlFsWAnh0mS73RHAUGF ygepTUvWV4fG4jGmaxe8dR4qEY7DwuC8QLR7YFUphFeKY+uLb5jXC2+k5jomlKbltu lnKR6eIzxsH+k8mk6+PERLrZ6CpxO7PfaU77MI6HN24B17VZ8ylmmSjcD52qw+ESX0 geqKLGEXNyPfQ== Received-SPF: pass client-ip=216.228.121.64; envelope-from=kwankhede@nvidia.com; helo=hqnvemgate25.nvidia.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/13 16:26:46 X-ACL-Warn: Detected OS = Windows 7 or 8 [fuzzy] X-Spam_score_int: -70 X-Spam_score: -7.1 X-Spam_bar: ------- X-Spam_report: (-7.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, kvm@vger.kernel.org, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, Kirti Wankhede , eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Calculate and cache pgsize_bitmap when iommu->domain_list is updated. Add iommu->lock protection when cached pgsize_bitmap is accessed. Signed-off-by: Kirti Wankhede Reviewed-by: Neo Jia --- drivers/vfio/vfio_iommu_type1.c | 87 +++++++++++++++++++++++------------------ 1 file changed, 48 insertions(+), 39 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index fa735047b04d..6f09fbabed12 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -69,6 +69,7 @@ struct vfio_iommu { struct rb_root dma_list; struct blocking_notifier_head notifier; unsigned int dma_avail; + uint64_t pgsize_bitmap; bool v2; bool nesting; }; @@ -805,15 +806,14 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma) iommu->dma_avail++; } -static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu) +static void vfio_pgsize_bitmap(struct vfio_iommu *iommu) { struct vfio_domain *domain; - unsigned long bitmap = ULONG_MAX; - mutex_lock(&iommu->lock); + iommu->pgsize_bitmap = ULONG_MAX; + list_for_each_entry(domain, &iommu->domain_list, next) - bitmap &= domain->domain->pgsize_bitmap; - mutex_unlock(&iommu->lock); + iommu->pgsize_bitmap &= domain->domain->pgsize_bitmap; /* * In case the IOMMU supports page sizes smaller than PAGE_SIZE @@ -823,12 +823,10 @@ static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu) * granularity while iommu driver can use the sub-PAGE_SIZE size * to map the buffer. */ - if (bitmap & ~PAGE_MASK) { - bitmap &= PAGE_MASK; - bitmap |= PAGE_SIZE; + if (iommu->pgsize_bitmap & ~PAGE_MASK) { + iommu->pgsize_bitmap &= PAGE_MASK; + iommu->pgsize_bitmap |= PAGE_SIZE; } - - return bitmap; } static int vfio_dma_do_unmap(struct vfio_iommu *iommu, @@ -839,19 +837,28 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, size_t unmapped = 0; int ret = 0, retries = 0; - mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1; + mutex_lock(&iommu->lock); + + mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1; + + if (unmap->iova & mask) { + ret = -EINVAL; + goto unlock; + } + + if (!unmap->size || unmap->size & mask) { + ret = -EINVAL; + goto unlock; + } - if (unmap->iova & mask) - return -EINVAL; - if (!unmap->size || unmap->size & mask) - return -EINVAL; if (unmap->iova + unmap->size - 1 < unmap->iova || - unmap->size > SIZE_MAX) - return -EINVAL; + unmap->size > SIZE_MAX) { + ret = -EINVAL; + goto unlock; + } WARN_ON(mask & PAGE_MASK); again: - mutex_lock(&iommu->lock); /* * vfio-iommu-type1 (v1) - User mappings were coalesced together to @@ -930,6 +937,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, blocking_notifier_call_chain(&iommu->notifier, VFIO_IOMMU_NOTIFY_DMA_UNMAP, &nb_unmap); + mutex_lock(&iommu->lock); goto again; } unmapped += dma->size; @@ -1045,24 +1053,28 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, if (map->size != size || map->vaddr != vaddr || map->iova != iova) return -EINVAL; - mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1; - - WARN_ON(mask & PAGE_MASK); - /* READ/WRITE from device perspective */ if (map->flags & VFIO_DMA_MAP_FLAG_WRITE) prot |= IOMMU_WRITE; if (map->flags & VFIO_DMA_MAP_FLAG_READ) prot |= IOMMU_READ; - if (!prot || !size || (size | iova | vaddr) & mask) - return -EINVAL; + mutex_lock(&iommu->lock); - /* Don't allow IOVA or virtual address wrap */ - if (iova + size - 1 < iova || vaddr + size - 1 < vaddr) - return -EINVAL; + mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1; - mutex_lock(&iommu->lock); + WARN_ON(mask & PAGE_MASK); + + if (!prot || !size || (size | iova | vaddr) & mask) { + ret = -EINVAL; + goto out_unlock; + } + + /* Don't allow IOVA or virtual address wrap */ + if (iova + size - 1 < iova || vaddr + size - 1 < vaddr) { + ret = -EINVAL; + goto out_unlock; + } if (vfio_find_dma(iommu, iova, size)) { ret = -EEXIST; @@ -1793,6 +1805,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, } list_add(&domain->next, &iommu->domain_list); + vfio_pgsize_bitmap(iommu); done: /* Delete the old one and insert new iova list */ vfio_iommu_iova_insert_copy(iommu, &iova_copy); @@ -2004,6 +2017,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, list_del(&domain->next); kfree(domain); vfio_iommu_aper_expand(iommu, &iova_copy); + vfio_pgsize_bitmap(iommu); } break; } @@ -2136,8 +2150,6 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu, size_t size; int iovas = 0, i = 0, ret; - mutex_lock(&iommu->lock); - list_for_each_entry(iova, &iommu->iova_list, list) iovas++; @@ -2146,17 +2158,14 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu, * Return 0 as a container with a single mdev device * will have an empty list */ - ret = 0; - goto out_unlock; + return 0; } size = sizeof(*cap_iovas) + (iovas * sizeof(*cap_iovas->iova_ranges)); cap_iovas = kzalloc(size, GFP_KERNEL); - if (!cap_iovas) { - ret = -ENOMEM; - goto out_unlock; - } + if (!cap_iovas) + return -ENOMEM; cap_iovas->nr_iovas = iovas; @@ -2169,8 +2178,6 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu, ret = vfio_iommu_iova_add_cap(caps, cap_iovas, size); kfree(cap_iovas); -out_unlock: - mutex_unlock(&iommu->lock); return ret; } @@ -2215,11 +2222,13 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, info.cap_offset = 0; /* output, no-recopy necessary */ } + mutex_lock(&iommu->lock); info.flags = VFIO_IOMMU_INFO_PGSIZES; - info.iova_pgsizes = vfio_pgsize_bitmap(iommu); + info.iova_pgsizes = iommu->pgsize_bitmap; ret = vfio_iommu_iova_build_caps(iommu, &caps); + mutex_unlock(&iommu->lock); if (ret) return ret; From patchwork Wed May 13 20:04:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kirti Wankhede X-Patchwork-Id: 282817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C08F9C433E0 for ; Wed, 13 May 2020 20:41:42 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EB0D3205ED for ; Wed, 13 May 2020 20:41:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=nvidia.com header.i=@nvidia.com header.b="g4rC62X8" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EB0D3205ED Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54842 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYyCD-0005cx-NU for qemu-devel@archiver.kernel.org; Wed, 13 May 2020 16:41:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54416) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy9E-000827-NJ for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:36 -0400 Received: from hqnvemgate26.nvidia.com ([216.228.121.65]:12780) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYy9C-0006yv-Rx for qemu-devel@nongnu.org; Wed, 13 May 2020 16:38:36 -0400 Received: from hqpgpgate102.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate26.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Wed, 13 May 2020 13:38:21 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate102.nvidia.com (PGP Universal service); Wed, 13 May 2020 13:38:33 -0700 X-PGP-Universal: processed; by hqpgpgate102.nvidia.com on Wed, 13 May 2020 13:38:33 -0700 Received: from HQMAIL107.nvidia.com (172.20.187.13) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Wed, 13 May 2020 20:38:32 +0000 Received: from kwankhede-dev.nvidia.com (172.20.13.39) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3 via Frontend Transport; Wed, 13 May 2020 20:38:26 +0000 From: Kirti Wankhede To: , Subject: [PATCH Kernel v19 6/8] vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap Date: Thu, 14 May 2020 01:34:37 +0530 Message-ID: <1589400279-28522-7-git-send-email-kwankhede@nvidia.com> X-Mailer: git-send-email 2.7.0 In-Reply-To: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> References: <1589400279-28522-1-git-send-email-kwankhede@nvidia.com> X-NVConfidentiality: public MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1589402301; bh=9LD0a/Z1ShtI/FKdxjy1MYmdGpWtLaS09SsstiNYbcw=; h=X-PGP-Universal:From:To:CC:Subject:Date:Message-ID:X-Mailer: In-Reply-To:References:X-NVConfidentiality:MIME-Version: Content-Type; b=g4rC62X8V/1QYSShMfUbYXJcsT9OGElk/QctCmltmW3oqeeRIGhmas1BPwx/WG2Sn /Emv/YkNLlh1vWu03uNa7rqN7z7LKlvVBv6ZD2Y2khkC/TUWpN04EybnlI8IjmUbMn 3s0zkuGNoEItSRwjJovY5FFKSafJc57JeJCD5Kv9ph+gDW89WfqomBeIgUilsxLTKL uz4SEz5PUZV8D2mHWxndmFrOKerfpt9NjBYGRUqtdWbe9SAh+/MarexEZfUfYtdvsS hr9JeYj20Ab1oMh+azO8hupQs9g+GQE8jyO5Y9BCqIABoGRXpWX0dY4qbUJpJHm5/v uBGqKbuA1y62Q== Received-SPF: pass client-ip=216.228.121.65; envelope-from=kwankhede@nvidia.com; helo=hqnvemgate26.nvidia.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/13 16:38:33 X-ACL-Warn: Detected OS = Windows 7 or 8 [fuzzy] X-Spam_score_int: -70 X-Spam_score: -7.1 X-Spam_bar: ------- X-Spam_report: (-7.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_HI=-5, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhengxiao.zx@Alibaba-inc.com, kevin.tian@intel.com, yi.l.liu@intel.com, yan.y.zhao@intel.com, kvm@vger.kernel.org, eskultet@redhat.com, ziye.yang@intel.com, qemu-devel@nongnu.org, cohuck@redhat.com, shuangtai.tst@alibaba-inc.com, dgilbert@redhat.com, zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com, aik@ozlabs.ru, Kirti Wankhede , eauger@redhat.com, felipe@nutanix.com, jonathan.davies@nutanix.com, changpeng.liu@intel.com, Ken.Xue@amd.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" DMA mapped pages, including those pinned by mdev vendor drivers, might get unpinned and unmapped while migration is active and device is still running. For example, in pre-copy phase while guest driver could access those pages, host device or vendor driver can dirty these mapped pages. Such pages should be marked dirty so as to maintain memory consistency for a user making use of dirty page tracking. To get bitmap during unmap, user should allocate memory for bitmap, set size of allocated memory, set page size to be considered for bitmap and set flag VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP. Signed-off-by: Kirti Wankhede Reviewed-by: Neo Jia --- drivers/vfio/vfio_iommu_type1.c | 102 +++++++++++++++++++++++++++++++++++----- include/uapi/linux/vfio.h | 10 ++++ 2 files changed, 99 insertions(+), 13 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 469b09185b83..4358be26ff80 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -195,11 +195,15 @@ static void vfio_unlink_dma(struct vfio_iommu *iommu, struct vfio_dma *old) static int vfio_dma_bitmap_alloc(struct vfio_dma *dma, size_t pgsize) { uint64_t npages = dma->size / pgsize; + size_t bitmap_size; if (npages > DIRTY_BITMAP_PAGES_MAX) return -EINVAL; - dma->bitmap = kvzalloc(DIRTY_BITMAP_BYTES(npages), GFP_KERNEL); + /* Allocate extra 64 bits which are used for bitmap manipulation */ + bitmap_size = DIRTY_BITMAP_BYTES(npages) + sizeof(u64); + + dma->bitmap = kvzalloc(bitmap_size, GFP_KERNEL); if (!dma->bitmap) return -ENOMEM; @@ -979,23 +983,25 @@ static int verify_bitmap_size(uint64_t npages, uint64_t bitmap_size) } static int vfio_dma_do_unmap(struct vfio_iommu *iommu, - struct vfio_iommu_type1_dma_unmap *unmap) + struct vfio_iommu_type1_dma_unmap *unmap, + struct vfio_bitmap *bitmap) { - uint64_t mask; struct vfio_dma *dma, *dma_last = NULL; - size_t unmapped = 0; - int ret = 0, retries = 0; + size_t unmapped = 0, pgsize; + int ret = 0, retries = 0, cnt = 0; + unsigned long pgshift, shift = 0, leftover; mutex_lock(&iommu->lock); - mask = ((uint64_t)1 << __ffs(iommu->pgsize_bitmap)) - 1; + pgshift = __ffs(iommu->pgsize_bitmap); + pgsize = (size_t)1 << pgshift; - if (unmap->iova & mask) { + if (unmap->iova & (pgsize - 1)) { ret = -EINVAL; goto unlock; } - if (!unmap->size || unmap->size & mask) { + if (!unmap->size || unmap->size & (pgsize - 1)) { ret = -EINVAL; goto unlock; } @@ -1006,9 +1012,15 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, goto unlock; } - WARN_ON(mask & PAGE_MASK); -again: + /* When dirty tracking is enabled, allow only min supported pgsize */ + if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) && + (!iommu->dirty_page_tracking || (bitmap->pgsize != pgsize))) { + ret = -EINVAL; + goto unlock; + } + WARN_ON((pgsize - 1) & PAGE_MASK); +again: /* * vfio-iommu-type1 (v1) - User mappings were coalesced together to * avoid tracking individual mappings. This means that the granularity @@ -1046,6 +1058,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, ret = -EINVAL; goto unlock; } + dma = vfio_find_dma(iommu, unmap->iova + unmap->size - 1, 0); if (dma && dma->iova + dma->size != unmap->iova + unmap->size) { ret = -EINVAL; @@ -1063,6 +1076,39 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, if (dma->task->mm != current->mm) break; + if ((unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) && + (dma_last != dma)) { + unsigned int nbits = dma->size >> pgshift; + int curr_lcnt = nbits / BITS_PER_LONG; + + /* + * mark all pages dirty if all pages are pinned and + * mapped. + */ + if (dma->iommu_mapped) + bitmap_set(dma->bitmap, 0, nbits); + + if (shift) { + bitmap_shift_left(dma->bitmap, dma->bitmap, + shift, nbits + shift); + bitmap_or(dma->bitmap, dma->bitmap, &leftover, + shift); + nbits += shift; + curr_lcnt = nbits / BITS_PER_LONG; + } + + if (copy_to_user((void __user *)bitmap->data + cnt, + dma->bitmap, curr_lcnt * sizeof(u64))) { + ret = -EFAULT; + break; + } + + shift = nbits % BITS_PER_LONG; + if (shift) + leftover = *(u64 *)(dma->bitmap + curr_lcnt); + cnt += curr_lcnt; + } + if (!RB_EMPTY_ROOT(&dma->pfn_list)) { struct vfio_iommu_type1_dma_unmap nb_unmap; @@ -1093,6 +1139,13 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu, vfio_remove_dma(iommu, dma); } + if (!ret && (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) && + shift) { + if (copy_to_user((void __user *)bitmap->data + cnt, &leftover, + sizeof(leftover))) + ret = -EFAULT; + } + unlock: mutex_unlock(&iommu->lock); @@ -2426,17 +2479,40 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, } else if (cmd == VFIO_IOMMU_UNMAP_DMA) { struct vfio_iommu_type1_dma_unmap unmap; - long ret; + struct vfio_bitmap bitmap = { 0 }; + int ret; minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size); if (copy_from_user(&unmap, (void __user *)arg, minsz)) return -EFAULT; - if (unmap.argsz < minsz || unmap.flags) + if (unmap.argsz < minsz || + unmap.flags & ~VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) return -EINVAL; - ret = vfio_dma_do_unmap(iommu, &unmap); + if (unmap.flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) { + unsigned long pgshift; + + if (unmap.argsz < (minsz + sizeof(bitmap))) + return -EINVAL; + + if (copy_from_user(&bitmap, + (void __user *)(arg + minsz), + sizeof(bitmap))) + return -EFAULT; + + if (!access_ok((void __user *)bitmap.data, bitmap.size)) + return -EINVAL; + + pgshift = __ffs(bitmap.pgsize); + ret = verify_bitmap_size(unmap.size >> pgshift, + bitmap.size); + if (ret) + return ret; + } + + ret = vfio_dma_do_unmap(iommu, &unmap, &bitmap); if (ret) return ret; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 5f359c63f5ef..e3cbf8b78623 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -1048,12 +1048,22 @@ struct vfio_bitmap { * field. No guarantee is made to the user that arbitrary unmaps of iova * or size different from those used in the original mapping call will * succeed. + * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get dirty bitmap + * before unmapping IO virtual addresses. When this flag is set, user must + * provide data[] as structure vfio_bitmap. User must allocate memory to get + * bitmap and must set size of allocated memory in vfio_bitmap.size field. + * A bit in bitmap represents one page of user provided page size in 'pgsize', + * consecutively starting from iova offset. Bit set indicates page at that + * offset from iova is dirty. Bitmap of pages in the range of unmapped size is + * returned in vfio_bitmap.data */ struct vfio_iommu_type1_dma_unmap { __u32 argsz; __u32 flags; +#define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0) __u64 iova; /* IO virtual address */ __u64 size; /* Size of mapping (bytes) */ + __u8 data[]; }; #define VFIO_IOMMU_UNMAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 14)