Message ID | 20231020093246.17015-1-yi.l.liu@intel.com |
---|---|
Series | Add Intel VT-d nested translation (part 1/2) |
On 10/20/23 5:32 PM, Yi Liu wrote:
> From: Lu Baolu <baolu.lu@linux.intel.com>
>
> When remapping hardware is configured by system software in scalable mode
> as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry,
> it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled)
> in first-stage page-table entries even when second-stage mappings indicate
> that corresponding first-stage page-table is Read-Only.
>
> As the result, contents of pages designated by VMM as Read-Only can be
> modified by IOMMU via PML5E (PML4E for 4-level tables) access as part of
> address translation process due to DMAs issued by Guest.
>
> This disallows read-only mappings in the domain that is supposed to be used
> as nested parent. Reference from Sapphire Rapids Specification Update [1],
> errata details, SPR17. Userspace should know this limitation by checking
> the IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 flag reported in the IOMMU_GET_HW_INFO
> ioctl.
>
> [1] https://www.intel.com/content/www/us/en/content-details/772415/content-details.html
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/iommu/intel/iommu.c  |  9 +++++++++
>  drivers/iommu/intel/iommu.h  |  1 +
>  include/uapi/linux/iommufd.h | 12 +++++++++++-
>  3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index c7704e7efd4a..a0341a069fbf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2193,6 +2193,11 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
>  	if ((prot & (DMA_PTE_READ|DMA_PTE_WRITE)) == 0)
>  		return -EINVAL;
>
> +	if (!(prot & DMA_PTE_WRITE) && domain->is_nested_parent) {
> +		pr_err_ratelimited("Read-only mapping is disallowed on the domain which serves as the parent in a nested configuration, due to HW errata (ERRATA_772415_SPR17)\n");
> +		return -EINVAL;
> +	}
> +
>  	attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
>  	attr |= DMA_FL_PTE_PRESENT;
>  	if (domain->use_first_level) {
> @@ -4101,6 +4106,9 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
>  	domain = iommu_domain_alloc(dev->bus);
>  	if (!domain)
>  		return ERR_PTR(-ENOMEM);
> +	container_of(domain,
> +		     struct dmar_domain,
> +		     domain)->is_nested_parent = request_nest_parent;

How about

	to_dmar_domain(domain)->is_nested_parent = ...;

?

I would also prefer to introduce is_nested_parent_domain to the user
domain allocation patch (patch 7/8). This field should be checked when
allocating a nested user domain.

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 8f81a5c9fcc0..d3f6bc1f6590 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4121,6 +4121,8 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 			return ERR_PTR(-EINVAL);
 		if (request_nest_parent)
 			return ERR_PTR(-EINVAL);
+		if (!to_dmar_domain(parent)->is_nested_parent)
+			return ERR_PTR(-EINVAL);

 		return intel_nested_domain_alloc(parent, user_data);
 	}

Best regards,
baolu
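For readers following the container_of() versus to_dmar_domain() suggestion above, here is a minimal userspace sketch of why the two forms are equivalent. The structures are hypothetical, heavily trimmed stand-ins for the kernel's, container_of() is re-implemented in simplified form, and mark_nested_parent() is an illustrative helper that does not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed versions of the real structures. */
struct iommu_domain {
	unsigned int type;
};

struct dmar_domain {
	bool is_nested_parent;
	struct iommu_domain domain;	/* embedded generic domain */
};

/* Wrapper with the same shape as the helper suggested in the review. */
static inline struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
{
	return container_of(dom, struct dmar_domain, domain);
}

/*
 * Illustrative helper (not in the driver): set the flag on the
 * dmar_domain that embeds @dom and return the stored value.
 */
static bool mark_nested_parent(struct iommu_domain *dom,
			       bool request_nest_parent)
{
	struct dmar_domain *dmar = to_dmar_domain(dom);

	/* The wrapper must resolve to the enclosing structure. */
	assert(&dmar->domain == dom);
	dmar->is_nested_parent = request_nest_parent;
	return dmar->is_nested_parent;
}
```

The wrapper only hides the three-argument container_of() call, so the assignment fits on one line and reads the same as the other to_dmar_domain() users in the driver.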
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Friday, October 20, 2023 7:49 PM
>
> On 2023/10/20 17:32, Yi Liu wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> >
> > This adds helper for accepting user parameters and allocate a nested
> > domain.
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > ---
> >  drivers/iommu/intel/Makefile |  2 +-
> >  drivers/iommu/intel/iommu.h  |  2 ++
> >  drivers/iommu/intel/nested.c | 55 ++++++++++++++++++++++++++++++++++++
> >  3 files changed, 58 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/iommu/intel/nested.c
> >
> > diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile
> > index 7af3b8a4f2a0..5dabf081a779 100644
> > --- a/drivers/iommu/intel/Makefile
> > +++ b/drivers/iommu/intel/Makefile
> > @@ -1,6 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-$(CONFIG_DMAR_TABLE) += dmar.o
> > -obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o
> > +obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o
> >  obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o
> >  obj-$(CONFIG_DMAR_PERF) += perf.o
> >  obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o
> > diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> > index a91077a063ee..ff55184456dd 100644
> > --- a/drivers/iommu/intel/iommu.h
> > +++ b/drivers/iommu/intel/iommu.h
> > @@ -866,6 +866,8 @@ void *alloc_pgtable_page(int node, gfp_t gfp);
> >  void free_pgtable_page(void *vaddr);
> >  void iommu_flush_write_buffer(struct intel_iommu *iommu);
> >  struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
> > +struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *s2_domain,
> > +					       const struct iommu_user_data *user_data);
> >
> >  #ifdef CONFIG_INTEL_IOMMU_SVM
> >  void intel_svm_check(struct intel_iommu *iommu);
> > diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
> > new file mode 100644
> > index 000000000000..5a2920a98e47
> > --- /dev/null
> > +++ b/drivers/iommu/intel/nested.c
> > @@ -0,0 +1,55 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * nested.c - nested mode translation support
> > + *
> > + * Copyright (C) 2023 Intel Corporation
> > + *
> > + * Author: Lu Baolu <baolu.lu@linux.intel.com>
> > + *         Jacob Pan <jacob.jun.pan@linux.intel.com>
> > + *         Yi Liu <yi.l.liu@intel.com>
> > + */
> > +
> > +#define pr_fmt(fmt)	"DMAR: " fmt
> > +
> > +#include <linux/iommu.h>
> > +
> > +#include "iommu.h"
> > +
> > +static void intel_nested_domain_free(struct iommu_domain *domain)
> > +{
> > +	kfree(to_dmar_domain(domain));
> > +}
> > +
> > +static const struct iommu_domain_ops intel_nested_domain_ops = {
> > +	.free		= intel_nested_domain_free,
> > +};
> > +
> > +struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *s2_domain,
> > +					       const struct iommu_user_data *user_data)
> > +{
> > +	struct iommu_hwpt_vtd_s1 vtd;
> > +	struct dmar_domain *domain;
> > +	int ret;
> > +
> > +	ret = iommu_copy_struct_from_user(&vtd, user_data,
> > +					  IOMMU_HWPT_DATA_VTD_S1, __reserved);
> > +	if (ret)
> > +		return ERR_PTR(ret);
> > +
> > +	domain = kzalloc(sizeof(*domain), GFP_KERNEL_ACCOUNT);
> > +	if (!domain)
> > +		return NULL;
>
> 	return ERR_PTR(-ENOMEM);
> ?

Yes, good catch!

> > +
> > +	domain->use_first_level = true;
> > +	domain->s2_domain = to_dmar_domain(s2_domain);
> > +	domain->s1_pgtbl = vtd.pgtbl_addr;
> > +	domain->s1_cfg = vtd;
> > +	domain->domain.ops = &intel_nested_domain_ops;
> > +	domain->domain.type = IOMMU_DOMAIN_NESTED;
> > +	INIT_LIST_HEAD(&domain->devices);
> > +	INIT_LIST_HEAD(&domain->dev_pasids);
> > +	spin_lock_init(&domain->lock);
> > +	xa_init(&domain->iommu_array);
> > +
> > +	return &domain->domain;
> > +}
>
> Best regards,
> baolu
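The NULL versus ERR_PTR(-ENOMEM) point above matters when the caller tests the result with IS_ERR(), which the other ERR_PTR() returns in this function suggest is the convention here. The following is a userspace sketch under that assumption: ERR_PTR(), PTR_ERR() and IS_ERR() are simplified re-implementations of the kernel macros, and struct fake_domain and alloc_domain() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object standing in for struct dmar_domain. */
struct fake_domain {
	int id;
};

/*
 * Illustrative allocator. @simulate_oom forces the failure path;
 * @buggy selects the original "return NULL" behaviour instead of
 * the corrected "return ERR_PTR(-ENOMEM)".
 */
static struct fake_domain *alloc_domain(bool simulate_oom, bool buggy)
{
	struct fake_domain *d = simulate_oom ? NULL : calloc(1, sizeof(*d));

	if (!d)
		return buggy ? NULL : ERR_PTR(-ENOMEM);
	return d;
}
```

A caller that only checks IS_ERR() treats the NULL from the buggy variant as success and would then dereference it, while the corrected variant is caught and PTR_ERR() recovers -ENOMEM.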
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Saturday, October 21, 2023 11:24 AM
>
> On 10/20/23 5:32 PM, Yi Liu wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> >
> > When remapping hardware is configured by system software in scalable mode
> > as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry,
> > it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled)
> > in first-stage page-table entries even when second-stage mappings indicate
> > that corresponding first-stage page-table is Read-Only.
> >
> > As the result, contents of pages designated by VMM as Read-Only can be
> > modified by IOMMU via PML5E (PML4E for 4-level tables) access as part of
> > address translation process due to DMAs issued by Guest.
> >
> > This disallows read-only mappings in the domain that is supposed to be used
> > as nested parent. Reference from Sapphire Rapids Specification Update [1],
> > errata details, SPR17. Userspace should know this limitation by checking
> > the IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 flag reported in the IOMMU_GET_HW_INFO
> > ioctl.
> >
> > [1] https://www.intel.com/content/www/us/en/content-details/772415/content-details.html
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > ---
> >  drivers/iommu/intel/iommu.c  |  9 +++++++++
> >  drivers/iommu/intel/iommu.h  |  1 +
> >  include/uapi/linux/iommufd.h | 12 +++++++++++-
> >  3 files changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index c7704e7efd4a..a0341a069fbf 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -2193,6 +2193,11 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
> >  	if ((prot & (DMA_PTE_READ|DMA_PTE_WRITE)) == 0)
> >  		return -EINVAL;
> >
> > +	if (!(prot & DMA_PTE_WRITE) && domain->is_nested_parent) {
> > +		pr_err_ratelimited("Read-only mapping is disallowed on the domain which serves as the parent in a nested configuration, due to HW errata (ERRATA_772415_SPR17)\n");
> > +		return -EINVAL;
> > +	}
> > +
> >  	attr = prot & (DMA_PTE_READ | DMA_PTE_WRITE | DMA_PTE_SNP);
> >  	attr |= DMA_FL_PTE_PRESENT;
> >  	if (domain->use_first_level) {
> > @@ -4101,6 +4106,9 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
> >  	domain = iommu_domain_alloc(dev->bus);
> >  	if (!domain)
> >  		return ERR_PTR(-ENOMEM);
> > +	container_of(domain,
> > +		     struct dmar_domain,
> > +		     domain)->is_nested_parent = request_nest_parent;
>
> How about
> 	to_dmar_domain(domain)->is_nested_parent = ...;
> ?

Yes.

>
> I would also prefer to introduce is_nested_parent_domain to the user
> domain allocation patch (patch 7/8). This field should be checked when
> allocating a nested user domain.

Actually, no need. This should be a common check, so iommufd core already
has the check. So the parent should be a nest parent domain, otherwise already
returned in iommufd.

+	if (!parent->nest_parent)
+		return ERR_PTR(-EINVAL);

https://lore.kernel.org/linux-iommu/20231020091946.12173-8-yi.l.liu@intel.com/

> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 8f81a5c9fcc0..d3f6bc1f6590 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4121,6 +4121,8 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
>  			return ERR_PTR(-EINVAL);
>  		if (request_nest_parent)
>  			return ERR_PTR(-EINVAL);
> +		if (!to_dmar_domain(parent)->is_nested_parent)
> +			return ERR_PTR(-EINVAL);
>
>  		return intel_nested_domain_alloc(parent, user_data);
>  	}
>
> Best regards,
> baolu
On 2023/10/23 19:15, Liu, Yi L wrote:
>> I would also prefer to introduce is_nested_parent_domain to the user
>> domain allocation patch (patch 7/8). This field should be checked when
>> allocating a nested user domain.
> Actually, no need. This should be a common check, so iommufd core already
> has the check. So the parent should be a nest parent domain, otherwise already
> returned in iommufd.
>
> +	if (!parent->nest_parent)
> +		return ERR_PTR(-EINVAL);

I know this will not cause errors in the code. But since you are
introducing an is_parent property in the VT-d driver, the integrity of the
property should be ensured. That will make the code more readable and
maintainable.

Best regards,
baolu
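To make the two checks discussed in this thread concrete, here is a small userspace sketch. It assumes a trimmed, hypothetical struct dmar_domain; the DMA_PTE_* values are stand-in bit definitions, and check_mapping_prot() and check_nested_alloc() are illustrative names, not driver functions. The first mirrors the read-only rejection added to __domain_mapping(), the second mirrors the driver-level parent check proposed in the review.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in permission bits for this sketch. */
#define DMA_PTE_READ	(1u << 0)
#define DMA_PTE_WRITE	(1u << 1)

/* Hypothetical, trimmed version of the driver structure. */
struct dmar_domain {
	bool is_nested_parent;
};

/*
 * Mirrors the check added to __domain_mapping(): a nested parent
 * must not receive read-only mappings because of the SPR17 erratum.
 */
static int check_mapping_prot(const struct dmar_domain *domain,
			      unsigned int prot)
{
	if ((prot & (DMA_PTE_READ | DMA_PTE_WRITE)) == 0)
		return -EINVAL;

	if (!(prot & DMA_PTE_WRITE) && domain->is_nested_parent)
		return -EINVAL;

	return 0;
}

/*
 * Mirrors the proposed driver-level check: refuse to build a nested
 * domain on a parent that was not allocated as a nested parent.
 */
static int check_nested_alloc(const struct dmar_domain *parent)
{
	if (!parent || !parent->is_nested_parent)
		return -EINVAL;

	return 0;
}
```

The second check duplicates what the iommufd core already enforces, which is the trade-off debated above: it costs one extra branch, and in exchange the driver no longer relies on its caller to keep is_nested_parent consistent.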