mbox series

[v7,0/3] Add Intel VT-d nested translation (part 2/2)

Message ID 20231117131816.24359-1-yi.l.liu@intel.com
Headers show
Series Add Intel VT-d nested translation (part 2/2) | expand

Message

Yi Liu Nov. 17, 2023, 1:18 p.m. UTC
This is the second part to add Intel VT-d nested translation based on IOMMUFD
nesting infrastructure. As the iommufd nesting infrastructure series [1],
iommu core supports new ops to invalidate the cache after the modifictions
in stage-1 page table. So far, the cache invalidation data is vendor specific,
the data_type (IOMMU_HWPT_DATA_VTD_S1) defined for the vendor specific HWPT
allocation is reused in the cache invalidation path. User should provide the
correct data_type that suit with the type used in HWPT allocation.

IOMMU_HWPT_INVALIDATE iotcl returns an error in @out_driver_error_code. However
Intel VT-d does not define error code so far, so it's not easy to pre-define it
in iommufd neither. As a result, this field should just be ignored on VT-d platform.

Complete code can be found in [2], corresponding QEMU could can be found in [3].

[1] https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.com/#t
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1

Change log:

v7:
 - No much change, just rebase on top of 6.7-rc1

v6: https://lore.kernel.org/linux-iommu/20231020093719.18725-1-yi.l.liu@intel.com/
 - Address comments from Kevin
 - Split the VT-d nesting series into two parts (Jason)

v5: https://lore.kernel.org/linux-iommu/20230921075431.125239-1-yi.l.liu@intel.com/
 - Add Kevin's r-b for patch 2, 3 ,5 8, 10
 - Drop enforce_cache_coherency callback from the nested type domain ops (Kevin)
 - Remove duplicate agaw check in patch 04 (Kevin)
 - Remove duplicate domain_update_iommu_cap() in patch 06 (Kevin)
 - Check parent's force_snooping to set pgsnp in the pasid entry (Kevin)
 - uapi data structure check (Kevin)
 - Simplify the errata handling as user can allocate nested parent domain

v4: https://lore.kernel.org/linux-iommu/20230724111335.107427-1-yi.l.liu@intel.com/
 - Remove ascii art tables (Jason)
 - Drop EMT (Tina, Jason)
 - Drop MTS and related definitions (Kevin)
 - Rename macro IOMMU_VTD_PGTBL_ to IOMMU_VTD_S1_ (Kevin)
 - Rename struct iommu_hwpt_intel_vtd_ to iommu_hwpt_vtd_ (Kevin)
 - Rename struct iommu_hwpt_intel_vtd to iommu_hwpt_vtd_s1 (Kevin)
 - Put the vendor specific hwpt alloc data structure before enuma iommu_hwpt_type (Kevin)
 - Do not trim the higher page levels of S2 domain in nested domain attachment as the
   S2 domain may have been used independently. (Kevin)
 - Remove the first-stage pgd check against the maximum address of s2_domain as hw
   can check it anyhow. It makes sense to check every pfns used in the stage-1 page
   table. But it cannot make it. So just leave it to hw. (Kevin)
 - Split the iotlb flush part into an order of uapi, helper and callback implementation (Kevin)
 - Change the policy of VT-d nesting errata, disallow RO mapping once a domain is used
   as parent domain of a nested domain. This removes the nested_users counting. (Kevin)
 - Minor fix for "make htmldocs"

v3: https://lore.kernel.org/linux-iommu/20230511145110.27707-1-yi.l.liu@intel.com/
 - Further split the patches into an order of adding helpers for nested
   domain, iotlb flush, nested domain attachment and nested domain allocation
   callback, then report the hw_info to userspace.
 - Add batch support in cache invalidation from userspace
 - Disallow nested translation usage if RO mappings exists in stage-2 domain
   due to errata on readonly mappings on Sapphire Rapids platform.

v2: https://lore.kernel.org/linux-iommu/20230309082207.612346-1-yi.l.liu@intel.com/
 - The iommufd infrastructure is split to be separate series.

v1: https://lore.kernel.org/linux-iommu/20230209043153.14964-1-yi.l.liu@intel.com/

Regards,
	Yi Liu

Yi Liu (3):
  iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
  iommu/vt-d: Make iotlb flush helpers to be extern
  iommu/vt-d: Add iotlb flush for nested domain

 drivers/iommu/intel/iommu.c  | 10 +++----
 drivers/iommu/intel/iommu.h  |  6 ++++
 drivers/iommu/intel/nested.c | 54 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/iommufd.h | 36 ++++++++++++++++++++++++
 4 files changed, 101 insertions(+), 5 deletions(-)

Comments

Jason Gunthorpe Dec. 6, 2023, 6:56 p.m. UTC | #1
On Fri, Nov 17, 2023 at 05:18:16AM -0800, Yi Liu wrote:
> +static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
> +					      struct iommu_user_data_array *array,
> +					      u32 *cerror_idx)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +	struct iommu_hwpt_vtd_s1_invalidate inv_info;
> +	u32 index;
> +	int ret;
> +
> +	/* REVISIT:
> +	 * VT-d has defined ITE, ICE, IQE for invalidation failure per hardware,
> +	 * but no error code yet, so just set the error code to be 0.
> +	 */
> +	*cerror_idx = 0;
> +
> +	for (index = 0; index < array->entry_num; index++) {
> +		ret = iommu_copy_struct_from_user_array(&inv_info, array,
> +							IOMMU_HWPT_DATA_VTD_S1,
> +							index, __reserved);
> +		if (ret) {
> +			pr_err_ratelimited("Failed to fetch invalidation request\n");
> +			break;

No error prints on ioctls!

> +		if (inv_info.addr == 0 && inv_info.npages == -1)
> +			intel_flush_iotlb_all(domain);

-1 is clearer written as U64_MAX - same remark for the comment
documenting it.

Jason
Yi Liu Dec. 12, 2023, 3:58 a.m. UTC | #2
On 2023/11/20 16:32, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Friday, November 17, 2023 9:18 PM
>> +
>> +		if (inv_info.__reserved || (inv_info.flags &
>> ~IOMMU_VTD_INV_FLAGS_LEAF) ||
>> +		    !IS_ALIGNED(inv_info.addr, VTD_PAGE_SIZE)) {
>> +			ret = -EINVAL;
>> +			break;
>> +		}
> 
> -EOPNOTSUPP for the first two checks.
> 
yes.
Yi Liu Dec. 12, 2023, 3:59 a.m. UTC | #3
On 2023/12/7 02:56, Jason Gunthorpe wrote:
> On Fri, Nov 17, 2023 at 05:18:16AM -0800, Yi Liu wrote:
>> +static int intel_nested_cache_invalidate_user(struct iommu_domain *domain,
>> +					      struct iommu_user_data_array *array,
>> +					      u32 *cerror_idx)
>> +{
>> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
>> +	struct iommu_hwpt_vtd_s1_invalidate inv_info;
>> +	u32 index;
>> +	int ret;
>> +
>> +	/* REVISIT:
>> +	 * VT-d has defined ITE, ICE, IQE for invalidation failure per hardware,
>> +	 * but no error code yet, so just set the error code to be 0.
>> +	 */
>> +	*cerror_idx = 0;
>> +
>> +	for (index = 0; index < array->entry_num; index++) {
>> +		ret = iommu_copy_struct_from_user_array(&inv_info, array,
>> +							IOMMU_HWPT_DATA_VTD_S1,
>> +							index, __reserved);
>> +		if (ret) {
>> +			pr_err_ratelimited("Failed to fetch invalidation request\n");
>> +			break;
> 
> No error prints on ioctls!

ok, will remove it.

> 
>> +		if (inv_info.addr == 0 && inv_info.npages == -1)
>> +			intel_flush_iotlb_all(domain);
> 
> -1 is clearer written as U64_MAX - same remark for the comment
> documenting it.

sure.