diff mbox series

[v2,17/19] iommu/arm-smmu-v3: Add arm_smmu_viommu_cache_invalidate

Message ID 4b61aba3bc6c1cce628d9db44d5b18ea567a8be1.1724776335.git.nicolinc@nvidia.com
State New
Headers show
Series iommufd: Add VIOMMU infrastructure (Part-1) | expand

Commit Message

Nicolin Chen Aug. 27, 2024, 4:59 p.m. UTC
Add an arm_smmu_viommu_cache_invalidate() function for user space to issue
cache invalidation commands via viommu.

The viommu invalidation takes the same native format of a 128-bit command,
as the hwpt invalidation. Thus, reuse the same driver data structure, but
make it wider to accept CMDQ_OP_ATC_INV and CMDQ_OP_CFGI_CD{_ALL}.

Scan the commands against the supported ist and fix the VMIDs and SIDs.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 76 ++++++++++++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 +
 include/uapi/linux/iommufd.h                |  7 +-
 3 files changed, 80 insertions(+), 4 deletions(-)

Comments

Tian, Kevin Sept. 11, 2024, 6:25 a.m. UTC | #1
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, September 6, 2024 2:22 AM
> 
> On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
> > On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
> > >
> > > > +static int arm_smmu_viommu_cache_invalidate(struct
> iommufd_viommu *viommu,
> > > > +					    struct iommu_user_data_array
> *array)
> > > > +{
> > > > +	struct iommu_domain *domain =
> iommufd_viommu_to_parent_domain(viommu);
> > > > +
> > > > +	return __arm_smmu_cache_invalidate_user(
> > > > +			to_smmu_domain(domain), viommu, array);
> > >
> > > I'd like to have the viommu struct directly hold the VMID. The nested
> > > parent should be sharable between multiple viommus, it doesn't make
> > > any sense that it would hold the vmid.
> > >
> > > This is struggling because it is trying too hard to not have the
> > > driver allocate the viommu, and I think we should just go ahead and do
> > > that. Store the vmid, today copied from the nesting parent in the vmid
> > > private struct. No need for iommufd_viommu_to_parent_domain(), just
> > > rework the APIs to pass the vmid down not a domain.
> >
> > OK. When I designed all this stuff, we still haven't made mind
> > about sharing the s2 domain, i.e. moving the VMID, which might
> > need a couple of more patches to achieve.
> 
> Yes, many more patches, and don't try to do it now.. But we can copy
> the vmid from the s2 and place it in the viommu struct during
> allocation time.
> 

does it assume that a viommu object cannot span multiple physical
IOMMUs so there is only one vmid per viommu?
Nicolin Chen Sept. 11, 2024, 7:20 a.m. UTC | #2
On Wed, Sep 11, 2024 at 06:25:16AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, September 6, 2024 2:22 AM
> >
> > On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
> > > On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
> > > > On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
> > > >
> > > > > +static int arm_smmu_viommu_cache_invalidate(struct
> > iommufd_viommu *viommu,
> > > > > +                                           struct iommu_user_data_array
> > *array)
> > > > > +{
> > > > > +       struct iommu_domain *domain =
> > iommufd_viommu_to_parent_domain(viommu);
> > > > > +
> > > > > +       return __arm_smmu_cache_invalidate_user(
> > > > > +                       to_smmu_domain(domain), viommu, array);
> > > >
> > > > I'd like to have the viommu struct directly hold the VMID. The nested
> > > > parent should be sharable between multiple viommus, it doesn't make
> > > > any sense that it would hold the vmid.
> > > >
> > > > This is struggling because it is trying too hard to not have the
> > > > driver allocate the viommu, and I think we should just go ahead and do
> > > > that. Store the vmid, today copied from the nesting parent in the vmid
> > > > private struct. No need for iommufd_viommu_to_parent_domain(), just
> > > > rework the APIs to pass the vmid down not a domain.
> > >
> > > OK. When I designed all this stuff, we still haven't made mind
> > > about sharing the s2 domain, i.e. moving the VMID, which might
> > > need a couple of more patches to achieve.
> >
> > Yes, many more patches, and don't try to do it now.. But we can copy
> > the vmid from the s2 and place it in the viommu struct during
> > allocation time.
> >
> 
> does it assume that a viommu object cannot span multiple physical
> IOMMUs so there is only one vmid per viommu?

I think so. One the reasons of introducing vIOMMU is to maintain
the shareability across physical IOMMUs at the s2 HWPT_PAGING.

Thanks
Nicolin
Baolu Lu Sept. 11, 2024, 7:50 a.m. UTC | #3
On 2024/9/11 15:20, Nicolin Chen wrote:
> On Wed, Sep 11, 2024 at 06:25:16AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe<jgg@nvidia.com>
>>> Sent: Friday, September 6, 2024 2:22 AM
>>>
>>> On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
>>>> On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
>>>>> On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
>>>>>
>>>>>> +static int arm_smmu_viommu_cache_invalidate(struct
>>> iommufd_viommu *viommu,
>>>>>> +                                           struct iommu_user_data_array
>>> *array)
>>>>>> +{
>>>>>> +       struct iommu_domain *domain =
>>> iommufd_viommu_to_parent_domain(viommu);
>>>>>> +
>>>>>> +       return __arm_smmu_cache_invalidate_user(
>>>>>> +                       to_smmu_domain(domain), viommu, array);
>>>>> I'd like to have the viommu struct directly hold the VMID. The nested
>>>>> parent should be sharable between multiple viommus, it doesn't make
>>>>> any sense that it would hold the vmid.
>>>>>
>>>>> This is struggling because it is trying too hard to not have the
>>>>> driver allocate the viommu, and I think we should just go ahead and do
>>>>> that. Store the vmid, today copied from the nesting parent in the vmid
>>>>> private struct. No need for iommufd_viommu_to_parent_domain(), just
>>>>> rework the APIs to pass the vmid down not a domain.
>>>> OK. When I designed all this stuff, we still haven't made mind
>>>> about sharing the s2 domain, i.e. moving the VMID, which might
>>>> need a couple of more patches to achieve.
>>> Yes, many more patches, and don't try to do it now.. But we can copy
>>> the vmid from the s2 and place it in the viommu struct during
>>> allocation time.
>>>
>> does it assume that a viommu object cannot span multiple physical
>> IOMMUs so there is only one vmid per viommu?
> I think so. One the reasons of introducing vIOMMU is to maintain
> the shareability across physical IOMMUs at the s2 HWPT_PAGING.

My understanding of VMID is something like domain id in x86 arch's. Is
my understanding correct?

If a VMID for an S2 hwpt is valid on physical IOMMU A but has already
been allocated for another purpose on physical IOMMU B, how can it be
shared across both IOMMUs? Or the VMID is allocated globally?

Thanks,
baolu
Tian, Kevin Sept. 11, 2024, 8:13 a.m. UTC | #4
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Wednesday, September 11, 2024 3:21 PM
> 
> On Wed, Sep 11, 2024 at 06:25:16AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Friday, September 6, 2024 2:22 AM
> > >
> > > On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
> > > > On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
> > > > > On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
> > > > >
> > > > > > +static int arm_smmu_viommu_cache_invalidate(struct
> > > iommufd_viommu *viommu,
> > > > > > +                                           struct iommu_user_data_array
> > > *array)
> > > > > > +{
> > > > > > +       struct iommu_domain *domain =
> > > iommufd_viommu_to_parent_domain(viommu);
> > > > > > +
> > > > > > +       return __arm_smmu_cache_invalidate_user(
> > > > > > +                       to_smmu_domain(domain), viommu, array);
> > > > >
> > > > > I'd like to have the viommu struct directly hold the VMID. The nested
> > > > > parent should be sharable between multiple viommus, it doesn't
> make
> > > > > any sense that it would hold the vmid.
> > > > >
> > > > > This is struggling because it is trying too hard to not have the
> > > > > driver allocate the viommu, and I think we should just go ahead and
> do
> > > > > that. Store the vmid, today copied from the nesting parent in the vmid
> > > > > private struct. No need for iommufd_viommu_to_parent_domain(),
> just
> > > > > rework the APIs to pass the vmid down not a domain.
> > > >
> > > > OK. When I designed all this stuff, we still haven't made mind
> > > > about sharing the s2 domain, i.e. moving the VMID, which might
> > > > need a couple of more patches to achieve.
> > >
> > > Yes, many more patches, and don't try to do it now.. But we can copy
> > > the vmid from the s2 and place it in the viommu struct during
> > > allocation time.
> > >
> >
> > does it assume that a viommu object cannot span multiple physical
> > IOMMUs so there is only one vmid per viommu?
> 
> I think so. One the reasons of introducing vIOMMU is to maintain
> the shareability across physical IOMMUs at the s2 HWPT_PAGING.
> 

I don't quite get it. e.g. for intel-iommu the S2 domain itself can
be shared across physical IOMMUs then what is the problem 
preventing a vIOMMU object using that S2 to span multiple IOMMUs?

Probably there is a good reason e.g. for simplification or better
aligned with hw accel stuff. But it's not explained clearly so far.
Tian, Kevin Sept. 11, 2024, 8:17 a.m. UTC | #5
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Wednesday, September 11, 2024 3:51 PM
> 
> On 2024/9/11 15:20, Nicolin Chen wrote:
> > On Wed, Sep 11, 2024 at 06:25:16AM +0000, Tian, Kevin wrote:
> >>> From: Jason Gunthorpe<jgg@nvidia.com>
> >>> Sent: Friday, September 6, 2024 2:22 AM
> >>>
> >>> On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
> >>>> On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
> >>>>> On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
> >>>>>
> >>>>>> +static int arm_smmu_viommu_cache_invalidate(struct
> >>> iommufd_viommu *viommu,
> >>>>>> +                                           struct iommu_user_data_array
> >>> *array)
> >>>>>> +{
> >>>>>> +       struct iommu_domain *domain =
> >>> iommufd_viommu_to_parent_domain(viommu);
> >>>>>> +
> >>>>>> +       return __arm_smmu_cache_invalidate_user(
> >>>>>> +                       to_smmu_domain(domain), viommu, array);
> >>>>> I'd like to have the viommu struct directly hold the VMID. The nested
> >>>>> parent should be sharable between multiple viommus, it doesn't make
> >>>>> any sense that it would hold the vmid.
> >>>>>
> >>>>> This is struggling because it is trying too hard to not have the
> >>>>> driver allocate the viommu, and I think we should just go ahead and
> do
> >>>>> that. Store the vmid, today copied from the nesting parent in the vmid
> >>>>> private struct. No need for iommufd_viommu_to_parent_domain(),
> just
> >>>>> rework the APIs to pass the vmid down not a domain.
> >>>> OK. When I designed all this stuff, we still haven't made mind
> >>>> about sharing the s2 domain, i.e. moving the VMID, which might
> >>>> need a couple of more patches to achieve.
> >>> Yes, many more patches, and don't try to do it now.. But we can copy
> >>> the vmid from the s2 and place it in the viommu struct during
> >>> allocation time.
> >>>
> >> does it assume that a viommu object cannot span multiple physical
> >> IOMMUs so there is only one vmid per viommu?
> > I think so. One the reasons of introducing vIOMMU is to maintain
> > the shareability across physical IOMMUs at the s2 HWPT_PAGING.
> 
> My understanding of VMID is something like domain id in x86 arch's. Is
> my understanding correct?

yes

> 
> If a VMID for an S2 hwpt is valid on physical IOMMU A but has already
> been allocated for another purpose on physical IOMMU B, how can it be
> shared across both IOMMUs? Or the VMID is allocated globally?
> 

I'm not sure that's a problem. The point is that each vIOMMU object 
will get a VMID from the SMMU which it's associated to (assume
one vIOMMU cannot span multiple SMMU). Whether that VMID
is globally allocated or per-SMMU is the policy in the SMMU driver.

It's the driver's responsibility to ensure not using a conflicting VMID
when creating an vIOMMU instance.
Baolu Lu Sept. 11, 2024, 8:19 a.m. UTC | #6
On 2024/9/11 16:17, Tian, Kevin wrote:
>> If a VMID for an S2 hwpt is valid on physical IOMMU A but has already
>> been allocated for another purpose on physical IOMMU B, how can it be
>> shared across both IOMMUs? Or the VMID is allocated globally?
>>
> I'm not sure that's a problem. The point is that each vIOMMU object
> will get a VMID from the SMMU which it's associated to (assume
> one vIOMMU cannot span multiple SMMU). Whether that VMID
> is globally allocated or per-SMMU is the policy in the SMMU driver.
> 
> It's the driver's responsibility to ensure not using a conflicting VMID
> when creating an vIOMMU instance.

Make sense.

Thanks,
baolu
Nicolin Chen Sept. 11, 2024, 8:53 p.m. UTC | #7
On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:
> > > > Yes, many more patches, and don't try to do it now.. But we can copy
> > > > the vmid from the s2 and place it in the viommu struct during
> > > > allocation time.
> > > >
> > >
> > > does it assume that a viommu object cannot span multiple physical
> > > IOMMUs so there is only one vmid per viommu?
> >
> > I think so. One the reasons of introducing vIOMMU is to maintain
> > the shareability across physical IOMMUs at the s2 HWPT_PAGING.
> >
> 
> I don't quite get it. e.g. for intel-iommu the S2 domain itself can
> be shared across physical IOMMUs

SMMU does the same, but needs a VMID per pSMMU to tag that S2
domain:

vIOMMU0 (VMIDx of pSMMU0) -> shared S2
vIOMMU1 (VMIDy of pSMMU1) -> shared S2

Note: x and y might be different.

> then what is the problem
> preventing a vIOMMU object using that S2 to span multiple IOMMUs?

Jason previously suggested the way of implementing multi-vIOMMU
in a VMM to be one vIOMMU object representing a vIOMMU instance
(of a physical IOMMU) in the VM. So, it'd be only one VMID per
one vIOMMU object.

Sharing one vIOMMU object on the other hand needs the vIOMMU to
hold a list of VMIDs for all (or attached?) physical IOMMUs. This
would change what a vIOMMU object represents.

> Probably there is a good reason e.g. for simplification or better
> aligned with hw accel stuff. But it's not explained clearly so far.

I will try emphasizing that in the next version, likely in the
rst file that I am patching for HWPT_PAGING/NESTED at this point.

Thanks
Nicolin
Nicolin Chen Sept. 11, 2024, 9:08 p.m. UTC | #8
On Wed, Sep 11, 2024 at 08:17:23AM +0000, Tian, Kevin wrote:

> > My understanding of VMID is something like domain id in x86 arch's. Is
> > my understanding correct?
> 
> yes
> 
> >
> > If a VMID for an S2 hwpt is valid on physical IOMMU A but has already
> > been allocated for another purpose on physical IOMMU B, how can it be
> > shared across both IOMMUs? Or the VMID is allocated globally?
> >
> 
> I'm not sure that's a problem. The point is that each vIOMMU object
> will get a VMID from the SMMU which it's associated to (assume
> one vIOMMU cannot span multiple SMMU). Whether that VMID
> is globally allocated or per-SMMU is the policy in the SMMU driver.
> 
> It's the driver's responsibility to ensure not using a conflicting VMID
> when creating an vIOMMU instance.

It can happen to be the same VMID across all physical SMMUs, but
not necessary to be the same, i.e. two SMMUs might have two VMIDs
with different ID values, allocated from the their own VMID pools,
since cache entries in their own TLB can be tagged with their own
VMIDs.

Does domain id for intel-iommu have to be the same? I recall there
is only one iommu instance on intel chips at this moment?

Thanks
Nicolin
Jason Gunthorpe Sept. 11, 2024, 11 p.m. UTC | #9
On Wed, Sep 11, 2024 at 06:25:16AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, September 6, 2024 2:22 AM
> > 
> > On Thu, Sep 05, 2024 at 11:00:49AM -0700, Nicolin Chen wrote:
> > > On Thu, Sep 05, 2024 at 01:20:39PM -0300, Jason Gunthorpe wrote:
> > > > On Tue, Aug 27, 2024 at 09:59:54AM -0700, Nicolin Chen wrote:
> > > >
> > > > > +static int arm_smmu_viommu_cache_invalidate(struct
> > iommufd_viommu *viommu,
> > > > > +					    struct iommu_user_data_array
> > *array)
> > > > > +{
> > > > > +	struct iommu_domain *domain =
> > iommufd_viommu_to_parent_domain(viommu);
> > > > > +
> > > > > +	return __arm_smmu_cache_invalidate_user(
> > > > > +			to_smmu_domain(domain), viommu, array);
> > > >
> > > > I'd like to have the viommu struct directly hold the VMID. The nested
> > > > parent should be sharable between multiple viommus, it doesn't make
> > > > any sense that it would hold the vmid.
> > > >
> > > > This is struggling because it is trying too hard to not have the
> > > > driver allocate the viommu, and I think we should just go ahead and do
> > > > that. Store the vmid, today copied from the nesting parent in the vmid
> > > > private struct. No need for iommufd_viommu_to_parent_domain(), just
> > > > rework the APIs to pass the vmid down not a domain.
> > >
> > > OK. When I designed all this stuff, we still haven't made mind
> > > about sharing the s2 domain, i.e. moving the VMID, which might
> > > need a couple of more patches to achieve.
> > 
> > Yes, many more patches, and don't try to do it now.. But we can copy
> > the vmid from the s2 and place it in the viommu struct during
> > allocation time.
> 
> does it assume that a viommu object cannot span multiple physical
> IOMMUs so there is only one vmid per viommu?

Yes, the viommu is not intended to cross physical iommus, it is
intended to contain objects, like invalidation queues, that are tied
to single piommus only.

If someone does want to make vIOMMU that unifies multiple pIOMMUS, and
they have the feature set that would make that possible, then they
will need multiple IOMMUFD VFIOMMU objects and will have to divide up
the nested domains appropriately.

Jason
Jason Gunthorpe Sept. 11, 2024, 11:07 p.m. UTC | #10
On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:

> Probably there is a good reason e.g. for simplification or better
> aligned with hw accel stuff. But it's not explained clearly so far.

Probably the most concrete thing is if you have a direct assignment
invalidation queue (ie DMA'd directly by HW) then it only applies to a
single pIOMMU and invalidation commands placed there are unavoidably
limited in scope.

This creates a representation problem, if we have a vIOMMU that spans
many pIOMMUs but invalidations do some subset how to do we model
that. Just saying the vIOMMU is linked to the pIOMMU solves this
nicely.

Jason
Baolu Lu Sept. 12, 2024, 4:45 a.m. UTC | #11
On 9/12/24 5:08 AM, Nicolin Chen wrote:
> On Wed, Sep 11, 2024 at 08:17:23AM +0000, Tian, Kevin wrote:
> 
>>> My understanding of VMID is something like domain id in x86 arch's. Is
>>> my understanding correct?
>> yes
>>
>>> If a VMID for an S2 hwpt is valid on physical IOMMU A but has already
>>> been allocated for another purpose on physical IOMMU B, how can it be
>>> shared across both IOMMUs? Or the VMID is allocated globally?
>>>
>> I'm not sure that's a problem. The point is that each vIOMMU object
>> will get a VMID from the SMMU which it's associated to (assume
>> one vIOMMU cannot span multiple SMMU). Whether that VMID
>> is globally allocated or per-SMMU is the policy in the SMMU driver.
>>
>> It's the driver's responsibility to ensure not using a conflicting VMID
>> when creating an vIOMMU instance.
> It can happen to be the same VMID across all physical SMMUs, but
> not necessary to be the same, i.e. two SMMUs might have two VMIDs
> with different ID values, allocated from the their own VMID pools,
> since cache entries in their own TLB can be tagged with their own
> VMIDs.
> 
> Does domain id for intel-iommu have to be the same?

No. A paging domain may have different domain IDs on different IOMMUs
for Intel iommu driver.

> I recall there
> is only one iommu instance on intel chips at this moment?

No. There might be multiple iommu instances on a chip but they share a
common iommu driver ops.

Thanks,
baolu
Tian, Kevin Sept. 13, 2024, 2:33 a.m. UTC | #12
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, September 12, 2024 7:08 AM
> 
> On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:
> 
> > Probably there is a good reason e.g. for simplification or better
> > aligned with hw accel stuff. But it's not explained clearly so far.
> 
> Probably the most concrete thing is if you have a direct assignment
> invalidation queue (ie DMA'd directly by HW) then it only applies to a
> single pIOMMU and invalidation commands placed there are unavoidably
> limited in scope.
> 
> This creates a representation problem, if we have a vIOMMU that spans
> many pIOMMUs but invalidations do some subset how to do we model
> that. Just saying the vIOMMU is linked to the pIOMMU solves this
> nicely.
> 

yes that is a good reason.

btw do we expect the VMM to try-and-fail when deciding whether a
new vIOMMU object is required when creating a new vdev?
Jason Gunthorpe Sept. 14, 2024, 2:50 p.m. UTC | #13
On Fri, Sep 13, 2024 at 02:33:59AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Thursday, September 12, 2024 7:08 AM
> > 
> > On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:
> > 
> > > Probably there is a good reason e.g. for simplification or better
> > > aligned with hw accel stuff. But it's not explained clearly so far.
> > 
> > Probably the most concrete thing is if you have a direct assignment
> > invalidation queue (ie DMA'd directly by HW) then it only applies to a
> > single pIOMMU and invalidation commands placed there are unavoidably
> > limited in scope.
> > 
> > This creates a representation problem, if we have a vIOMMU that spans
> > many pIOMMUs but invalidations do some subset how to do we model
> > that. Just saying the vIOMMU is linked to the pIOMMU solves this
> > nicely.
> > 
> 
> yes that is a good reason.
> 
> btw do we expect the VMM to try-and-fail when deciding whether a
> new vIOMMU object is required when creating a new vdev?

I think there was some suggestion the getinfo could return this, but
also I think qemu needs to have a command line that matches physical
so maybe it needs some sysfs?

Jason
Tian, Kevin Sept. 18, 2024, 8:10 a.m. UTC | #14
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, September 14, 2024 10:51 PM
> 
> On Fri, Sep 13, 2024 at 02:33:59AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Thursday, September 12, 2024 7:08 AM
> > >
> > > On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:
> > >
> > > > Probably there is a good reason e.g. for simplification or better
> > > > aligned with hw accel stuff. But it's not explained clearly so far.
> > >
> > > Probably the most concrete thing is if you have a direct assignment
> > > invalidation queue (ie DMA'd directly by HW) then it only applies to a
> > > single pIOMMU and invalidation commands placed there are unavoidably
> > > limited in scope.
> > >
> > > This creates a representation problem, if we have a vIOMMU that spans
> > > many pIOMMUs but invalidations do some subset how to do we model
> > > that. Just saying the vIOMMU is linked to the pIOMMU solves this
> > > nicely.
> > >
> >
> > yes that is a good reason.
> >
> > btw do we expect the VMM to try-and-fail when deciding whether a
> > new vIOMMU object is required when creating a new vdev?
> 
> I think there was some suggestion the getinfo could return this, but
> also I think qemu needs to have a command line that matches physical
> so maybe it needs some sysfs?
> 

My impression was that Qemu is moving away from directly accessing
sysfs (e.g. as the reason behind allowing Libvirt to pass in an opened 
cdev fd to Qemu). So probably getinfo makes more sense...
Jason Gunthorpe Sept. 23, 2024, 6:34 p.m. UTC | #15
On Wed, Sep 18, 2024 at 08:10:52AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Saturday, September 14, 2024 10:51 PM
> > 
> > On Fri, Sep 13, 2024 at 02:33:59AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > Sent: Thursday, September 12, 2024 7:08 AM
> > > >
> > > > On Wed, Sep 11, 2024 at 08:13:01AM +0000, Tian, Kevin wrote:
> > > >
> > > > > Probably there is a good reason e.g. for simplification or better
> > > > > aligned with hw accel stuff. But it's not explained clearly so far.
> > > >
> > > > Probably the most concrete thing is if you have a direct assignment
> > > > invalidation queue (ie DMA'd directly by HW) then it only applies to a
> > > > single pIOMMU and invalidation commands placed there are unavoidably
> > > > limited in scope.
> > > >
> > > > This creates a representation problem, if we have a vIOMMU that spans
> > > > many pIOMMUs but invalidations do some subset how to do we model
> > > > that. Just saying the vIOMMU is linked to the pIOMMU solves this
> > > > nicely.
> > > >
> > >
> > > yes that is a good reason.
> > >
> > > btw do we expect the VMM to try-and-fail when deciding whether a
> > > new vIOMMU object is required when creating a new vdev?
> > 
> > I think there was some suggestion the getinfo could return this, but
> > also I think qemu needs to have a command line that matches physical
> > so maybe it needs some sysfs?
> > 
> 
> My impression was that Qemu is moving away from directly accessing
> sysfs (e.g. as the reason behind allowing Libvirt to pass in an opened 
> cdev fd to Qemu). So probably getinfo makes more sense...

Yes, but I think libvirt needs this information before it invokes
qemu..

The physical and virtual iommus need to sort of match, something
should figure this out automatically I would guess.

Jason
diff mbox series

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a2af693bc7b2..bddbb98da414 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3267,15 +3267,32 @@  static void arm_smmu_domain_nested_free(struct iommu_domain *domain)
 	kfree(container_of(domain, struct arm_smmu_nested_domain, domain));
 }
 
+static int arm_smmu_convert_viommu_vdev_id(struct iommufd_viommu *viommu,
+					   u32 vdev_id, u32 *sid)
+{
+	struct arm_smmu_master *master;
+	struct device *dev;
+
+	dev = iommufd_viommu_find_device(viommu, vdev_id);
+	if (!dev)
+		return -EIO;
+	master = dev_iommu_priv_get(dev);
+
+	if (sid)
+		*sid = master->streams[0].id;
+	return 0;
+}
+
 /*
  * Convert, in place, the raw invalidation command into an internal format that
  * can be passed to arm_smmu_cmdq_issue_cmdlist(). Internally commands are
  * stored in CPU endian.
  *
- * Enforce the VMID on the command.
+ * Enforce the VMID or the SID on the command.
  */
 static int
 arm_smmu_convert_user_cmd(struct arm_smmu_domain *s2_parent,
+			  struct iommufd_viommu *viommu,
 			  struct iommu_hwpt_arm_smmuv3_invalidate *cmd)
 {
 	u16 vmid = s2_parent->s2_cfg.vmid;
@@ -3297,13 +3314,46 @@  arm_smmu_convert_user_cmd(struct arm_smmu_domain *s2_parent,
 		cmd->cmd[0] &= ~CMDQ_TLBI_0_VMID;
 		cmd->cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vmid);
 		break;
+	case CMDQ_OP_ATC_INV:
+	case CMDQ_OP_CFGI_CD:
+	case CMDQ_OP_CFGI_CD_ALL:
+		if (viommu) {
+			u32 sid, vsid = FIELD_GET(CMDQ_CFGI_0_SID, cmd->cmd[0]);
+
+			if (arm_smmu_convert_viommu_vdev_id(viommu, vsid, &sid))
+				return -EIO;
+			cmd->cmd[0] &= ~CMDQ_CFGI_0_SID;
+			cmd->cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, sid);
+			break;
+		}
+		fallthrough;
 	default:
 		return -EIO;
 	}
 	return 0;
 }
 
+static inline bool
+arm_smmu_must_lock_vdev_id(struct iommu_hwpt_arm_smmuv3_invalidate *cmds,
+			   unsigned int num_cmds)
+{
+	int i;
+
+	for (i = 0; i < num_cmds; i++) {
+		switch (cmds[i].cmd[0] & CMDQ_0_OP) {
+		case CMDQ_OP_ATC_INV:
+		case CMDQ_OP_CFGI_CD:
+		case CMDQ_OP_CFGI_CD_ALL:
+			return true;
+		default:
+			continue;
+		}
+	}
+	return false;
+}
+
 static int __arm_smmu_cache_invalidate_user(struct arm_smmu_domain *s2_parent,
+					    struct iommufd_viommu *viommu,
 					    struct iommu_user_data_array *array)
 {
 	struct arm_smmu_device *smmu = s2_parent->smmu;
@@ -3313,6 +3363,7 @@  static int __arm_smmu_cache_invalidate_user(struct arm_smmu_domain *s2_parent,
 	struct iommu_hwpt_arm_smmuv3_invalidate *end;
 	struct arm_smmu_cmdq_ent ent;
 	struct arm_smmu_cmdq *cmdq;
+	bool must_lock = false;
 	int ret;
 
 	/* A zero-length array is allowed to validate the array type */
@@ -3335,12 +3386,17 @@  static int __arm_smmu_cache_invalidate_user(struct arm_smmu_domain *s2_parent,
 	if (ret)
 		goto out;
 
+	if (viommu)
+		must_lock = arm_smmu_must_lock_vdev_id(cmds, array->entry_num);
+	if (must_lock)
+		iommufd_viommu_lock_vdev_id(viommu);
+
 	ent.opcode = cmds->cmd[0] & CMDQ_0_OP;
 	cmdq = arm_smmu_get_cmdq(smmu, &ent);
 
 	last_batch = cmds;
 	while (cur != end) {
-		ret = arm_smmu_convert_user_cmd(s2_parent, cur);
+		ret = arm_smmu_convert_user_cmd(s2_parent, viommu, cur);
 		if (ret)
 			goto out;
 
@@ -3358,6 +3414,8 @@  static int __arm_smmu_cache_invalidate_user(struct arm_smmu_domain *s2_parent,
 		last_batch = cur;
 	}
 out:
+	if (must_lock)
+		iommufd_viommu_unlock_vdev_id(viommu);
 	array->entry_num = cur - cmds;
 	kfree(cmds);
 	return ret;
@@ -3370,7 +3428,7 @@  static int arm_smmu_cache_invalidate_user(struct iommu_domain *domain,
 		container_of(domain, struct arm_smmu_nested_domain, domain);
 
 	return __arm_smmu_cache_invalidate_user(
-			nested_domain->s2_parent, array);
+			nested_domain->s2_parent, NULL, array);
 }
 
 static const struct iommu_domain_ops arm_smmu_nested_ops = {
@@ -3863,6 +3921,15 @@  static int arm_smmu_def_domain_type(struct device *dev)
 	return 0;
 }
 
+static int arm_smmu_viommu_cache_invalidate(struct iommufd_viommu *viommu,
+					    struct iommu_user_data_array *array)
+{
+	struct iommu_domain *domain = iommufd_viommu_to_parent_domain(viommu);
+
+	return __arm_smmu_cache_invalidate_user(
+			to_smmu_domain(domain), viommu, array);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.identity_domain	= &arm_smmu_identity_domain,
 	.blocked_domain		= &arm_smmu_blocked_domain,
@@ -3893,6 +3960,9 @@  static struct iommu_ops arm_smmu_ops = {
 		.iotlb_sync		= arm_smmu_iotlb_sync,
 		.iova_to_phys		= arm_smmu_iova_to_phys,
 		.free			= arm_smmu_domain_free_paging,
+		.default_viommu_ops = &(const struct iommufd_viommu_ops) {
+			.cache_invalidate = arm_smmu_viommu_cache_invalidate,
+		}
 	}
 };
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 6c8ae70c90fe..e7f6e9194a9e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -10,6 +10,7 @@ 
 
 #include <linux/bitfield.h>
 #include <linux/iommu.h>
+#include <linux/iommufd.h>
 #include <linux/kernel.h>
 #include <linux/mmzone.h>
 #include <linux/sizes.h>
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index f3aefb11f681..0d973486b604 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -734,13 +734,18 @@  struct iommu_hwpt_vtd_s1_invalidate {
  * @cmd: 128-bit cache invalidation command that runs in SMMU CMDQ.
  *       Must be little-endian.
  *
- * Supported command list:
+ * Supported command list when passing in a HWPT via @hwpt_id:
  *     CMDQ_OP_TLBI_NSNH_ALL
  *     CMDQ_OP_TLBI_NH_VA
  *     CMDQ_OP_TLBI_NH_VAA
  *     CMDQ_OP_TLBI_NH_ALL
  *     CMDQ_OP_TLBI_NH_ASID
  *
+ * Additional to the list above, when passing in a VIOMMU via @hwpt_id:
+ *     CMDQ_OP_ATC_INV
+ *     CMDQ_OP_CFGI_CD
+ *     CMDQ_OP_CFGI_CD_ALL
+ *
  * -EIO will be returned if the command is not supported.
  */
 struct iommu_hwpt_arm_smmuv3_invalidate {