[v3,0/5] Simplify vfio_iommu_type1 attach/detach routine

Message ID 20220623200029.26007-1-nicolinc@nvidia.com

Message

Nicolin Chen June 23, 2022, 8 p.m. UTC
This is a preparatory series for the IOMMUFD v2 patches. It enforces the
error code -EMEDIUMTYPE in iommu_attach_device() and iommu_attach_group()
when an IOMMU domain and a device/group are incompatible. It also drops
the useless domain->ops check, since it cannot fail in the current
environment.

These allow VFIO iommu code to simplify its group attachment routine, by
avoiding the extra IOMMU domain allocations and attach/detach sequences
of the old code.
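
As a rough illustration of the simplified routine, here is a user-space
sketch (not the code in the patches). It assumes an attach call that
returns -EMEDIUMTYPE only for an incompatible domain/group pair. All
toy_* names and the hw_id field are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EMEDIUMTYPE
#define EMEDIUMTYPE 124 /* Linux value; fallback for hosts lacking it */
#endif

/* Hypothetical stand-ins, not the kernel's structures. */
struct toy_domain { int hw_id; int nr_groups; };
struct toy_group  { int hw_id; };

/* Model of an attach op: an incompatible pair fails with -EMEDIUMTYPE. */
static int toy_attach_group(struct toy_domain *d, struct toy_group *g)
{
	if (d->hw_id != g->hw_id)
		return -EMEDIUMTYPE;
	d->nr_groups++;
	return 0;
}

/*
 * Try to reuse an existing domain. Returns 0 and sets *out on success,
 * -EMEDIUMTYPE if no existing domain is compatible (the caller would then
 * allocate a new domain), or any other error unchanged.
 */
static int toy_attach_any(struct toy_domain *doms, size_t n,
			  struct toy_group *g, struct toy_domain **out)
{
	for (size_t i = 0; i < n; i++) {
		int ret = toy_attach_group(&doms[i], g);

		if (!ret) {
			*out = &doms[i];
			return 0;
		}
		/* Only -EMEDIUMTYPE means "try the next domain". */
		if (ret != -EMEDIUMTYPE)
			return ret;
	}
	return -EMEDIUMTYPE;
}
```

The point of the distinct error code is that the loop can tell "this
domain does not fit, keep looking" apart from a hard failure, without
first attaching the group to a throwaway domain.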

It is worth mentioning that the exact match for enforce_cache_coherency
is removed with this series, since there is very little value in doing
that: KVM won't be able to take advantage of it, so it just wastes domain
memory. Instead, we rely on the Intel IOMMU driver taking care of that
internally.

This is on github:
https://github.com/nicolinc/iommufd/commits/vfio_iommu_attach

Changelog
v3:
 * Dropped all dev_err since -EMEDIUMTYPE clearly indicates the error.
 * Updated commit message of enforce_cache_coherency removing patch.
 * Updated commit message of domain->ops removing patch.
 * Replaced "goto out_unlock" with simply mutex_unlock() and return.
 * Added a line of comments for -EMEDIUMTYPE return check.
 * Moved iommu_get_msi_cookie() into alloc_attach_domain() as a cookie
   should be logically tied to the lifetime of a domain itself.
 * Added Kevin's "Reviewed-by".
v2: https://lore.kernel.org/kvm/20220616000304.23890-1-nicolinc@nvidia.com/
 * Added -EMEDIUMTYPE to more IOMMU drivers that fit the category.
 * Changed dev_err to dev_dbg for -EMEDIUMTYPE to avoid kernel log spam.
 * Dropped iommu_ops patch, and removed domain->ops in VFIO directly,
   since there's no mixed-driver use case that would fail the sanity.
 * Updated commit log of the patch removing enforce_cache_coherency.
 * Fixed a misplaced "num_non_pinned_groups--" in detach_group patch.
 * Moved "num_non_pinned_groups++" in PATCH-5 to the common path between
   domain-reusing and new-domain pathways, like the code previously did.
 * Fixed a typo in EMEDIUMTYPE patch.
v1: https://lore.kernel.org/kvm/20220606061927.26049-1-nicolinc@nvidia.com/

Jason Gunthorpe (1):
  vfio/iommu_type1: Prefer to reuse domains vs match enforced cache
    coherency

Nicolin Chen (4):
  iommu: Return -EMEDIUMTYPE for incompatible domain and device/group
  vfio/iommu_type1: Remove the domain->ops comparison
  vfio/iommu_type1: Clean up update_dirty_scope in detach_group()
  vfio/iommu_type1: Simplify group attachment

 drivers/iommu/amd/iommu.c                   |   2 +-
 drivers/iommu/apple-dart.c                  |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  15 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   6 +-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c     |   9 +-
 drivers/iommu/intel/iommu.c                 |  10 +-
 drivers/iommu/iommu.c                       |  28 ++
 drivers/iommu/ipmmu-vmsa.c                  |   4 +-
 drivers/iommu/mtk_iommu_v1.c                |   2 +-
 drivers/iommu/omap-iommu.c                  |   3 +-
 drivers/iommu/s390-iommu.c                  |   2 +-
 drivers/iommu/sprd-iommu.c                  |   6 +-
 drivers/iommu/tegra-gart.c                  |   2 +-
 drivers/iommu/virtio-iommu.c                |   3 +-
 drivers/vfio/vfio_iommu_type1.c             | 340 ++++++++++----------
 15 files changed, 225 insertions(+), 211 deletions(-)

Comments

Baolu Lu June 24, 2022, 1:35 a.m. UTC | #1
On 2022/6/24 04:00, Nicolin Chen wrote:
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index e1cb51b9866c..5386d889429d 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -304,7 +304,7 @@ static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device
>   	/* Only allow the domain created internally. */
>   	mtk_mapping = data->mapping;
>   	if (mtk_mapping->domain != domain)
> -		return 0;
> +		return -EMEDIUMTYPE;
>   
>   	if (!data->m4u_dom) {
>   		data->m4u_dom = dom;

This change looks odd. It turns the return value from success to
failure. Is it a bug? If so, it should go through a separate fix patch.

Best regards,
baolu
Baolu Lu June 24, 2022, 1:50 a.m. UTC | #2
On 2022/6/24 04:00, Nicolin Chen wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> The KVM mechanism for controlling wbinvd is based on OR of the coherency
> property of all devices attached to a guest, no matter whether those
> devices are attached to a single domain or multiple domains.
> 
> On the other hand, the benefit to using separate domains was that those
> devices attached to domains supporting enforced cache coherency always
> mapped with the attributes necessary to provide that feature, therefore
> if a non-enforced domain was dropped, the associated group removal would
> re-trigger an evaluation by KVM.
> 
> In practice however, the only known cases of such mixed domains included
> an Intel IGD device behind an IOMMU lacking snoop control, where such
> devices do not support hotplug, therefore this scenario lacks testing and
> is not considered sufficiently relevant to support.
> 
> After all, KVM won't take advantage of trying to push a device that could
> do enforced cache coherency to a dedicated domain vs re-using an existing
> domain, which is non-coherent.
> 
> Simplify this code and eliminate the test. This removes the only logic
> that needed to have a dummy domain attached prior to searching for a
> matching domain and simplifies the next patches.
> 
> It's unclear whether we want to further optimize the Intel driver to
> update the domain coherency after a device is detached from it, at
> least not before KVM can be verified to handle such dynamics in related
> emulation paths (wbinvd, vcpu load, write_cr0, ept, etc.). In reality
> we don't see an usage requiring such optimization as the only device
> which imposes such non-coherency is Intel GPU which even doesn't
> support hotplug/hot remove.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>   drivers/vfio/vfio_iommu_type1.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c13b9290e357..f4e3b423a453 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -2285,9 +2285,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>   	 * testing if they're on the same bus_type.
>   	 */
>   	list_for_each_entry(d, &iommu->domain_list, next) {
> -		if (d->domain->ops == domain->domain->ops &&
> -		    d->enforce_cache_coherency ==
> -			    domain->enforce_cache_coherency) {
> +		if (d->domain->ops == domain->domain->ops) {
>   			iommu_detach_group(domain->domain, group->iommu_group);
>   			if (!iommu_attach_group(d->domain,
>   						group->iommu_group)) {

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>

Best regards,
baolu
Nicolin Chen June 24, 2022, 2:44 a.m. UTC | #3
On Fri, Jun 24, 2022 at 09:35:49AM +0800, Baolu Lu wrote:
> On 2022/6/24 04:00, Nicolin Chen wrote:
> > diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> > index e1cb51b9866c..5386d889429d 100644
> > --- a/drivers/iommu/mtk_iommu_v1.c
> > +++ b/drivers/iommu/mtk_iommu_v1.c
> > @@ -304,7 +304,7 @@ static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device
> >       /* Only allow the domain created internally. */
> >       mtk_mapping = data->mapping;
> >       if (mtk_mapping->domain != domain)
> > -             return 0;
> > +             return -EMEDIUMTYPE;
> > 
> >       if (!data->m4u_dom) {
> >               data->m4u_dom = dom;
> 
> This change looks odd. It turns the return value from success to
> failure. Is it a bug? If so, it should go through a separated fix patch.

Makes sense.

I read the commit log of the original change:
https://lore.kernel.org/r/1589530123-30240-1-git-send-email-yong.wu@mediatek.com

It doesn't seem to allow devices to get attached to domains
other than the shared mapping->domain, created in
mtk_iommu_probe_device(). So it looks like returning 0
is intentional. Though I am still very confused by this return
value here, I doubt it has ever been used in a VFIO context.

Yong, would you please give us some input?

Overall, I feel it's better to play it safe here by dropping
this part. If we later confirm there is a need to fix it, we
will do that in a separate patch anyway.

Thanks
Nic
Yong Wu (吴勇) June 24, 2022, 5:38 a.m. UTC | #4
On Thu, 2022-06-23 at 19:44 -0700, Nicolin Chen wrote:
> On Fri, Jun 24, 2022 at 09:35:49AM +0800, Baolu Lu wrote:
> > On 2022/6/24 04:00, Nicolin Chen wrote:
> > > diff --git a/drivers/iommu/mtk_iommu_v1.c
> > > b/drivers/iommu/mtk_iommu_v1.c
> > > index e1cb51b9866c..5386d889429d 100644
> > > --- a/drivers/iommu/mtk_iommu_v1.c
> > > +++ b/drivers/iommu/mtk_iommu_v1.c
> > > @@ -304,7 +304,7 @@ static int mtk_iommu_v1_attach_device(struct
> > > iommu_domain *domain, struct device
> > >       /* Only allow the domain created internally. */
> > >       mtk_mapping = data->mapping;
> > >       if (mtk_mapping->domain != domain)
> > > -             return 0;
> > > +             return -EMEDIUMTYPE;
> > > 
> > >       if (!data->m4u_dom) {
> > >               data->m4u_dom = dom;
> > 
> > This change looks odd. It turns the return value from success to
> > failure. Is it a bug? If so, it should go through a separated fix
> > patch.

Thanks for the review:)

> 
> Makes sense.
> 
> I read the commit log of the original change:
> 
> https://lore.kernel.org/r/1589530123-30240-1-git-send-email-yong.wu@mediatek.com
> 
> It doesn't seem to allow devices to get attached to different
> domains other than the shared mapping->domain, created in the
> in the mtk_iommu_probe_device(). So it looks like returning 0
> is intentional. Though I am still very confused by this return
> value here, I doubt it has ever been used in a VFIO context.

It's not used in a VFIO context. "return 0" just satisfies the iommu
framework so that it goes ahead. And yes, here we only allow the shared
"mapping->domain" (all the devices share a domain created internally).

Thus I think we should still keep "return 0" here.

Thanks:)

> 
> Young, would you please give us some input?
> 
> Overall, I feel it's better to play it safe here by dropping
> this part. If we later confirm there is a need to fix it, we
> will do that in a separate patch anyway.
> 
> Thanks
> Nic
Nicolin Chen June 24, 2022, 5:41 a.m. UTC | #5
On Fri, Jun 24, 2022 at 01:38:58PM +0800, Yong Wu wrote:

> > > > diff --git a/drivers/iommu/mtk_iommu_v1.c
> > > > b/drivers/iommu/mtk_iommu_v1.c
> > > > index e1cb51b9866c..5386d889429d 100644
> > > > --- a/drivers/iommu/mtk_iommu_v1.c
> > > > +++ b/drivers/iommu/mtk_iommu_v1.c
> > > > @@ -304,7 +304,7 @@ static int mtk_iommu_v1_attach_device(struct
> > > > iommu_domain *domain, struct device
> > > >       /* Only allow the domain created internally. */
> > > >       mtk_mapping = data->mapping;
> > > >       if (mtk_mapping->domain != domain)
> > > > -             return 0;
> > > > +             return -EMEDIUMTYPE;
> > > >
> > > >       if (!data->m4u_dom) {
> > > >               data->m4u_dom = dom;
> > >
> > > This change looks odd. It turns the return value from success to
> > > failure. Is it a bug? If so, it should go through a separated fix
> > > patch.
> 
> Thanks for the review:)
> 
> >
> > Makes sense.
> >
> > I read the commit log of the original change:
> >
> https://lore.kernel.org/r/1589530123-30240-1-git-send-email-yong.wu@mediatek.com
> >
> > It doesn't seem to allow devices to get attached to different
> > domains other than the shared mapping->domain, created in the
> > in the mtk_iommu_probe_device(). So it looks like returning 0
> > is intentional. Though I am still very confused by this return
> > value here, I doubt it has ever been used in a VFIO context.
> 
> It's not used in VFIO context. "return 0" just satisfy the iommu
> framework to go ahead. and yes, here we only allow the shared "mapping-
> >domain" (All the devices share a domain created internally).
> 
> thus I think we should still keep "return 0" here.

Thanks for the reply. I will just drop the change of this file.
Tian, Kevin June 24, 2022, 6:16 a.m. UTC | #6
> From: Yong Wu
> Sent: Friday, June 24, 2022 1:39 PM
> 
> On Thu, 2022-06-23 at 19:44 -0700, Nicolin Chen wrote:
> > On Fri, Jun 24, 2022 at 09:35:49AM +0800, Baolu Lu wrote:
> > > On 2022/6/24 04:00, Nicolin Chen wrote:
> > > > diff --git a/drivers/iommu/mtk_iommu_v1.c
> > > > b/drivers/iommu/mtk_iommu_v1.c
> > > > index e1cb51b9866c..5386d889429d 100644
> > > > --- a/drivers/iommu/mtk_iommu_v1.c
> > > > +++ b/drivers/iommu/mtk_iommu_v1.c
> > > > @@ -304,7 +304,7 @@ static int mtk_iommu_v1_attach_device(struct
> > > > iommu_domain *domain, struct device
> > > >       /* Only allow the domain created internally. */
> > > >       mtk_mapping = data->mapping;
> > > >       if (mtk_mapping->domain != domain)
> > > > -             return 0;
> > > > +             return -EMEDIUMTYPE;
> > > >
> > > >       if (!data->m4u_dom) {
> > > >               data->m4u_dom = dom;
> > >
> > > This change looks odd. It turns the return value from success to
> > > failure. Is it a bug? If so, it should go through a separated fix
> > > patch.
> 
> Thanks for the review:)
> 
> >
> > Makes sense.
> >
> > I read the commit log of the original change:
> >
> https://lore.kernel.org/r/1589530123-30240-1-git-send-email-yong.wu@mediatek.com
> >
> > It doesn't seem to allow devices to get attached to different
> > domains other than the shared mapping->domain, created in the
> > in the mtk_iommu_probe_device(). So it looks like returning 0
> > is intentional. Though I am still very confused by this return
> > value here, I doubt it has ever been used in a VFIO context.
> 
> It's not used in VFIO context. "return 0" just satisfy the iommu
> framework to go ahead. and yes, here we only allow the shared "mapping-
> >domain" (All the devices share a domain created internally).
> 
> thus I think we should still keep "return 0" here.
> 

What prevents this driver from being used in a VFIO context?

And why would we want to go ahead when an obvious error occurs,
i.e. when a device is attached to an unexpected domain?
Yong Wu (吴勇) June 24, 2022, 10:35 a.m. UTC | #7
On Fri, 2022-06-24 at 06:16 +0000, Tian, Kevin wrote:
> > From: Yong Wu
> > Sent: Friday, June 24, 2022 1:39 PM
> > 
> > On Thu, 2022-06-23 at 19:44 -0700, Nicolin Chen wrote:
> > > On Fri, Jun 24, 2022 at 09:35:49AM +0800, Baolu Lu wrote:
> > > > On 2022/6/24 04:00, Nicolin Chen wrote:
> > > > > diff --git a/drivers/iommu/mtk_iommu_v1.c
> > > > > b/drivers/iommu/mtk_iommu_v1.c
> > > > > index e1cb51b9866c..5386d889429d 100644
> > > > > --- a/drivers/iommu/mtk_iommu_v1.c
> > > > > +++ b/drivers/iommu/mtk_iommu_v1.c
> > > > > @@ -304,7 +304,7 @@ static int
> > > > > mtk_iommu_v1_attach_device(struct
> > > > > iommu_domain *domain, struct device
> > > > >       /* Only allow the domain created internally. */
> > > > >       mtk_mapping = data->mapping;
> > > > >       if (mtk_mapping->domain != domain)
> > > > > -             return 0;
> > > > > +             return -EMEDIUMTYPE;
> > > > > 
> > > > >       if (!data->m4u_dom) {
> > > > >               data->m4u_dom = dom;
> > > > 
> > > > This change looks odd. It turns the return value from success
> > > > to
> > > > failure. Is it a bug? If so, it should go through a separated
> > > > fix
> > > > patch.
> > 
> > Thanks for the review:)
> > 
> > > 
> > > Makes sense.
> > > 
> > > I read the commit log of the original change:
> > > 
> > 
> > https://lore.kernel.org/r/1589530123-30240-1-git-send-email-yong.wu@mediatek.com
> > > 
> > > It doesn't seem to allow devices to get attached to different
> > > domains other than the shared mapping->domain, created in the
> > > in the mtk_iommu_probe_device(). So it looks like returning 0
> > > is intentional. Though I am still very confused by this return
> > > value here, I doubt it has ever been used in a VFIO context.
> > 
> > It's not used in VFIO context. "return 0" just satisfy the iommu
> > framework to go ahead. and yes, here we only allow the shared
> > "mapping-
> > > domain" (All the devices share a domain created internally).
> > 
> > thus I think we should still keep "return 0" here.
> > 
> 
> What prevent this driver from being used in VFIO context?

Nothing prevents this. I just didn't test it. mtk_iommu_v1.c is only used
in mt2701 and there is no VFIO scenario. I'm not sure if it supports
VFIO. (mtk_iommu.c supports VFIO.)

> and why would we want to go ahead when an obvious error occurs
> i.e. when a device is attached to an unexpected domain?

The iommu flow in this file has always been a bit odd, as we need to share
an iommu domain on ARM32. As I tested before in the above link, "The iommu
framework will create a iommu domain for each a device.", therefore we
have to *work around* it in this file.

And this was expected to be fixed by:

https://lore.kernel.org/linux-iommu/cover.1597931875.git.robin.murphy@arm.com/

sorry, I don't know its current status.

Thanks.
Jason Gunthorpe June 24, 2022, 6:19 p.m. UTC | #8
On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:

> > > It's not used in VFIO context. "return 0" just satisfy the iommu
> > > framework to go ahead. and yes, here we only allow the shared
> > > "mapping-domain" (All the devices share a domain created
> > > internally).

What part of the iommu framework is trying to attach a domain and
wants to see success when the domain was not actually attached ?

> > What prevent this driver from being used in VFIO context?
> 
> Nothing prevent this. Just I didn't test.

This is why it is wrong to return success here.

Jason
Nicolin Chen June 29, 2022, 7:47 p.m. UTC | #9
On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
> 
> > > > It's not used in VFIO context. "return 0" just satisfy the iommu
> > > > framework to go ahead. and yes, here we only allow the shared
> > > > "mapping-domain" (All the devices share a domain created
> > > > internally).
> 
> What part of the iommu framework is trying to attach a domain and
> wants to see success when the domain was not actually attached ?
> 
> > > What prevent this driver from being used in VFIO context?
> > 
> > Nothing prevent this. Just I didn't test.
> 
> This is why it is wrong to return success here.

Hi Yong, would you or someone you know be able to confirm whether
this "return 0" is still a must or not?

Considering that it's an old 32-bit platform for MTK, if it would
take time to do so, I'd like to drop the change in MTK driver and
note in commit log for you or other MTK folks to change in future.

Thanks
Nic
Robin Murphy June 30, 2022, 8:21 a.m. UTC | #10
On 2022-06-29 20:47, Nicolin Chen wrote:
> On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
>> On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
>>
>>>>> It's not used in VFIO context. "return 0" just satisfy the iommu
>>>>> framework to go ahead. and yes, here we only allow the shared
>>>>> "mapping-domain" (All the devices share a domain created
>>>>> internally).
>>
>> What part of the iommu framework is trying to attach a domain and
>> wants to see success when the domain was not actually attached ?
>>
>>>> What prevent this driver from being used in VFIO context?
>>>
>>> Nothing prevent this. Just I didn't test.
>>
>> This is why it is wrong to return success here.
> 
> Hi Yong, would you or someone you know be able to confirm whether
> this "return 0" is still a must or not?

From memory, it is unfortunately required, due to this driver being in 
the rare position of having to support multiple devices in a single 
address space on 32-bit ARM. Since the old ARM DMA code doesn't 
understand groups, the driver sets up its own canonical 
dma_iommu_mapping to act like a default domain, but then has to politely 
say "yeah OK" to arm_setup_iommu_dma_ops() for each device so that they 
do all end up with the right DMA ops rather than dying in screaming 
failure (the ARM code's per-device mappings then get leaked, but we 
can't really do any better).
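
The behaviour described above can be modelled roughly as follows. This is
a user-space sketch only, not the driver code: the toy_* names are
hypothetical, and toy_arch_setup() is just a loose stand-in for the
per-device setup path that calls the attach op. The mismatch_ret
parameter exists only to compare "return 0" with "return -EMEDIUMTYPE".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EMEDIUMTYPE
#define EMEDIUMTYPE 124 /* Linux value; fallback for hosts lacking it */
#endif

/* Hypothetical stand-ins, not the kernel's structures. */
struct toy_domain { int id; };
struct toy_dev {
	struct toy_domain *attached;	/* domain really in use */
	bool has_iommu_dma_ops;		/* set by the arch setup path */
};

/* The one domain the driver creates internally and shares. */
static struct toy_domain toy_shared_domain = { 0 };

/*
 * Model of the driver's attach op. A domain other than the shared one is
 * not attached; mismatch_ret selects what the caller is told in that case.
 */
static int toy_v1_attach(struct toy_domain *domain, struct toy_dev *dev,
			 int mismatch_ret)
{
	if (domain != &toy_shared_domain)
		return mismatch_ret;
	dev->attached = domain;
	return 0;
}

/*
 * Model of the per-device arch setup: it brings its own mapping and only
 * installs the IOMMU DMA ops if the attach reports success.
 */
static void toy_arch_setup(struct toy_dev *dev, struct toy_domain *per_dev,
			   int mismatch_ret)
{
	if (toy_v1_attach(per_dev, dev, mismatch_ret))
		return;			/* device is left without DMA ops */
	dev->has_iommu_dma_ops = true;
}
```

With mismatch_ret set to 0, the device keeps the shared domain and still
gets its DMA ops; with -EMEDIUMTYPE, the setup path bails out early.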

The whole mess disappears in the proper default domain conversion, but 
in the meantime, it's still safe to assume that nobody's doing VFIO with 
embedded display/video codec/etc. blocks that don't even have reset drivers.

Thanks,
Robin.
Yong Wu (吴勇) June 30, 2022, 9:33 a.m. UTC | #11
On Wed, 2022-06-29 at 12:47 -0700, Nicolin Chen wrote:
> On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
> > On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
> > 
> > > > > It's not used in VFIO context. "return 0" just satisfy the
> > > > > iommu
> > > > > framework to go ahead. and yes, here we only allow the shared
> > > > > "mapping-domain" (All the devices share a domain created
> > > > > internally).
> > 
> > What part of the iommu framework is trying to attach a domain and
> > wants to see success when the domain was not actually attached ?
> > 
> > > > What prevent this driver from being used in VFIO context?
> > > 
> > > Nothing prevent this. Just I didn't test.
> > 
> > This is why it is wrong to return success here.
> 
> Hi Yong, would you or someone you know be able to confirm whether
> this "return 0" is still a must or not?
> 
> Considering that it's an old 32-bit platform for MTK, if it would
> take time to do so, I'd like to drop the change in MTK driver and
> note in commit log for you or other MTK folks to change in future.

Yes. Please help drop the change in this file.

Sorry, I don't have the board at hand right now, so I could not list the
backtrace where this is needed (it should be bus_iommu_probe, from the
previous debug...).

> 
> Thanks
> Nic
Tian, Kevin June 30, 2022, 9:57 a.m. UTC | #12
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Thursday, June 30, 2022 4:22 PM
> 
> On 2022-06-29 20:47, Nicolin Chen wrote:
> > On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
> >> On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
> >>
> >>>>> It's not used in VFIO context. "return 0" just satisfy the iommu
> >>>>> framework to go ahead. and yes, here we only allow the shared
> >>>>> "mapping-domain" (All the devices share a domain created
> >>>>> internally).
> >>
> >> What part of the iommu framework is trying to attach a domain and
> >> wants to see success when the domain was not actually attached ?
> >>
> >>>> What prevent this driver from being used in VFIO context?
> >>>
> >>> Nothing prevent this. Just I didn't test.
> >>
> >> This is why it is wrong to return success here.
> >
> > Hi Yong, would you or someone you know be able to confirm whether
> > this "return 0" is still a must or not?
> 
>  From memory, it is unfortunately required, due to this driver being in
> the rare position of having to support multiple devices in a single
> address space on 32-bit ARM. Since the old ARM DMA code doesn't
> understand groups, the driver sets up its own canonical
> dma_iommu_mapping to act like a default domain, but then has to politely
> say "yeah OK" to arm_setup_iommu_dma_ops() for each device so that they
> do all end up with the right DMA ops rather than dying in screaming
> failure (the ARM code's per-device mappings then get leaked, but we
> can't really do any better).
> 
> The whole mess disappears in the proper default domain conversion, but
> in the meantime, it's still safe to assume that nobody's doing VFIO with
> embedded display/video codec/etc. blocks that don't even have reset drivers.
> 

The above is probably worth a comment in the mtk code, so we don't
always need to dig it out from memory when a similar question arises in
the future. 😊
Nicolin Chen June 30, 2022, 3:45 p.m. UTC | #13
On Thu, Jun 30, 2022 at 05:33:16PM +0800, Yong Wu wrote:
> On Wed, 2022-06-29 at 12:47 -0700, Nicolin Chen wrote:
> > On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
> > >
> > > > > > It's not used in VFIO context. "return 0" just satisfy the
> > > > > > iommu
> > > > > > framework to go ahead. and yes, here we only allow the shared
> > > > > > "mapping-domain" (All the devices share a domain created
> > > > > > internally).
> > >
> > > What part of the iommu framework is trying to attach a domain and
> > > wants to see success when the domain was not actually attached ?
> > >
> > > > > What prevent this driver from being used in VFIO context?
> > > >
> > > > Nothing prevent this. Just I didn't test.
> > >
> > > This is why it is wrong to return success here.
> >
> > Hi Yong, would you or someone you know be able to confirm whether
> > this "return 0" is still a must or not?
> >
> > Considering that it's an old 32-bit platform for MTK, if it would
> > take time to do so, I'd like to drop the change in MTK driver and
> > note in commit log for you or other MTK folks to change in future.
> 
> Yes. Please help drop the change in this file.
> 
> Sorry I don't have the board at hand right now and I could not list the
> backtrace where this is needed(should be bus_iommu_probe from the
> previous debug...)

OK. Thanks for the reply.
Nicolin Chen June 30, 2022, 3:47 p.m. UTC | #14
On Thu, Jun 30, 2022 at 09:21:42AM +0100, Robin Murphy wrote:
> On 2022-06-29 20:47, Nicolin Chen wrote:
> > On Fri, Jun 24, 2022 at 03:19:43PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Jun 24, 2022 at 06:35:49PM +0800, Yong Wu wrote:
> > > 
> > > > > > It's not used in VFIO context. "return 0" just satisfy the iommu
> > > > > > framework to go ahead. and yes, here we only allow the shared
> > > > > > "mapping-domain" (All the devices share a domain created
> > > > > > internally).
> > > 
> > > What part of the iommu framework is trying to attach a domain and
> > > wants to see success when the domain was not actually attached ?
> > > 
> > > > > What prevent this driver from being used in VFIO context?
> > > > 
> > > > Nothing prevent this. Just I didn't test.
> > > 
> > > This is why it is wrong to return success here.
> > 
> > Hi Yong, would you or someone you know be able to confirm whether
> > this "return 0" is still a must or not?
> 
> From memory, it is unfortunately required, due to this driver being in
> the rare position of having to support multiple devices in a single
> address space on 32-bit ARM. Since the old ARM DMA code doesn't
> understand groups, the driver sets up its own canonical
> dma_iommu_mapping to act like a default domain, but then has to politely
> say "yeah OK" to arm_setup_iommu_dma_ops() for each device so that they
> do all end up with the right DMA ops rather than dying in screaming
> failure (the ARM code's per-device mappings then get leaked, but we
> can't really do any better).
> 
> The whole mess disappears in the proper default domain conversion, but
> in the meantime, it's still safe to assume that nobody's doing VFIO with
> embedded display/video codec/etc. blocks that don't even have reset drivers.

Thanks for the input! I'll just respin it by dropping mtk_v1 diff.

Nic