mbox series

[RFC,00/34] iommu: Move iommu_group setup to IOMMU core code

Message ID 20200407183742.4344-1-joro@8bytes.org
Headers show
Series iommu: Move iommu_group setup to IOMMU core code | expand

Message

Joerg Roedel April 7, 2020, 6:37 p.m. UTC
Hi,

here is a patch-set to remove all calls of iommu_group_get_for_dev() from
the IOMMU drivers and move the per-device group setup and default domain
allocation into the IOMMU core code.

This eliminates some ugly back and forth between IOMMU core code and the
IOMMU drivers, where the driver called iommu_group_get_for_dev() which itself
called back into the driver.

The patch-set started as a "quick" Friday afternoon project to split the
IOMMU group creation and the allocation of the default domain, so that the
default domain is not allocated before all devices are added to the group.
In the end it took 1.5 weeks to get this in a reasonable shape, but now the
code (during bus probing) first adds all devices to their respective IOMMU
group before it determines the default domain type and then allocates it for
the group.

It turned out that this required to remove the calls of
iommu_group_get_for_dev() from the IOMMU drivers. While at it, the calls to
iommu_device_link()/unlink() where also moved out of the drivers, which
required a different interface than add_device()/remove_device(). The result
is the new probe_device()/release_device() interface, where the driver just
does its own setup and then returns the iommu_device which belongs to the
device being probed.

There is certainly more room for cleanups, but I think this is a good start
to simplify the code flow during IOMMU device probing.  It is also a more
robust base for the pending patch-sets which implement per-group default
domain types and the removal of the private domains from the Intel VT-d
driver.

With regards to testing, I verified this code works on three IOMMUs:

	- AMD-Vi
	- Intel VT-d (but there might be breakages on some hardware, the
	  patches to remove the private domain handling from the VT-d driver
	  should be rebased to these patches)
	- ARM-SMMU-v3 (as emulated by QEMU)

Most driver conversions to the probe_device()/release_device() interface
were trivial, but there were also some hard nuts, which I am not sure still
work. The more difficult drivers were:

	- ARM-SMMU-v2
	- OMAP
	- Renesas
	- Mediatek IOMMU v1
	- Exynos

It would be great if the changes could be tested (and possibly fixed) on
those IOMMUs, as I can't do testing on them.

The patches are based on the current iommu/next branch, I will rebase them
to v5.7-rc1 when it comes out. A branch with these patches applied can be
found here:

	https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device

Please review and test these changes and let me know what breaks.

Thanks,

	Joerg

Joerg Roedel (33):
  iommu: Move default domain allocation to separate function
  iommu/amd: Implement iommu_ops->def_domain_type call-back
  iommu/vt-d: Wire up iommu_ops->def_domain_type
  iommu/amd: Remove dma_mask check from check_device()
  iommu/amd: Return -ENODEV in add_device when device is not handled by
    IOMMU
  iommu: Add probe_device() and remove_device() call-backs
  iommu: Move default domain allocation to iommu_probe_device()
  iommu: Keep a list of allocated groups in __iommu_probe_device()
  iommu: Move new probe_device path to separate function
  iommu: Split off default domain allocation from group assignment
  iommu: Move iommu_group_create_direct_mappings() out of
    iommu_group_add_device()
  iommu: Export bus_iommu_probe() and make is safe for re-probing
  iommu/amd: Remove dev_data->passthrough
  iommu/amd: Convert to probe/release_device() call-backs
  iommu/vt-d: Convert to probe/release_device() call-backs
  iommu/arm-smmu: Store device instead of group in arm_smmu_s2cr
  iommu/arm-smmu: Convert to probe/release_device() call-backs
  iommu/pamu: Convert to probe/release_device() call-backs
  iommu/s390: Convert to probe/release_device() call-backs
  iommu/virtio: Convert to probe/release_device() call-backs
  iommu/msm: Convert to probe/release_device() call-backs
  iommu/mediatek: Convert to probe/release_device() call-backs
  iommu/mediatek-v1 Convert to probe/release_device() call-backs
  iommu/qcom: Convert to probe/release_device() call-backs
  iommu/rockchip: Convert to probe/release_device() call-backs
  iommu/tegra: Convert to probe/release_device() call-backs
  iommu/renesas: Convert to probe/release_device() call-backs
  iommu/omap: Remove orphan_dev tracking
  iommu/omap: Convert to probe/release_device() call-backs
  iommu/exynos: Create iommu_device in struct exynos_iommu_owner
  iommu/exynos: Convert to probe/release_device() call-backs
  iommu: Remove add_device()/remove_device() code-paths
  iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
  iommu: Add def_domain_type() callback in iommu_ops

 drivers/iommu/amd_iommu.c       |  97 ++++----
 drivers/iommu/amd_iommu_types.h |   1 -
 drivers/iommu/arm-smmu-v3.c     |  42 +---
 drivers/iommu/arm-smmu.c        |  44 ++--
 drivers/iommu/exynos-iommu.c    | 113 ++++++---
 drivers/iommu/fsl_pamu_domain.c |  22 +-
 drivers/iommu/intel-iommu.c     |  68 +-----
 drivers/iommu/iommu.c           | 391 +++++++++++++++++++++++++-------
 drivers/iommu/ipmmu-vmsa.c      |  60 ++---
 drivers/iommu/msm_iommu.c       |  34 +--
 drivers/iommu/mtk_iommu.c       |  24 +-
 drivers/iommu/mtk_iommu_v1.c    |  50 ++--
 drivers/iommu/omap-iommu.c      |  99 ++------
 drivers/iommu/qcom_iommu.c      |  24 +-
 drivers/iommu/rockchip-iommu.c  |  26 +--
 drivers/iommu/s390-iommu.c      |  22 +-
 drivers/iommu/tegra-gart.c      |  24 +-
 drivers/iommu/tegra-smmu.c      |  31 +--
 drivers/iommu/virtio-iommu.c    |  41 +---
 include/linux/iommu.h           |  21 +-
 20 files changed, 600 insertions(+), 634 deletions(-)

Comments

Marek Szyprowski April 8, 2020, 12:23 p.m. UTC | #1
Hi Joerg,

On 07.04.2020 20:37, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
>
> The 'struct exynos_iommu_owner' is an umbrella for multiple SYSMMU
> instances attached to one master. As such all these instances are
> handled the same, they are all configured with the same iommu_domain,
> for example.
>
> The IOMMU core code expects each device to have only one IOMMU
> attached, so create the IOMMU-device for the umbrella instead of each
> hardware SYSMMU.
>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>   drivers/iommu/exynos-iommu.c | 96 +++++++++++++++++++++++++++---------
>   1 file changed, 73 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index 186ff5cc975c..86ecccbf0438 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -235,6 +235,8 @@ struct exynos_iommu_owner {
>   	struct list_head controllers;	/* list of sysmmu_drvdata.owner_node */
>   	struct iommu_domain *domain;	/* domain this device is attached */
>   	struct mutex rpm_lock;		/* for runtime pm of all sysmmus */
> +
> +	struct iommu_device iommu;	/* IOMMU core handle */
>   };
>   
>   /*
> @@ -274,8 +276,6 @@ struct sysmmu_drvdata {
>   	struct list_head owner_node;	/* node for owner controllers list */
>   	phys_addr_t pgtable;		/* assigned page table structure */
>   	unsigned int version;		/* our version */
> -
> -	struct iommu_device iommu;	/* IOMMU core handle */
>   };
>   
>   static struct exynos_iommu_domain *to_exynos_domain(struct iommu_domain *dom)
> @@ -625,18 +625,6 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
>   	data->sysmmu = dev;
>   	spin_lock_init(&data->lock);
>   
> -	ret = iommu_device_sysfs_add(&data->iommu, &pdev->dev, NULL,
> -				     dev_name(data->sysmmu));
> -	if (ret)
> -		return ret;
> -
> -	iommu_device_set_ops(&data->iommu, &exynos_iommu_ops);
> -	iommu_device_set_fwnode(&data->iommu, &dev->of_node->fwnode);

The iommu_device_set_fwnode() call is lost during this conversion, what breaks driver operation. Most of the above IOMMU fw calls you have moved to xlate function. I've checked briefly but it looks that there is a chicken-egg problem here. The owner structure is allocated and initialized from of_xlate(), which won't be called without linking the problem iommu structure with the fwnode first, what might be done only in sysmmu_probe(). I will check how to handle this in a different way.

> -
> -	ret = iommu_device_register(&data->iommu);
> -	if (ret)
> -		return ret;
> -
>   	platform_set_drvdata(pdev, data);
>   
>   	__sysmmu_get_version(data);
> @@ -1261,6 +1249,8 @@ static int exynos_iommu_add_device(struct device *dev)
>   	}
>   	iommu_group_put(group);
>   
> +	iommu_device_link(&owner->iommu, dev);
> +
>   	return 0;
>   }
>   
> @@ -1282,18 +1272,82 @@ static void exynos_iommu_remove_device(struct device *dev)
>   			iommu_group_put(group);
>   		}
>   	}
> +	iommu_device_unlink(&owner->iommu, dev);
>   	iommu_group_remove_device(dev);
>   
>   	list_for_each_entry(data, &owner->controllers, owner_node)
>   		device_link_del(data->link);
>   }
>   
> +static int exynos_iommu_device_init(struct exynos_iommu_owner *owner)
> +{
> +	static u32 counter = 0;
> +	int ret;
> +
> +	/*
> +	 * Create a virtual IOMMU device. In reality it is an umbrella for a
> +	 * number of SYSMMU platform devices, but that also means that any
> +	 * master can have more than one real IOMMU device. This drivers handles
> +	 * all the real devices for one master synchronously, so they appear as
> +	 * one anyway.
> +	 */
> +	ret = iommu_device_sysfs_add(&owner->iommu, NULL, NULL,
> +				     "sysmmu-owner-%d", counter++);
> +	if (ret)
> +		return ret;
> +
> +	iommu_device_set_ops(&owner->iommu, &exynos_iommu_ops);
> +
> +	return 0;
> +}
> +
> +static void exynos_iommu_device_remove(struct exynos_iommu_owner *owner)
> +{
> +	iommu_device_set_ops(&owner->iommu, NULL);
> +	iommu_device_sysfs_remove(&owner->iommu);
> +}
> +
> +static int exynos_owner_init(struct device *dev)
> +{
> +	struct exynos_iommu_owner *owner = dev->archdata.iommu;
> +	int ret;
> +
> +	if (owner)
> +		return 0;
> +
> +	owner = kzalloc(sizeof(*owner), GFP_KERNEL);
> +	if (!owner)
> +		return -ENOMEM;
> +
> +	ret = exynos_iommu_device_init(owner);
> +	if (ret)
> +		goto out_free_owner;
> +
> +	ret = iommu_device_register(&owner->iommu);
> +	if (ret)
> +		goto out_remove_iommu_device;
> +
> +	INIT_LIST_HEAD(&owner->controllers);
> +	mutex_init(&owner->rpm_lock);
> +	dev->archdata.iommu = owner;
> +
> +	return 0;
> +
> +out_remove_iommu_device:
> +	exynos_iommu_device_remove(owner);
> +out_free_owner:
> +	kfree(owner);
> +
> +	return ret;
> +}
> +
>   static int exynos_iommu_of_xlate(struct device *dev,
>   				 struct of_phandle_args *spec)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
>   	struct platform_device *sysmmu = of_find_device_by_node(spec->np);
>   	struct sysmmu_drvdata *data, *entry;
> +	struct exynos_iommu_owner *owner;
> +	int ret;
>   
>   	if (!sysmmu)
>   		return -ENODEV;
> @@ -1302,15 +1356,11 @@ static int exynos_iommu_of_xlate(struct device *dev,
>   	if (!data)
>   		return -ENODEV;
>   
> -	if (!owner) {
> -		owner = kzalloc(sizeof(*owner), GFP_KERNEL);
> -		if (!owner)
> -			return -ENOMEM;
> +	ret = exynos_owner_init(dev);
> +	if (ret)
> +		return ret;
>   
> -		INIT_LIST_HEAD(&owner->controllers);
> -		mutex_init(&owner->rpm_lock);
> -		dev->archdata.iommu = owner;
> -	}
> +	owner = dev->archdata.iommu;
>   
>   	list_for_each_entry(entry, &owner->controllers, owner_node)
>   		if (entry == data)

Best regards
Marek Szyprowski April 8, 2020, 2:23 p.m. UTC | #2
Hi again,

On 08.04.2020 14:23, Marek Szyprowski wrote:
> Hi Joerg,
>
> On 07.04.2020 20:37, Joerg Roedel wrote:
>> From: Joerg Roedel <jroedel@suse.de>
>>
>> The 'struct exynos_iommu_owner' is an umbrella for multiple SYSMMU
>> instances attached to one master. As such all these instances are
>> handled the same, they are all configured with the same iommu_domain,
>> for example.
>>
>> The IOMMU core code expects each device to have only one IOMMU
>> attached, so create the IOMMU-device for the umbrella instead of each
>> hardware SYSMMU.
>>
>> Signed-off-by: Joerg Roedel <jroedel@suse.de>
>> ---
>>   drivers/iommu/exynos-iommu.c | 96 +++++++++++++++++++++++++++---------
>>   1 file changed, 73 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
>> index 186ff5cc975c..86ecccbf0438 100644
>> --- a/drivers/iommu/exynos-iommu.c
>> +++ b/drivers/iommu/exynos-iommu.c
>> @@ -235,6 +235,8 @@ struct exynos_iommu_owner {
>>       struct list_head controllers;    /* list of 
>> sysmmu_drvdata.owner_node */
>>       struct iommu_domain *domain;    /* domain this device is 
>> attached */
>>       struct mutex rpm_lock;        /* for runtime pm of all sysmmus */
>> +
>> +    struct iommu_device iommu;    /* IOMMU core handle */
>>   };
>>     /*
>> @@ -274,8 +276,6 @@ struct sysmmu_drvdata {
>>       struct list_head owner_node;    /* node for owner controllers 
>> list */
>>       phys_addr_t pgtable;        /* assigned page table structure */
>>       unsigned int version;        /* our version */
>> -
>> -    struct iommu_device iommu;    /* IOMMU core handle */
>>   };
>>     static struct exynos_iommu_domain *to_exynos_domain(struct 
>> iommu_domain *dom)
>> @@ -625,18 +625,6 @@ static int exynos_sysmmu_probe(struct 
>> platform_device *pdev)
>>       data->sysmmu = dev;
>>       spin_lock_init(&data->lock);
>>   -    ret = iommu_device_sysfs_add(&data->iommu, &pdev->dev, NULL,
>> -                     dev_name(data->sysmmu));
>> -    if (ret)
>> -        return ret;
>> -
>> -    iommu_device_set_ops(&data->iommu, &exynos_iommu_ops);
>> -    iommu_device_set_fwnode(&data->iommu, &dev->of_node->fwnode);
>
> The iommu_device_set_fwnode() call is lost during this conversion, 
> what breaks driver operation. Most of the above IOMMU fw calls you 
> have moved to xlate function. I've checked briefly but it looks that 
> there is a chicken-egg problem here. The owner structure is allocated 
> and initialized from of_xlate(), which won't be called without linking 
> the problem iommu structure with the fwnode first, what might be done 
> only in sysmmu_probe(). I will check how to handle this in a different 
> way.

I've played with this a bit and it looks that won't be easy to make it 
working on ARM 32bit.

I need a place to initialize properly all the structures for the given 
master (IOMMU client device, the one which performs DMA operations).

I tried to move all the initialization from xlate() to device_probe(), 
but such approach doesn't work.

On ARM32bit exynos_iommu_device_probe() is called by the device core and 
IOMMU framework very early, before the Exynos SYSMMU platform devices 
(controllers) are probed yet. Even iommu class is not yet initialized 
that time, so returning anything successful from it causes following 
NULL ptr dereference:

Unable to handle kernel NULL pointer dereference at virtual address 0000004c
...

(__iommu_probe_device) from [<c055b334>] (iommu_probe_device+0x18/0x114)
(iommu_probe_device) from [<c055b4ac>] (iommu_bus_notifier+0x7c/0x94)
(iommu_bus_notifier) from [<c014e8ec>] (notifier_call_chain+0x44/0x84)
(notifier_call_chain) from [<c014e9ec>] 
(__blocking_notifier_call_chain+0x48/0x60)
(__blocking_notifier_call_chain) from [<c014ea1c>] 
(blocking_notifier_call_chain+0x18/0x20)
(blocking_notifier_call_chain) from [<c05c8d1c>] (device_add+0x3e8/0x704)
(device_add) from [<c07bafc4>] (of_platform_device_create_pdata+0x90/0xc4)
(of_platform_device_create_pdata) from [<c07bb138>] 
(of_platform_bus_create+0x134/0x48c)
(of_platform_bus_create) from [<c07bb1a4>] 
(of_platform_bus_create+0x1a0/0x48c)
(of_platform_bus_create) from [<c07bb63c>] (of_platform_populate+0x84/0x114)
(of_platform_populate) from [<c1125e9c>] 
(of_platform_default_populate_init+0x90/0xac)
(of_platform_default_populate_init) from [<c010326c>] 
(do_one_initcall+0x80/0x42c)
(do_one_initcall) from [<c1101074>] (kernel_init_freeable+0x15c/0x208)
(kernel_init_freeable) from [<c0afdde0>] (kernel_init+0x8/0x118)
(kernel_init) from [<c01010b4>] (ret_from_fork+0x14/0x20)

I've tried returning ERR_PTR(-EPROBE_DEFER) from device_probe(), but I 
doesn't work there. Some more changes in the framework are needed...

 > ...

Best regards
Marek Szyprowski April 9, 2020, 1:58 p.m. UTC | #3
Hi Joerg,

On 09.04.2020 13:46, Joerg Roedel wrote:
> Hi Marek,
>
> I had some more thoughts and discussions with Robin about how to make
> this work with the Exynos driver. The result is the patch below, can you
> please test it and report back? Even better if you can fix up any
> breakage it might cause :)

I've checked and it works fine on top of 
ff68eb23308e6538ec7864c83d39540f423bbe90. However I'm not a fan of 
removing this 'owner' structure. It gave a nice abstraction for the all 
SYSMMU controllers for the given device (although most devices in the 
system have only one SYSMMU). Why this structure is a problem for your 
rework?

I've also spent some time trying to fix exynos-iommu on top of your 
iommu-probe-device branch. I really wonder if it works on any ARM 32bit 
or 64bit systems for other IOMMUs.

I got something working on ARM32bit, but I have to move all the 
initialization from exynos_iommu_probe_device/exynos_iommu_of_xlate to 
exynos_sysmmu_probe(). I don't like such approach, because I had to use 
bus_find_device() and manually find the owner for the every SYSMMU 
controller in the system. This approach also lack a proper symmetry and 
release path.

The main problem after your conversion is the fact that ->probe_device() 
is called very early, before any other platform device (thus IOMMU 
controller) is is probed. It doesn't handle EPROBE_DEFER too.

The other issue I've noticed is that iommu_device_set_ops() is not 
enough to assign ops properly. I had to add iommu_fwspec_init(dev, 
&dev->of_node->fwnode, &exynos_iommu_ops) to ensure that the 'dev' gets 
proper iommu ops.

I will send my patch in a few minutes to show you the changes.

> >From 60a288509baa34df6a0bf437c977925a0a617c72 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <jroedel@suse.de>
> Date: Thu, 9 Apr 2020 13:38:18 +0200
> Subject: [PATCH] iommu/exynos: Get rid of 'struct exynos_iommu_owner'
>
> Remove 'struct exynos_iommu_owner' and replace it with a single-linked
> list of 'struct sysmmu_drvdata'. The first item in the list acts as a
> replacement for the previous exynos_iommu_owner structure. The
> iommu_device member of the first list item is reported to the IOMMU
> core code for the master device.
>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>   drivers/iommu/exynos-iommu.c | 155 ++++++++++++++++++++---------------
>   1 file changed, 88 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index 186ff5cc975c..e70eb360093f 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -225,18 +225,6 @@ static const struct sysmmu_fault_info sysmmu_v5_faults[] = {
>   	{ 20, REG_V5_FAULT_AW_VA, "AW SECURITY PROTECTION", IOMMU_FAULT_WRITE },
>   };
>   
> -/*
> - * This structure is attached to dev.archdata.iommu of the master device
> - * on device add, contains a list of SYSMMU controllers defined by device tree,
> - * which are bound to given master device. It is usually referenced by 'owner'
> - * pointer.
> -*/
> -struct exynos_iommu_owner {
> -	struct list_head controllers;	/* list of sysmmu_drvdata.owner_node */
> -	struct iommu_domain *domain;	/* domain this device is attached */
> -	struct mutex rpm_lock;		/* for runtime pm of all sysmmus */
> -};
> -
>   /*
>    * This structure exynos specific generalization of struct iommu_domain.
>    * It contains list of SYSMMU controllers from all master devices, which has
> @@ -271,13 +259,23 @@ struct sysmmu_drvdata {
>   	bool active;			/* current status */
>   	struct exynos_iommu_domain *domain; /* domain we belong to */
>   	struct list_head domain_node;	/* node for domain clients list */
> -	struct list_head owner_node;	/* node for owner controllers list */
> +	struct sysmmu_drvdata *next;	/* Single-linked list to group SMMUs for
> +					   one master. NULL means not in any
> +					   list, ERR_PTR(-ENODEV) means end of
> +					   list */
> +	struct mutex rpm_lock;		/* for runtime pm of all sysmmus */
>   	phys_addr_t pgtable;		/* assigned page table structure */
>   	unsigned int version;		/* our version */
>   
>   	struct iommu_device iommu;	/* IOMMU core handle */
>   };
>   
> +/* Helper to iterate over all SYSMMUs for a given platform device */
> +#define for_each_sysmmu(dev, drvdata)			\
> +	for (drvdata = (dev)->archdata.iommu;		\
> +	     drvdata != ERR_PTR(-ENODEV);		\
> +	     drvdata = drvdata->next)
> +
>   static struct exynos_iommu_domain *to_exynos_domain(struct iommu_domain *dom)
>   {
>   	return container_of(dom, struct exynos_iommu_domain, domain);
> @@ -624,6 +622,8 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
>   
>   	data->sysmmu = dev;
>   	spin_lock_init(&data->lock);
> +	data->next = NULL;
> +	mutex_init(&data->rpm_lock);
>   
>   	ret = iommu_device_sysfs_add(&data->iommu, &pdev->dev, NULL,
>   				     dev_name(data->sysmmu));
> @@ -668,17 +668,20 @@ static int __maybe_unused exynos_sysmmu_suspend(struct device *dev)
>   {
>   	struct sysmmu_drvdata *data = dev_get_drvdata(dev);
>   	struct device *master = data->master;
> +	struct sysmmu_drvdata *master_data;
>   
> -	if (master) {
> -		struct exynos_iommu_owner *owner = master->archdata.iommu;
> +	if (!master)
> +		return 0;
>   
> -		mutex_lock(&owner->rpm_lock);
> -		if (data->domain) {
> -			dev_dbg(data->sysmmu, "saving state\n");
> -			__sysmmu_disable(data);
> -		}
> -		mutex_unlock(&owner->rpm_lock);
> +	master_data = master->archdata.iommu;
> +
> +	mutex_lock(&master_data->rpm_lock);
> +	if (data->domain) {
> +		dev_dbg(data->sysmmu, "saving state\n");
> +		__sysmmu_disable(data);
>   	}
> +	mutex_unlock(&master_data->rpm_lock);
> +
>   	return 0;
>   }
>   
> @@ -686,17 +689,20 @@ static int __maybe_unused exynos_sysmmu_resume(struct device *dev)
>   {
>   	struct sysmmu_drvdata *data = dev_get_drvdata(dev);
>   	struct device *master = data->master;
> +	struct sysmmu_drvdata *master_data;
>   
> -	if (master) {
> -		struct exynos_iommu_owner *owner = master->archdata.iommu;
> +	if (!master)
> +		return 0;
>   
> -		mutex_lock(&owner->rpm_lock);
> -		if (data->domain) {
> -			dev_dbg(data->sysmmu, "restoring state\n");
> -			__sysmmu_enable(data);
> -		}
> -		mutex_unlock(&owner->rpm_lock);
> +	master_data = master->archdata.iommu;
> +
> +	mutex_lock(&master_data->rpm_lock);
> +	if (data->domain) {
> +		dev_dbg(data->sysmmu, "restoring state\n");
> +		__sysmmu_enable(data);
>   	}
> +	mutex_unlock(&master_data->rpm_lock);
> +
>   	return 0;
>   }
>   
> @@ -834,21 +840,21 @@ static void exynos_iommu_domain_free(struct iommu_domain *iommu_domain)
>   	kfree(domain);
>   }
>   
> -static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain,
> -				    struct device *dev)
> +static void __exynos_iommu_detach_device(struct exynos_iommu_domain *domain,
> +					 struct device *dev)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
> -	struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
>   	phys_addr_t pagetable = virt_to_phys(domain->pgtable);
> -	struct sysmmu_drvdata *data, *next;
> +	struct sysmmu_drvdata *dev_data, *data, *next;
>   	unsigned long flags;
>   
> -	if (!has_sysmmu(dev) || owner->domain != iommu_domain)
> +	dev_data = dev->archdata.iommu;
> +
> +	if (!has_sysmmu(dev) || dev_data->domain != domain)
>   		return;
>   
> -	mutex_lock(&owner->rpm_lock);
> +	mutex_lock(&dev_data->rpm_lock);
>   
> -	list_for_each_entry(data, &owner->controllers, owner_node) {
> +	for_each_sysmmu(dev, data) {
>   		pm_runtime_get_noresume(data->sysmmu);
>   		if (pm_runtime_active(data->sysmmu))
>   			__sysmmu_disable(data);
> @@ -863,51 +869,59 @@ static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain,
>   		list_del_init(&data->domain_node);
>   		spin_unlock(&data->lock);
>   	}
> -	owner->domain = NULL;
>   	spin_unlock_irqrestore(&domain->lock, flags);
>   
> -	mutex_unlock(&owner->rpm_lock);
> +	mutex_unlock(&dev_data->rpm_lock);
>   
>   	dev_dbg(dev, "%s: Detached IOMMU with pgtable %pa\n", __func__,
>   		&pagetable);
>   }
>   
> +static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain,
> +				       struct device *dev)
> +{
> +	struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
> +
> +	__exynos_iommu_detach_device(domain, dev);
> +}
> +
>   static int exynos_iommu_attach_device(struct iommu_domain *iommu_domain,
>   				   struct device *dev)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
>   	struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
> -	struct sysmmu_drvdata *data;
> +	struct sysmmu_drvdata *dev_data, *data;
>   	phys_addr_t pagetable = virt_to_phys(domain->pgtable);
>   	unsigned long flags;
>   
>   	if (!has_sysmmu(dev))
>   		return -ENODEV;
>   
> -	if (owner->domain)
> -		exynos_iommu_detach_device(owner->domain, dev);
> +	dev_data = dev->archdata.iommu;
>   
> -	mutex_lock(&owner->rpm_lock);
> +	if (dev_data->domain)
> +		__exynos_iommu_detach_device(dev_data->domain, dev);
> +
> +	mutex_lock(&dev_data->rpm_lock);
>   
>   	spin_lock_irqsave(&domain->lock, flags);
> -	list_for_each_entry(data, &owner->controllers, owner_node) {
> +	for_each_sysmmu(dev, data) {
>   		spin_lock(&data->lock);
>   		data->pgtable = pagetable;
>   		data->domain = domain;
>   		list_add_tail(&data->domain_node, &domain->clients);
>   		spin_unlock(&data->lock);
>   	}
> -	owner->domain = iommu_domain;
>   	spin_unlock_irqrestore(&domain->lock, flags);
>   
> -	list_for_each_entry(data, &owner->controllers, owner_node) {
> +
> +	for_each_sysmmu(dev, data) {
>   		pm_runtime_get_noresume(data->sysmmu);
>   		if (pm_runtime_active(data->sysmmu))
>   			__sysmmu_enable(data);
>   		pm_runtime_put(data->sysmmu);
>   	}
>   
> -	mutex_unlock(&owner->rpm_lock);
> +	mutex_unlock(&dev_data->rpm_lock);
>   
>   	dev_dbg(dev, "%s: Attached IOMMU with pgtable %pa\n", __func__,
>   		&pagetable);
> @@ -1237,7 +1251,6 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct iommu_domain *iommu_domain,
>   
>   static int exynos_iommu_add_device(struct device *dev)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
>   	struct sysmmu_drvdata *data;
>   	struct iommu_group *group;
>   
> @@ -1249,7 +1262,7 @@ static int exynos_iommu_add_device(struct device *dev)
>   	if (IS_ERR(group))
>   		return PTR_ERR(group);
>   
> -	list_for_each_entry(data, &owner->controllers, owner_node) {
> +	for_each_sysmmu(dev, data) {
>   		/*
>   		 * SYSMMU will be runtime activated via device link
>   		 * (dependency) to its master device, so there are no
> @@ -1261,37 +1274,39 @@ static int exynos_iommu_add_device(struct device *dev)
>   	}
>   	iommu_group_put(group);
>   
> +	data = dev->archdata.iommu;
> +	iommu_device_link(&data->iommu, dev);
> +
>   	return 0;
>   }
>   
>   static void exynos_iommu_remove_device(struct device *dev)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
> -	struct sysmmu_drvdata *data;
> +	struct sysmmu_drvdata *data = dev->archdata.iommu;
>   
>   	if (!has_sysmmu(dev))
>   		return;
>   
> -	if (owner->domain) {
> +	if (data->domain) {
>   		struct iommu_group *group = iommu_group_get(dev);
>   
>   		if (group) {
> -			WARN_ON(owner->domain !=
> +			WARN_ON(&data->domain->domain !=
>   				iommu_group_default_domain(group));
> -			exynos_iommu_detach_device(owner->domain, dev);
> +			__exynos_iommu_detach_device(data->domain, dev);
>   			iommu_group_put(group);
>   		}
>   	}
> +	iommu_device_unlink(&data->iommu, dev);
>   	iommu_group_remove_device(dev);
>   
> -	list_for_each_entry(data, &owner->controllers, owner_node)
> +	for_each_sysmmu(dev, data)
>   		device_link_del(data->link);
>   }
>   
>   static int exynos_iommu_of_xlate(struct device *dev,
>   				 struct of_phandle_args *spec)
>   {
> -	struct exynos_iommu_owner *owner = dev->archdata.iommu;
>   	struct platform_device *sysmmu = of_find_device_by_node(spec->np);
>   	struct sysmmu_drvdata *data, *entry;
>   
> @@ -1302,22 +1317,28 @@ static int exynos_iommu_of_xlate(struct device *dev,
>   	if (!data)
>   		return -ENODEV;
>   
> -	if (!owner) {
> -		owner = kzalloc(sizeof(*owner), GFP_KERNEL);
> -		if (!owner)
> -			return -ENOMEM;
> +	data->master = dev;
>   
> -		INIT_LIST_HEAD(&owner->controllers);
> -		mutex_init(&owner->rpm_lock);
> -		dev->archdata.iommu = owner;
> +	if (!dev->archdata.iommu) {
> +		WARN_ON(data->next != NULL);
> +
> +		/* SYSMMU list is empty - add drvdata and return */
> +		data->next = ERR_PTR(-ENODEV);
> +		dev->archdata.iommu = data;
> +
> +		return 0;
>   	}
>   
> -	list_for_each_entry(entry, &owner->controllers, owner_node)
> +	/* Check if SYSMMU is already in the list */
> +	for_each_sysmmu(dev, entry)
>   		if (entry == data)
>   			return 0;
>   
> -	list_add_tail(&data->owner_node, &owner->controllers);
> -	data->master = dev;
> +	/* Not in the list yet */
> +	WARN_ON(data->next != NULL);
> +	entry = dev->archdata.iommu;
> +	data->next  = entry->next;
> +	entry->next = data;
>   
>   	return 0;
>   }

Best regards
Joerg Roedel April 9, 2020, 2:30 p.m. UTC | #4
Hi Marek,

On Thu, Apr 09, 2020 at 03:58:00PM +0200, Marek Szyprowski wrote:
> The main problem after your conversion is the fact that ->probe_device() 
> is called very early, before any other platform device (thus IOMMU 
> controller) is is probed. It doesn't handle EPROBE_DEFER too.

I don't quite understand why probe_device() is called too early, as it
is called at the same time add_device() was called before. But anyway,
I have seen a similar problem on OMAP. If the SYSMMU for a master is not
probed yet when probe_device() is called, it can just return -ENODEV and
in your driver you just call but_iommu_probe() when a new SYSMMU got
initialized to re-probe uninitialized masters on the bus. This patch-set
contains a change to export bus_iommu_probe() for exactly that reason.

What do you think?

Regards,

	Joerg
Marek Szyprowski April 10, 2020, 10:39 a.m. UTC | #5
Hi Joerg

On 07.04.2020 20:37, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
>
> All drivers are converted to use the probe/release_device()
> call-backs, so the add_device/remove_device() pointers are unused and
> the code using them can be removed.
>
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>   drivers/iommu/iommu.c | 145 ++++++++----------------------------------
>   include/linux/iommu.h |   4 --
>   2 files changed, 27 insertions(+), 122 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index cf25c1e48830..d9032f9d597c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -220,7 +220,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
>   	return ret;
>   }
>   
> -static int __iommu_probe_device_helper(struct device *dev)
> +int iommu_probe_device(struct device *dev)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
>   	struct iommu_group *group;
> @@ -264,70 +264,17 @@ static int __iommu_probe_device_helper(struct device *dev)
>   
>   }
>   
> -int iommu_probe_device(struct device *dev)
> +void iommu_release_device(struct device *dev)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
> -	struct iommu_group *group;
> -	int ret;
> -
> -	WARN_ON(dev->iommu_group);
> -
> -	if (!ops)
> -		return -EINVAL;
> -
> -	if (!dev_iommu_get(dev))
> -		return -ENOMEM;
> -
> -	if (!try_module_get(ops->owner)) {
> -		ret = -EINVAL;
> -		goto err_free_dev_param;
> -	}
> -
> -	if (ops->probe_device)
> -		return __iommu_probe_device_helper(dev);
> -
> -	ret = ops->add_device(dev);
> -	if (ret)
> -		goto err_module_put;
>   
> -	group = iommu_group_get(dev);
> -	iommu_create_device_direct_mappings(group, dev);
> -	iommu_group_put(group);
> -
> -	if (ops->probe_finalize)
> -		ops->probe_finalize(dev);
> -
> -	return 0;
> -
> -err_module_put:
> -	module_put(ops->owner);
> -err_free_dev_param:
> -	dev_iommu_free(dev);
> -	return ret;
> -}
> -
> -static void __iommu_release_device(struct device *dev)
> -{
> -	const struct iommu_ops *ops = dev->bus->iommu_ops;
> +	if (!dev->iommu)
> +		return;
>   
>   	iommu_device_unlink(dev->iommu->iommu_dev, dev);
> -
>   	iommu_group_remove_device(dev);
>   
>   	ops->release_device(dev);
> -}
> -
> -void iommu_release_device(struct device *dev)
> -{
> -	const struct iommu_ops *ops = dev->bus->iommu_ops;
> -
> -	if (!dev->iommu)
> -		return;
> -
> -	if (ops->release_device)
> -		__iommu_release_device(dev);
> -	else if (dev->iommu_group)
> -		ops->remove_device(dev);
>   
>   	module_put(ops->owner);
>   	dev_iommu_free(dev);
> @@ -1560,23 +1507,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
>   	if (ret)
>   		goto out_put_group;
>   
> -	/*
> -	 * Try to allocate a default domain - needs support from the
> -	 * IOMMU driver. There are still some drivers which don't support
> -	 * default domains, so the return value is not yet checked. Only
> -	 * allocate the domain here when the driver still has the
> -	 * add_device/remove_device call-backs implemented.
> -	 */
> -	if (!ops->probe_device) {
> -		iommu_alloc_default_domain(dev);
> -
> -		if (group->default_domain)
> -			ret = __iommu_attach_device(group->default_domain, dev);
> -
> -		if (ret)
> -			goto out_put_group;
> -	}
> -
>   	return group;
>   
>   out_put_group:
> @@ -1591,21 +1521,6 @@ struct iommu_domain *iommu_group_default_domain(struct iommu_group *group)
>   	return group->default_domain;
>   }
>   
> -static int add_iommu_group(struct device *dev, void *data)
> -{
> -	int ret = iommu_probe_device(dev);
> -
> -	/*
> -	 * We ignore -ENODEV errors for now, as they just mean that the
> -	 * device is not translated by an IOMMU. We still care about
> -	 * other errors and fail to initialize when they happen.
> -	 */
> -	if (ret == -ENODEV)
> -		ret = 0;
> -
> -	return ret;
> -}
> -
>   static int probe_iommu_group(struct device *dev, void *data)
>   {
>   	const struct iommu_ops *ops = dev->bus->iommu_ops;
> @@ -1789,45 +1704,39 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group)
>   
>   int bus_iommu_probe(struct bus_type *bus)
>   {
> -	const struct iommu_ops *ops = bus->iommu_ops;
> +	struct iommu_group *group, *next;
> +	LIST_HEAD(group_list);
>   	int ret;
>   
> -	if (ops->probe_device) {
> -		struct iommu_group *group, *next;
> -		LIST_HEAD(group_list);
> -
> -		/*
> -		 * This code-path does not allocate the default domain when
> -		 * creating the iommu group, so do it after the groups are
> -		 * created.
> -		 */
> -		ret = bus_for_each_dev(bus, NULL, &group_list, probe_iommu_group);
> -		if (ret)
> -			return ret;
> +	/*
> +	 * This code-path does not allocate the default domain when
> +	 * creating the iommu group, so do it after the groups are
> +	 * created.
> +	 */
> +	ret = bus_for_each_dev(bus, NULL, &group_list, probe_iommu_group);
> +	if (ret)
> +		return ret;
>   
> -		list_for_each_entry_safe(group, next, &group_list, entry) {
> -			/* Remove item from the list */
> -			list_del_init(&group->entry);
> +	list_for_each_entry_safe(group, next, &group_list, entry) {
> +		/* Remove item from the list */
> +		list_del_init(&group->entry);
>   
> -			mutex_lock(&group->mutex);
> +		mutex_lock(&group->mutex);
>   
> -			/* Try to allocate default domain */
> -			probe_alloc_default_domain(bus, group);
> +		/* Try to allocate default domain */
> +		probe_alloc_default_domain(bus, group);
>   
> -			if (!group->default_domain)
> -				continue;
> +		if (!group->default_domain)
> +			continue;

It doesn't look straight from the above diff, but this continue leaks 
group->lock taken.

>   
> -			iommu_group_create_direct_mappings(group);
> +		iommu_group_create_direct_mappings(group);
>   
> -			ret = __iommu_group_dma_attach(group);
> +		ret = __iommu_group_dma_attach(group);
>   
> -			mutex_unlock(&group->mutex);
> +		mutex_unlock(&group->mutex);
>   
> -			if (ret)
> -				break;
> -		}
> -	} else {
> -		ret = bus_for_each_dev(bus, NULL, NULL, add_iommu_group);
> +		if (ret)
> +			break;
>   	}
>   
>   	return ret;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fea1622408ad..dd076366383f 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -223,8 +223,6 @@ struct iommu_iotlb_gather {
>    * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>    *            queue
>    * @iova_to_phys: translate iova to physical address
> - * @add_device: add device to iommu grouping
> - * @remove_device: remove device from iommu grouping
>    * @probe_device: Add device to iommu driver handling
>    * @release_device: Remove device from iommu driver handling
>    * @probe_finalize: Do final setup work after the device is added to an IOMMU
> @@ -277,8 +275,6 @@ struct iommu_ops {
>   	void (*iotlb_sync)(struct iommu_domain *domain,
>   			   struct iommu_iotlb_gather *iotlb_gather);
>   	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
> -	int (*add_device)(struct device *dev);
> -	void (*remove_device)(struct device *dev);
>   	struct iommu_device *(*probe_device)(struct device *dev);
>   	void (*release_device)(struct device *dev);
>   	void (*probe_finalize)(struct device *dev);

Best regards
Joerg Roedel April 14, 2020, 1:20 p.m. UTC | #6
On Thu, Apr 09, 2020 at 03:58:00PM +0200, Marek Szyprowski wrote:
> I've checked and it works fine on top of 
> ff68eb23308e6538ec7864c83d39540f423bbe90. However I'm not a fan of 
> removing this 'owner' structure. It gave a nice abstraction for the all 
> SYSMMU controllers for the given device (although most devices in the 
> system have only one SYSMMU). Why this structure is a problem for your 
> rework?

Okay, the structure itself is not a problem, I just thought it is not
really necessary. But to keep things simple I've taken another approach
for v2 of this series: Just use the first SYSMMU of the controllers list
to link the device and the IOMMU. When the owner structure exists there
is always one entry in this list, so that should work fine.

Regards,

	Joerg