diff mbox series

thunderbolt: Make iommu_dma_protection more accurate

Message ID 2d01fa50c2650c730b0244929097737918e302e7.1647533152.git.robin.murphy@arm.com
State New
Headers show
Series thunderbolt: Make iommu_dma_protection more accurate | expand

Commit Message

Robin Murphy March 17, 2022, 4:17 p.m. UTC
Between me trying to get rid of iommu_present() and Mario wanting to
support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
that the iommu_dma_protection attribute is being far too optimistic.
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device(s)
we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
is tell us that memory was protected before the kernel was loaded, and
prevent the user from disabling the intel-iommu driver entirely. What
actually matters is whether we trust individual devices, based on the
"external facing" property that we expect firmware to describe for
Thunderbolt ports.

Avoid false positives by looking as close as possible to the same PCI
topology that the IOMMU layer will consider once a Thunderbolt endpoint
appears. Crucially, we can't assume that IOMMU translation being enabled
for any reason is sufficient on its own; full (expensive) DMA protection
will still only be imposed on untrusted devices.

CC: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

This supersedes my previous attempt just trying to replace
iommu_present() at [1], further to the original discussion at [2].

[1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
[2] https://lore.kernel.org/linux-iommu/202203160844.lKviWR1Q-lkp@intel.com/T/

 drivers/thunderbolt/domain.c | 12 +++---------
 drivers/thunderbolt/nhi.c    | 35 +++++++++++++++++++++++++++++++++++
 include/linux/thunderbolt.h  |  2 ++
 3 files changed, 40 insertions(+), 9 deletions(-)

Comments

Mario Limonciello March 17, 2022, 5:09 p.m. UTC | #1
[Public]



> -----Original Message-----
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Thursday, March 17, 2022 11:17
> To: andreas.noever@gmail.com; michael.jamet@intel.com;
> mika.westerberg@linux.intel.com; YehezkelShB@gmail.com
> Cc: linux-usb@vger.kernel.org; linux-kernel@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-pci@vger.kernel.org; Limonciello,
> Mario <Mario.Limonciello@amd.com>
> Subject: [PATCH] thunderbolt: Make iommu_dma_protection more accurate
> 
> Between me trying to get rid of iommu_present() and Mario wanting to
> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has
> shown
> that the iommu_dma_protection attribute is being far too optimistic.
> Even if an IOMMU might be present for some PCI segment in the system,
> that doesn't necessarily mean it provides translation for the device(s)
> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
> is tell us that memory was protected before the kernel was loaded, and
> prevent the user from disabling the intel-iommu driver entirely. What
> actually matters is whether we trust individual devices, based on the
> "external facing" property that we expect firmware to describe for
> Thunderbolt ports.
> 
> Avoid false positives by looking as close as possible to the same PCI
> topology that the IOMMU layer will consider once a Thunderbolt endpoint
> appears. Crucially, we can't assume that IOMMU translation being enabled
> for any reason is sufficient on its own; full (expensive) DMA protection
> will still only be imposed on untrusted devices.
> 
> CC: Mario Limonciello <mario.limonciello@amd.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> This supersedes my previous attempt just trying to replace
> iommu_present() at [1], further to the original discussion at [2].
> 
> [1]
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.
> kernel.org%2Flinux-
> iommu%2FBL1PR12MB515799C0BE396377DBBEF055E2119%40BL1PR12MB515
> 7.namprd12.prod.outlook.com%2FT%2F&amp;data=04%7C01%7Cmario.limo
> nciello%40amd.com%7C14f5afbba9624b4d0ef508da08319b2a%7C3dd8961fe4
> 884e608e11a82d994e183d%7C0%7C0%7C637831306409535091%7CUnknown
> %7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> WwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=9xJ9bNT3pR3YhqOOqiJtGv94ln2
> IJSvrXllbPZjTI6M%3D&amp;reserved=0
> [2]
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.
> kernel.org%2Flinux-iommu%2F202203160844.lKviWR1Q-
> lkp%40intel.com%2FT%2F&amp;data=04%7C01%7Cmario.limonciello%40amd
> .com%7C14f5afbba9624b4d0ef508da08319b2a%7C3dd8961fe4884e608e11a8
> 2d994e183d%7C0%7C0%7C637831306409535091%7CUnknown%7CTWFpbGZs
> b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn
> 0%3D%7C3000&amp;sdata=wSbpjpPQk8ulX8ifOTt%2BNMdO5svwQceQthyca
> txzScI%3D&amp;reserved=0
> 
>  drivers/thunderbolt/domain.c | 12 +++---------
>  drivers/thunderbolt/nhi.c    | 35
> +++++++++++++++++++++++++++++++++++
>  include/linux/thunderbolt.h  |  2 ++
>  3 files changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
> index 7018d959f775..d5c825e84ac8 100644
> --- a/drivers/thunderbolt/domain.c
> +++ b/drivers/thunderbolt/domain.c
> @@ -7,9 +7,7 @@
>   */
> 
>  #include <linux/device.h>
> -#include <linux/dmar.h>
>  #include <linux/idr.h>
> -#include <linux/iommu.h>
>  #include <linux/module.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/slab.h>
> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct
> device *dev,
>  					 struct device_attribute *attr,
>  					 char *buf)
>  {
> -	/*
> -	 * Kernel DMA protection is a feature where Thunderbolt security is
> -	 * handled natively using IOMMU. It is enabled when IOMMU is
> -	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
> -	 */
> -	return sprintf(buf, "%d\n",
> -		       iommu_present(&pci_bus_type) &&
> dmar_platform_optin());
> +	struct tb *tb = container_of(dev, struct tb, dev);
> +
> +	return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
>  }
>  static DEVICE_ATTR_RO(iommu_dma_protection);
> 
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index c73da0532be4..e12c2e266741 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -14,6 +14,7 @@
>  #include <linux/errno.h>
>  #include <linux/pci.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/module.h>
>  #include <linux/delay.h>
>  #include <linux/property.h>
> @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>  		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>  }
> 
> +static void nhi_check_iommu(struct tb_nhi *nhi)
> +{
> +	struct pci_dev *pdev;
> +	bool port_ok = false;
> +
> +	/*
> +	 * Check for sibling devices that look like they should be our
> +	 * tunnelled ports. We can reasonably assume that if an IOMMU is
> +	 * managing the bridge it will manage any future devices beyond it
> +	 * too. If firmware has described a port as external-facing as
> +	 * expected then we can trust the IOMMU layer to enforce isolation;
> +	 * otherwise even if translation is enabled for existing devices it
> +	 * may potentially be overridden for a future tunnelled endpoint.
> +	 */
> +	for_each_pci_bridge(pdev, nhi->pdev->bus) {
> +		if (!pci_is_pcie(pdev) ||
> +		    !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> +		      pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
> +			continue;
> +

Unfortunately I don't think this logic holds for the topology I see.

Here is the NHI on a system I have here:
$ lspci -vvv -s 64:00.5
64:00.5 USB controller: Advanced Micro Devices, Inc. [AMD] Device 162e (prog-if 40)
        Subsystem: Advanced Micro Devices, Inc. [AMD] Device 162e
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 42
        Region 0: Memory at b0500000 (64-bit, non-prefetchable) [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: thunderbolt
        Kernel modules: thunderbolt

The links it makes (from those _DSD) are:
$ ls /sys/bus/pci/drivers/thunderbolt/0000\:64\:00.5/ | grep consumer
consumer:pci:0000:00:03.1
consumer:pci:0000:64:00.3

$ lspci -s 64:00.3
64:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15d6
$ lspci -s 00:03.1
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14cd

Looking at the topology the PCIE root port for tunneling (00:03.1) isn't actually on the same bridge.
$ lspci -t
-[0000:00]-+-00.0
           +-00.2
           +-01.0
           +-01.2-[01]----00.0
           +-02.0
           +-02.4-[02]----00.0
           +-03.0
           +-03.1-[03-32]--
           +-04.0
           +-04.1-[33-62]--
           +-08.0
           +-08.1-[63]--+-00.0
           |            +-00.1
           |            +-00.2
           |            +-00.3
           |            +-00.4
           |            +-00.5
           |            +-00.6
           |            \-00.7
           +-08.3-[64]--+-00.0
           |            +-00.3
           |            +-00.4
           |            +-00.5
           |            \-00.6
           +-14.0
           +-14.3
           +-18.0
           +-18.1
           +-18.2
           +-18.3
           +-18.4
           +-18.5
           +-18.6
           \-18.7

How about in this function to have two cases:
* the one that looks at links
* and if no links then the logic you have in place?

> +		if (!device_iommu_mapped(&pdev->dev))
> +			return;
> +
> +		if (!pdev->untrusted) {
> +			dev_info(&nhi->pdev->dev,
> +				 "Assuming unreliable Kernel DMA
> protection\n");
> +			return;
> +		}
> +		port_ok = true;
> +	}
> +	nhi->iommu_dma_protection = port_ok;
> +}
> +
>  static int nhi_init_msi(struct tb_nhi *nhi)
>  {
>  	struct pci_dev *pdev = nhi->pdev;
> @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const
> struct pci_device_id *id)
>  		return -ENOMEM;
> 
>  	nhi_check_quirks(nhi);
> +	nhi_check_iommu(nhi);
> 
>  	res = nhi_init_msi(nhi);
>  	if (res) {
> diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
> index 124e13cb1469..7a8ad984e651 100644
> --- a/include/linux/thunderbolt.h
> +++ b/include/linux/thunderbolt.h
> @@ -465,6 +465,7 @@ static inline struct tb_xdomain
> *tb_service_parent(struct tb_service *svc)
>   * @msix_ida: Used to allocate MSI-X vectors for rings
>   * @going_away: The host controller device is about to disappear so when
>   *		this flag is set, avoid touching the hardware anymore.
> + * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
>   * @interrupt_work: Work scheduled to handle ring interrupt when no
>   *		    MSI-X is used.
>   * @hop_count: Number of rings (end point hops) supported by NHI.
> @@ -479,6 +480,7 @@ struct tb_nhi {
>  	struct tb_ring **rx_rings;
>  	struct ida msix_ida;
>  	bool going_away;
> +	bool iommu_dma_protection;
>  	struct work_struct interrupt_work;
>  	u32 hop_count;
>  	unsigned long quirks;
> --
> 2.28.0.dirty
Mario Limonciello March 17, 2022, 8:36 p.m. UTC | #2
[Public]



> -----Original Message-----
> From: Limonciello, Mario
> Sent: Thursday, March 17, 2022 12:09
> To: Robin Murphy <robin.murphy@arm.com>; andreas.noever@gmail.com;
> michael.jamet@intel.com; mika.westerberg@linux.intel.com;
> YehezkelShB@gmail.com
> Cc: linux-usb@vger.kernel.org; linux-kernel@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-pci@vger.kernel.org
> Subject: RE: [PATCH] thunderbolt: Make iommu_dma_protection more
> accurate
> 
> [Public]
> 
> 
> 
> > -----Original Message-----
> > From: Robin Murphy <robin.murphy@arm.com>
> > Sent: Thursday, March 17, 2022 11:17
> > To: andreas.noever@gmail.com; michael.jamet@intel.com;
> > mika.westerberg@linux.intel.com; YehezkelShB@gmail.com
> > Cc: linux-usb@vger.kernel.org; linux-kernel@vger.kernel.org;
> > iommu@lists.linux-foundation.org; linux-pci@vger.kernel.org; Limonciello,
> > Mario <Mario.Limonciello@amd.com>
> > Subject: [PATCH] thunderbolt: Make iommu_dma_protection more
> accurate
> >
> > Between me trying to get rid of iommu_present() and Mario wanting to
> > support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has
> > shown
> > that the iommu_dma_protection attribute is being far too optimistic.
> > Even if an IOMMU might be present for some PCI segment in the system,
> > that doesn't necessarily mean it provides translation for the device(s)
> > we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really
> does
> > is tell us that memory was protected before the kernel was loaded, and
> > prevent the user from disabling the intel-iommu driver entirely. What
> > actually matters is whether we trust individual devices, based on the
> > "external facing" property that we expect firmware to describe for
> > Thunderbolt ports.
> >
> > Avoid false positives by looking as close as possible to the same PCI
> > topology that the IOMMU layer will consider once a Thunderbolt endpoint
> > appears. Crucially, we can't assume that IOMMU translation being enabled
> > for any reason is sufficient on its own; full (expensive) DMA protection
> > will still only be imposed on untrusted devices.
> >
> > CC: Mario Limonciello <mario.limonciello@amd.com>
> > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >
> > This supersedes my previous attempt just trying to replace
> > iommu_present() at [1], further to the original discussion at [2].
> >
> > [1]
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.
> > kernel.org%2Flinux-
> >
> iommu%2FBL1PR12MB515799C0BE396377DBBEF055E2119%40BL1PR12MB515
> >
> 7.namprd12.prod.outlook.com%2FT%2F&amp;data=04%7C01%7Cmario.limo
> >
> nciello%40amd.com%7C14f5afbba9624b4d0ef508da08319b2a%7C3dd8961fe4
> >
> 884e608e11a82d994e183d%7C0%7C0%7C637831306409535091%7CUnknown
> > %7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> >
> WwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=9xJ9bNT3pR3YhqOOqiJtGv94ln2
> > IJSvrXllbPZjTI6M%3D&amp;reserved=0
> > [2]
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.
> > kernel.org%2Flinux-iommu%2F202203160844.lKviWR1Q-
> >
> lkp%40intel.com%2FT%2F&amp;data=04%7C01%7Cmario.limonciello%40amd
> >
> .com%7C14f5afbba9624b4d0ef508da08319b2a%7C3dd8961fe4884e608e11a8
> >
> 2d994e183d%7C0%7C0%7C637831306409535091%7CUnknown%7CTWFpbGZs
> >
> b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn
> >
> 0%3D%7C3000&amp;sdata=wSbpjpPQk8ulX8ifOTt%2BNMdO5svwQceQthyca
> > txzScI%3D&amp;reserved=0
> >
> >  drivers/thunderbolt/domain.c | 12 +++---------
> >  drivers/thunderbolt/nhi.c    | 35
> > +++++++++++++++++++++++++++++++++++
> >  include/linux/thunderbolt.h  |  2 ++
> >  3 files changed, 40 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
> > index 7018d959f775..d5c825e84ac8 100644
> > --- a/drivers/thunderbolt/domain.c
> > +++ b/drivers/thunderbolt/domain.c
> > @@ -7,9 +7,7 @@
> >   */
> >
> >  #include <linux/device.h>
> > -#include <linux/dmar.h>
> >  #include <linux/idr.h>
> > -#include <linux/iommu.h>
> >  #include <linux/module.h>
> >  #include <linux/pm_runtime.h>
> >  #include <linux/slab.h>
> > @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct
> > device *dev,
> >  					 struct device_attribute *attr,
> >  					 char *buf)
> >  {
> > -	/*
> > -	 * Kernel DMA protection is a feature where Thunderbolt security is
> > -	 * handled natively using IOMMU. It is enabled when IOMMU is
> > -	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
> > -	 */
> > -	return sprintf(buf, "%d\n",
> > -		       iommu_present(&pci_bus_type) &&
> > dmar_platform_optin());
> > +	struct tb *tb = container_of(dev, struct tb, dev);
> > +
> > +	return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
> >  }
> >  static DEVICE_ATTR_RO(iommu_dma_protection);
> >
> > diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> > index c73da0532be4..e12c2e266741 100644
> > --- a/drivers/thunderbolt/nhi.c
> > +++ b/drivers/thunderbolt/nhi.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/errno.h>
> >  #include <linux/pci.h>
> >  #include <linux/interrupt.h>
> > +#include <linux/iommu.h>
> >  #include <linux/module.h>
> >  #include <linux/delay.h>
> >  #include <linux/property.h>
> > @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
> >  		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
> >  }
> >
> > +static void nhi_check_iommu(struct tb_nhi *nhi)
> > +{
> > +	struct pci_dev *pdev;
> > +	bool port_ok = false;
> > +
> > +	/*
> > +	 * Check for sibling devices that look like they should be our
> > +	 * tunnelled ports. We can reasonably assume that if an IOMMU is
> > +	 * managing the bridge it will manage any future devices beyond it
> > +	 * too. If firmware has described a port as external-facing as
> > +	 * expected then we can trust the IOMMU layer to enforce isolation;
> > +	 * otherwise even if translation is enabled for existing devices it
> > +	 * may potentially be overridden for a future tunnelled endpoint.
> > +	 */
> > +	for_each_pci_bridge(pdev, nhi->pdev->bus) {
> > +		if (!pci_is_pcie(pdev) ||
> > +		    !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> > +		      pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
> > +			continue;
> > +
> 
> Unfortunately I don't think this logic holds for the topology I see.
> 
> Here is the NHI on a system I have here:
> $ lspci -vvv -s 64:00.5
> 64:00.5 USB controller: Advanced Micro Devices, Inc. [AMD] Device 162e
> (prog-if 40)
>         Subsystem: Advanced Micro Devices, Inc. [AMD] Device 162e
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 42
>         Region 0: Memory at b0500000 (64-bit, non-prefetchable) [size=512K]
>         Capabilities: <access denied>
>         Kernel driver in use: thunderbolt
>         Kernel modules: thunderbolt
> 
> The links it makes (from those _DSD) are:
> $ ls /sys/bus/pci/drivers/thunderbolt/0000\:64\:00.5/ | grep consumer
> consumer:pci:0000:00:03.1
> consumer:pci:0000:64:00.3
> 
> $ lspci -s 64:00.3
> 64:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15d6
> $ lspci -s 00:03.1
> 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 14cd
> 
> Looking at the topology the PCIE root port for tunneling (00:03.1) isn't actually
> on the same bridge.
> $ lspci -t
> -[0000:00]-+-00.0
>            +-00.2
>            +-01.0
>            +-01.2-[01]----00.0
>            +-02.0
>            +-02.4-[02]----00.0
>            +-03.0
>            +-03.1-[03-32]--
>            +-04.0
>            +-04.1-[33-62]--
>            +-08.0
>            +-08.1-[63]--+-00.0
>            |            +-00.1
>            |            +-00.2
>            |            +-00.3
>            |            +-00.4
>            |            +-00.5
>            |            +-00.6
>            |            \-00.7
>            +-08.3-[64]--+-00.0
>            |            +-00.3
>            |            +-00.4
>            |            +-00.5
>            |            \-00.6
>            +-14.0
>            +-14.3
>            +-18.0
>            +-18.1
>            +-18.2
>            +-18.3
>            +-18.4
>            +-18.5
>            +-18.6
>            \-18.7
> 
> How about in this function to have two cases:
> * the one that looks at links
> * and if no links then the logic you have in place?

Here is a proposal on top of what you did for this.  
The idea being check the ports right when the links are made if they exist 
(all the new USB4 stuff) and then check all siblings on TBT3 stuff.

diff --git a/drivers/thunderbolt/acpi.c b/drivers/thunderbolt/acpi.c
index 79b5abf9d042..89432456dbea 100644
--- a/drivers/thunderbolt/acpi.c
+++ b/drivers/thunderbolt/acpi.c
@@ -14,6 +14,7 @@
 static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
                                    void **return_value)
 {
+       enum nhi_iommu_status iommu_status = IOMMU_UNKNOWN;
        struct fwnode_reference_args args;
        struct fwnode_handle *fwnode;
        struct tb_nhi *nhi = data;
@@ -91,6 +92,8 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
                if (link) {
                        dev_dbg(&nhi->pdev->dev, "created link from %s\n",
                                dev_name(&pdev->dev));
+                       if (iommu_status != IOMMU_DISABLED)
+                               iommu_status = nhi_check_iommu_for_port(pdev);
                } else {
                        dev_warn(&nhi->pdev->dev, "device link creation from %s failed\n",
                                 dev_name(&pdev->dev));
@@ -101,6 +104,7 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,

 out_put:
        fwnode_handle_put(args.fwnode);
+       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
        return AE_OK;
 }

diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index e12c2e266741..b5eb0cab392f 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -1103,10 +1103,30 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
                nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
 }

+enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev)
+{
+       if (!pci_is_pcie(pdev) ||
+           !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
+            pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM)) {
+               return IOMMU_UNKNOWN;
+       }
+
+       if (!device_iommu_mapped(&pdev->dev)) {
+               return IOMMU_DISABLED;
+       }
+
+       if (!pdev->untrusted) {
+               dev_info(&pdev->dev,
+                       "Assuming unreliable Kernel DMA protection\n");
+               return IOMMU_DISABLED;
+       }
+       return IOMMU_ENABLED;
+}
+
 static void nhi_check_iommu(struct tb_nhi *nhi)
 {
-       struct pci_dev *pdev;
-       bool port_ok = false;
+       enum nhi_iommu_status iommu_status = nhi->iommu_dma_protection ?
+                                       IOMMU_ENABLED : IOMMU_UNKNOWN;

        /*
         * Check for sibling devices that look like they should be our
@@ -1117,23 +1137,13 @@ static void nhi_check_iommu(struct tb_nhi *nhi)
         * otherwise even if translation is enabled for existing devices it
         * may potentially be overridden for a future tunnelled endpoint.
         */
-       for_each_pci_bridge(pdev, nhi->pdev->bus) {
-               if (!pci_is_pcie(pdev) ||
-                   !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
-                     pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
-                       continue;
-
-               if (!device_iommu_mapped(&pdev->dev))
-                       return;
-
-               if (!pdev->untrusted) {
-                       dev_info(&nhi->pdev->dev,
-                                "Assuming unreliable Kernel DMA protection\n");
-                       return;
-               }
-               port_ok = true;
+       if (iommu_status == IOMMU_UNKNOWN) {
+               struct pci_dev *pdev;
+               for_each_pci_bridge(pdev, nhi->pdev->bus)
+                       if (iommu_status != IOMMU_DISABLED)
+                               iommu_status = nhi_check_iommu_for_port(pdev);
        }
-       nhi->iommu_dma_protection = port_ok;
+       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
 }

 static int nhi_init_msi(struct tb_nhi *nhi)

diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
index 69083aab2736..1622d49b1763 100644
--- a/drivers/thunderbolt/nhi.h
+++ b/drivers/thunderbolt/nhi.h
@@ -11,6 +11,13 @@

 #include <linux/thunderbolt.h>

+enum nhi_iommu_status {
+       IOMMU_UNKNOWN,
+       IOMMU_DISABLED,
+       IOMMU_ENABLED,
+};
+enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev);
+
 enum nhi_fw_mode {
        NHI_FW_SAFE_MODE,
        NHI_FW_AUTH_MODE,

> 
> > +		if (!device_iommu_mapped(&pdev->dev))
> > +			return;
> > +
> > +		if (!pdev->untrusted) {
> > +			dev_info(&nhi->pdev->dev,
> > +				 "Assuming unreliable Kernel DMA
> > protection\n");
> > +			return;
> > +		}
> > +		port_ok = true;
> > +	}
> > +	nhi->iommu_dma_protection = port_ok;
> > +}
> > +
> >  static int nhi_init_msi(struct tb_nhi *nhi)
> >  {
> >  	struct pci_dev *pdev = nhi->pdev;
> > @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const
> > struct pci_device_id *id)
> >  		return -ENOMEM;
> >
> >  	nhi_check_quirks(nhi);
> > +	nhi_check_iommu(nhi);
> >
> >  	res = nhi_init_msi(nhi);
> >  	if (res) {
> > diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
> > index 124e13cb1469..7a8ad984e651 100644
> > --- a/include/linux/thunderbolt.h
> > +++ b/include/linux/thunderbolt.h
> > @@ -465,6 +465,7 @@ static inline struct tb_xdomain
> > *tb_service_parent(struct tb_service *svc)
> >   * @msix_ida: Used to allocate MSI-X vectors for rings
> >   * @going_away: The host controller device is about to disappear so when
> >   *		this flag is set, avoid touching the hardware anymore.
> > + * @iommu_dma_protection: An IOMMU will isolate external-facing
> ports.
> >   * @interrupt_work: Work scheduled to handle ring interrupt when no
> >   *		    MSI-X is used.
> >   * @hop_count: Number of rings (end point hops) supported by NHI.
> > @@ -479,6 +480,7 @@ struct tb_nhi {
> >  	struct tb_ring **rx_rings;
> >  	struct ida msix_ida;
> >  	bool going_away;
> > +	bool iommu_dma_protection;
> >  	struct work_struct interrupt_work;
> >  	u32 hop_count;
> >  	unsigned long quirks;
> > --
> > 2.28.0.dirty
Mika Westerberg March 18, 2022, 6:44 a.m. UTC | #3
Hi Robin,

Thanks for working on this!

On Thu, Mar 17, 2022 at 04:17:07PM +0000, Robin Murphy wrote:
> Between me trying to get rid of iommu_present() and Mario wanting to
> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
> that the iommu_dma_protection attribute is being far too optimistic.
> Even if an IOMMU might be present for some PCI segment in the system,
> that doesn't necessarily mean it provides translation for the device(s)
> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
> is tell us that memory was protected before the kernel was loaded, and
> prevent the user from disabling the intel-iommu driver entirely. What
> actually matters is whether we trust individual devices, based on the
> "external facing" property that we expect firmware to describe for
> Thunderbolt ports.

We still want to know that DMAR_PLATFORM_OPT_IN is set by the firmware
because that tells us that none of the devices connected before OS got
control had the ability to perform DMA outside of the RMRR regions. If
they did then all this is pointless because they could have modified the
system memory as they wished.

> Avoid false positives by looking as close as possible to the same PCI
> topology that the IOMMU layer will consider once a Thunderbolt endpoint
> appears. Crucially, we can't assume that IOMMU translation being enabled
> for any reason is sufficient on its own; full (expensive) DMA protection
> will still only be imposed on untrusted devices.
> 
> CC: Mario Limonciello <mario.limonciello@amd.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> This supersedes my previous attempt just trying to replace
> iommu_present() at [1], further to the original discussion at [2].
> 
> [1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
> [2] https://lore.kernel.org/linux-iommu/202203160844.lKviWR1Q-lkp@intel.com/T/
> 
>  drivers/thunderbolt/domain.c | 12 +++---------
>  drivers/thunderbolt/nhi.c    | 35 +++++++++++++++++++++++++++++++++++
>  include/linux/thunderbolt.h  |  2 ++
>  3 files changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
> index 7018d959f775..d5c825e84ac8 100644
> --- a/drivers/thunderbolt/domain.c
> +++ b/drivers/thunderbolt/domain.c
> @@ -7,9 +7,7 @@
>   */
>  
>  #include <linux/device.h>
> -#include <linux/dmar.h>
>  #include <linux/idr.h>
> -#include <linux/iommu.h>
>  #include <linux/module.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/slab.h>
> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
>  					 struct device_attribute *attr,
>  					 char *buf)
>  {
> -	/*
> -	 * Kernel DMA protection is a feature where Thunderbolt security is
> -	 * handled natively using IOMMU. It is enabled when IOMMU is
> -	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
> -	 */
> -	return sprintf(buf, "%d\n",
> -		       iommu_present(&pci_bus_type) && dmar_platform_optin());
> +	struct tb *tb = container_of(dev, struct tb, dev);
> +
> +	return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
>  }
>  static DEVICE_ATTR_RO(iommu_dma_protection);
>  
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index c73da0532be4..e12c2e266741 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -14,6 +14,7 @@
>  #include <linux/errno.h>
>  #include <linux/pci.h>
>  #include <linux/interrupt.h>
> +#include <linux/iommu.h>
>  #include <linux/module.h>
>  #include <linux/delay.h>
>  #include <linux/property.h>
> @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>  		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>  }
>  
> +static void nhi_check_iommu(struct tb_nhi *nhi)
> +{
> +	struct pci_dev *pdev;
> +	bool port_ok = false;


So for here somewhere we should call dmar_platform_optin() first. I
think this is something we want to move inside the IOMMU API (alongside
with the below checks).

> +
> +	/*
> +	 * Check for sibling devices that look like they should be our
> +	 * tunnelled ports. We can reasonably assume that if an IOMMU is
> +	 * managing the bridge it will manage any future devices beyond it
> +	 * too. If firmware has described a port as external-facing as
> +	 * expected then we can trust the IOMMU layer to enforce isolation;
> +	 * otherwise even if translation is enabled for existing devices it
> +	 * may potentially be overridden for a future tunnelled endpoint.
> +	 */
> +	for_each_pci_bridge(pdev, nhi->pdev->bus) {
> +		if (!pci_is_pcie(pdev) ||
> +		    !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> +		      pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
> +			continue;
> +
> +		if (!device_iommu_mapped(&pdev->dev))
> +			return;
> +
> +		if (!pdev->untrusted) {
> +			dev_info(&nhi->pdev->dev,
> +				 "Assuming unreliable Kernel DMA protection\n");
> +			return;
> +		}
> +		port_ok = true;
> +	}
> +	nhi->iommu_dma_protection = port_ok;
> +}
> +
>  static int nhi_init_msi(struct tb_nhi *nhi)
>  {
>  	struct pci_dev *pdev = nhi->pdev;
> @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		return -ENOMEM;
>  
>  	nhi_check_quirks(nhi);
> +	nhi_check_iommu(nhi);
>  
>  	res = nhi_init_msi(nhi);
>  	if (res) {
> diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
> index 124e13cb1469..7a8ad984e651 100644
> --- a/include/linux/thunderbolt.h
> +++ b/include/linux/thunderbolt.h
> @@ -465,6 +465,7 @@ static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
>   * @msix_ida: Used to allocate MSI-X vectors for rings
>   * @going_away: The host controller device is about to disappear so when
>   *		this flag is set, avoid touching the hardware anymore.
> + * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
>   * @interrupt_work: Work scheduled to handle ring interrupt when no
>   *		    MSI-X is used.
>   * @hop_count: Number of rings (end point hops) supported by NHI.
> @@ -479,6 +480,7 @@ struct tb_nhi {
>  	struct tb_ring **rx_rings;
>  	struct ida msix_ida;
>  	bool going_away;
> +	bool iommu_dma_protection;
>  	struct work_struct interrupt_work;
>  	u32 hop_count;
>  	unsigned long quirks;
> -- 
> 2.28.0.dirty
Mika Westerberg March 18, 2022, 11:38 a.m. UTC | #4
Hi Mario,

On Thu, Mar 17, 2022 at 08:36:13PM +0000, Limonciello, Mario wrote:
> Here is a proposal on top of what you did for this.  
> The idea being check the ports right when the links are made if they exist 
> (all the new USB4 stuff) and then check all siblings on TBT3 stuff.
> 
> diff --git a/drivers/thunderbolt/acpi.c b/drivers/thunderbolt/acpi.c
> index 79b5abf9d042..89432456dbea 100644
> --- a/drivers/thunderbolt/acpi.c
> +++ b/drivers/thunderbolt/acpi.c
> @@ -14,6 +14,7 @@
>  static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
>                                     void **return_value)
>  {
> +       enum nhi_iommu_status iommu_status = IOMMU_UNKNOWN;
>         struct fwnode_reference_args args;
>         struct fwnode_handle *fwnode;
>         struct tb_nhi *nhi = data;
> @@ -91,6 +92,8 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
>                 if (link) {
>                         dev_dbg(&nhi->pdev->dev, "created link from %s\n",
>                                 dev_name(&pdev->dev));
> +                       if (iommu_status != IOMMU_DISABLED)
> +                               iommu_status = nhi_check_iommu_for_port(pdev);
>                 } else {
>                         dev_warn(&nhi->pdev->dev, "device link creation from %s failed\n",
>                                  dev_name(&pdev->dev));
> @@ -101,6 +104,7 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
> 
>  out_put:
>         fwnode_handle_put(args.fwnode);
> +       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
>         return AE_OK;
>  }
> 
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index e12c2e266741..b5eb0cab392f 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -1103,10 +1103,30 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>                 nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>  }
> 
> +enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev)
> +{
> +       if (!pci_is_pcie(pdev) ||
> +           !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> +            pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM)) {
> +               return IOMMU_UNKNOWN;
> +       }
> +
> +       if (!device_iommu_mapped(&pdev->dev)) {
> +               return IOMMU_DISABLED;
> +       }
> +
> +       if (!pdev->untrusted) {
> +               dev_info(&pdev->dev,
> +                       "Assuming unreliable Kernel DMA protection\n");
> +               return IOMMU_DISABLED;
> +       }
> +       return IOMMU_ENABLED;
> +}
> +
>  static void nhi_check_iommu(struct tb_nhi *nhi)
>  {
> -       struct pci_dev *pdev;
> -       bool port_ok = false;
> +       enum nhi_iommu_status iommu_status = nhi->iommu_dma_protection ?
> +                                       IOMMU_ENABLED : IOMMU_UNKNOWN;
> 
>         /*
>          * Check for sibling devices that look like they should be our
> @@ -1117,23 +1137,13 @@ static void nhi_check_iommu(struct tb_nhi *nhi)
>          * otherwise even if translation is enabled for existing devices it
>          * may potentially be overridden for a future tunnelled endpoint.
>          */
> -       for_each_pci_bridge(pdev, nhi->pdev->bus) {
> -               if (!pci_is_pcie(pdev) ||
> -                   !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> -                     pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
> -                       continue;
> -
> -               if (!device_iommu_mapped(&pdev->dev))
> -                       return;
> -
> -               if (!pdev->untrusted) {
> -                       dev_info(&nhi->pdev->dev,
> -                                "Assuming unreliable Kernel DMA protection\n");
> -                       return;
> -               }
> -               port_ok = true;
> +       if (iommu_status == IOMMU_UNKNOWN) {
> +               struct pci_dev *pdev;
> +               for_each_pci_bridge(pdev, nhi->pdev->bus)
> +                       if (iommu_status != IOMMU_DISABLED)
> +                               iommu_status = nhi_check_iommu_for_port(pdev);
>         }
> -       nhi->iommu_dma_protection = port_ok;
> +       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
>  }
> 
>  static int nhi_init_msi(struct tb_nhi *nhi)
> 
> diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
> index 69083aab2736..1622d49b1763 100644
> --- a/drivers/thunderbolt/nhi.h
> +++ b/drivers/thunderbolt/nhi.h
> @@ -11,6 +11,13 @@
> 
>  #include <linux/thunderbolt.h>
> 
> +enum nhi_iommu_status {
> +       IOMMU_UNKNOWN,
> +       IOMMU_DISABLED,
> +       IOMMU_ENABLED,
> +};
> +enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev);
> +

This adds quite a lot code and complexity, and honestly I would like to
keep it as simple as possible (and this is not enough because we need to
make sure the DMAR bit is there so that none of the possible connected
devices were able to overwrite our memory already).
Robin Murphy March 18, 2022, 11:54 a.m. UTC | #5
On 2022-03-18 06:44, Mika Westerberg wrote:
> Hi Robin,
> 
> Thanks for working on this!
> 
> On Thu, Mar 17, 2022 at 04:17:07PM +0000, Robin Murphy wrote:
>> Between me trying to get rid of iommu_present() and Mario wanting to
>> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
>> that the iommu_dma_protection attribute is being far too optimistic.
>> Even if an IOMMU might be present for some PCI segment in the system,
>> that doesn't necessarily mean it provides translation for the device(s)
>> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
>> is tell us that memory was protected before the kernel was loaded, and
>> prevent the user from disabling the intel-iommu driver entirely. What
>> actually matters is whether we trust individual devices, based on the
>> "external facing" property that we expect firmware to describe for
>> Thunderbolt ports.
> 
> We still want to know that DMAR_PLATFORM_OPT_IN is set by the firmware
> because that tells us that none of the devices connected before OS got
> control had the ability to perform DMA outside of the RMRR regions. If
> they did then all this is pointless because they could have modified the
> system memory as they wished.

Ah, right, it does still matter in terms of whether we can trust our 
*own* integrity. I was thinking that it's OK since by this point the 
IOMMU driver should have reset the hardware and allocated new 
configuration tables etc. such that any previous modifications wouldn't 
matter, but of course the theoretical worst case is that the malicious 
device manages to corrupt the kernel code in memory just before that 
point to subvert the IOMMU driver itself. Got it, thanks again!

So yes, I think we *do* want to abstract that through the IOMMU API (and 
I'll follow up with our ACPI folks internally to check whether anyone's 
thought about this for Arm-based systems yet). Gosh, this is becoming 
quite the adventure, but it's all for good in the end :)

Robin.

>> Avoid false positives by looking as close as possible to the same PCI
>> topology that the IOMMU layer will consider once a Thunderbolt endpoint
>> appears. Crucially, we can't assume that IOMMU translation being enabled
>> for any reason is sufficient on its own; full (expensive) DMA protection
>> will still only be imposed on untrusted devices.
>>
>> CC: Mario Limonciello <mario.limonciello@amd.com>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>
>> This supersedes my previous attempt just trying to replace
>> iommu_present() at [1], further to the original discussion at [2].
>>
>> [1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
>> [2] https://lore.kernel.org/linux-iommu/202203160844.lKviWR1Q-lkp@intel.com/T/
>>
>>   drivers/thunderbolt/domain.c | 12 +++---------
>>   drivers/thunderbolt/nhi.c    | 35 +++++++++++++++++++++++++++++++++++
>>   include/linux/thunderbolt.h  |  2 ++
>>   3 files changed, 40 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
>> index 7018d959f775..d5c825e84ac8 100644
>> --- a/drivers/thunderbolt/domain.c
>> +++ b/drivers/thunderbolt/domain.c
>> @@ -7,9 +7,7 @@
>>    */
>>   
>>   #include <linux/device.h>
>> -#include <linux/dmar.h>
>>   #include <linux/idr.h>
>> -#include <linux/iommu.h>
>>   #include <linux/module.h>
>>   #include <linux/pm_runtime.h>
>>   #include <linux/slab.h>
>> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
>>   					 struct device_attribute *attr,
>>   					 char *buf)
>>   {
>> -	/*
>> -	 * Kernel DMA protection is a feature where Thunderbolt security is
>> -	 * handled natively using IOMMU. It is enabled when IOMMU is
>> -	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
>> -	 */
>> -	return sprintf(buf, "%d\n",
>> -		       iommu_present(&pci_bus_type) && dmar_platform_optin());
>> +	struct tb *tb = container_of(dev, struct tb, dev);
>> +
>> +	return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
>>   }
>>   static DEVICE_ATTR_RO(iommu_dma_protection);
>>   
>> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
>> index c73da0532be4..e12c2e266741 100644
>> --- a/drivers/thunderbolt/nhi.c
>> +++ b/drivers/thunderbolt/nhi.c
>> @@ -14,6 +14,7 @@
>>   #include <linux/errno.h>
>>   #include <linux/pci.h>
>>   #include <linux/interrupt.h>
>> +#include <linux/iommu.h>
>>   #include <linux/module.h>
>>   #include <linux/delay.h>
>>   #include <linux/property.h>
>> @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>>   		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>>   }
>>   
>> +static void nhi_check_iommu(struct tb_nhi *nhi)
>> +{
>> +	struct pci_dev *pdev;
>> +	bool port_ok = false;
> 
> 
> So for here somewhere we should call dmar_platform_optin() first. I
> think this is something we want to move inside the IOMMU API (alongside
> with the below checks).
> 
>> +
>> +	/*
>> +	 * Check for sibling devices that look like they should be our
>> +	 * tunnelled ports. We can reasonably assume that if an IOMMU is
>> +	 * managing the bridge it will manage any future devices beyond it
>> +	 * too. If firmware has described a port as external-facing as
>> +	 * expected then we can trust the IOMMU layer to enforce isolation;
>> +	 * otherwise even if translation is enabled for existing devices it
>> +	 * may potentially be overridden for a future tunnelled endpoint.
>> +	 */
>> +	for_each_pci_bridge(pdev, nhi->pdev->bus) {
>> +		if (!pci_is_pcie(pdev) ||
>> +		    !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
>> +		      pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
>> +			continue;
>> +
>> +		if (!device_iommu_mapped(&pdev->dev))
>> +			return;
>> +
>> +		if (!pdev->untrusted) {
>> +			dev_info(&nhi->pdev->dev,
>> +				 "Assuming unreliable Kernel DMA protection\n");
>> +			return;
>> +		}
>> +		port_ok = true;
>> +	}
>> +	nhi->iommu_dma_protection = port_ok;
>> +}
>> +
>>   static int nhi_init_msi(struct tb_nhi *nhi)
>>   {
>>   	struct pci_dev *pdev = nhi->pdev;
>> @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>   		return -ENOMEM;
>>   
>>   	nhi_check_quirks(nhi);
>> +	nhi_check_iommu(nhi);
>>   
>>   	res = nhi_init_msi(nhi);
>>   	if (res) {
>> diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
>> index 124e13cb1469..7a8ad984e651 100644
>> --- a/include/linux/thunderbolt.h
>> +++ b/include/linux/thunderbolt.h
>> @@ -465,6 +465,7 @@ static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
>>    * @msix_ida: Used to allocate MSI-X vectors for rings
>>    * @going_away: The host controller device is about to disappear so when
>>    *		this flag is set, avoid touching the hardware anymore.
>> + * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
>>    * @interrupt_work: Work scheduled to handle ring interrupt when no
>>    *		    MSI-X is used.
>>    * @hop_count: Number of rings (end point hops) supported by NHI.
>> @@ -479,6 +480,7 @@ struct tb_nhi {
>>   	struct tb_ring **rx_rings;
>>   	struct ida msix_ida;
>>   	bool going_away;
>> +	bool iommu_dma_protection;
>>   	struct work_struct interrupt_work;
>>   	u32 hop_count;
>>   	unsigned long quirks;
>> -- 
>> 2.28.0.dirty
Robin Murphy March 18, 2022, 12:01 p.m. UTC | #6
On 2022-03-18 11:38, mika.westerberg@linux.intel.com wrote:
> Hi Mario,
> 
> On Thu, Mar 17, 2022 at 08:36:13PM +0000, Limonciello, Mario wrote:
>> Here is a proposal on top of what you did for this.
>> The idea being check the ports right when the links are made if they exist
>> (all the new USB4 stuff) and then check all siblings on TBT3 stuff.
>>
>> diff --git a/drivers/thunderbolt/acpi.c b/drivers/thunderbolt/acpi.c
>> index 79b5abf9d042..89432456dbea 100644
>> --- a/drivers/thunderbolt/acpi.c
>> +++ b/drivers/thunderbolt/acpi.c
>> @@ -14,6 +14,7 @@
>>   static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
>>                                      void **return_value)
>>   {
>> +       enum nhi_iommu_status iommu_status = IOMMU_UNKNOWN;
>>          struct fwnode_reference_args args;
>>          struct fwnode_handle *fwnode;
>>          struct tb_nhi *nhi = data;
>> @@ -91,6 +92,8 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
>>                  if (link) {
>>                          dev_dbg(&nhi->pdev->dev, "created link from %s\n",
>>                                  dev_name(&pdev->dev));
>> +                       if (iommu_status != IOMMU_DISABLED)
>> +                               iommu_status = nhi_check_iommu_for_port(pdev);
>>                  } else {
>>                          dev_warn(&nhi->pdev->dev, "device link creation from %s failed\n",
>>                                   dev_name(&pdev->dev));
>> @@ -101,6 +104,7 @@ static acpi_status tb_acpi_add_link(acpi_handle handle, u32 level, void *data,
>>
>>   out_put:
>>          fwnode_handle_put(args.fwnode);
>> +       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
>>          return AE_OK;
>>   }
>>
>> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
>> index e12c2e266741..b5eb0cab392f 100644
>> --- a/drivers/thunderbolt/nhi.c
>> +++ b/drivers/thunderbolt/nhi.c
>> @@ -1103,10 +1103,30 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>>                  nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>>   }
>>
>> +enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev)
>> +{
>> +       if (!pci_is_pcie(pdev) ||
>> +           !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
>> +            pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM)) {
>> +               return IOMMU_UNKNOWN;
>> +       }
>> +
>> +       if (!device_iommu_mapped(&pdev->dev)) {
>> +               return IOMMU_DISABLED;
>> +       }
>> +
>> +       if (!pdev->untrusted) {
>> +               dev_info(&pdev->dev,
>> +                       "Assuming unreliable Kernel DMA protection\n");
>> +               return IOMMU_DISABLED;
>> +       }
>> +       return IOMMU_ENABLED;
>> +}
>> +
>>   static void nhi_check_iommu(struct tb_nhi *nhi)
>>   {
>> -       struct pci_dev *pdev;
>> -       bool port_ok = false;
>> +       enum nhi_iommu_status iommu_status = nhi->iommu_dma_protection ?
>> +                                       IOMMU_ENABLED : IOMMU_UNKNOWN;
>>
>>          /*
>>           * Check for sibling devices that look like they should be our
>> @@ -1117,23 +1137,13 @@ static void nhi_check_iommu(struct tb_nhi *nhi)
>>           * otherwise even if translation is enabled for existing devices it
>>           * may potentially be overridden for a future tunnelled endpoint.
>>           */
>> -       for_each_pci_bridge(pdev, nhi->pdev->bus) {
>> -               if (!pci_is_pcie(pdev) ||
>> -                   !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
>> -                     pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
>> -                       continue;
>> -
>> -               if (!device_iommu_mapped(&pdev->dev))
>> -                       return;
>> -
>> -               if (!pdev->untrusted) {
>> -                       dev_info(&nhi->pdev->dev,
>> -                                "Assuming unreliable Kernel DMA protection\n");
>> -                       return;
>> -               }
>> -               port_ok = true;
>> +       if (iommu_status == IOMMU_UNKNOWN) {
>> +               struct pci_dev *pdev;
>> +               for_each_pci_bridge(pdev, nhi->pdev->bus)
>> +                       if (iommu_status != IOMMU_DISABLED)
>> +                               iommu_status = nhi_check_iommu_for_port(pdev);
>>          }
>> -       nhi->iommu_dma_protection = port_ok;
>> +       nhi->iommu_dma_protection = (iommu_status == IOMMU_ENABLED);
>>   }
>>
>>   static int nhi_init_msi(struct tb_nhi *nhi)
>>
>> diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
>> index 69083aab2736..1622d49b1763 100644
>> --- a/drivers/thunderbolt/nhi.h
>> +++ b/drivers/thunderbolt/nhi.h
>> @@ -11,6 +11,13 @@
>>
>>   #include <linux/thunderbolt.h>
>>
>> +enum nhi_iommu_status {
>> +       IOMMU_UNKNOWN,
>> +       IOMMU_DISABLED,
>> +       IOMMU_ENABLED,
>> +};
>> +enum nhi_iommu_status nhi_check_iommu_for_port(struct pci_dev *pdev);
>> +
> 
> This adds quite a lot code and complexity, and honestly I would like to
> keep it as simple as possible (and this is not enough because we need to
> make sure the DMAR bit is there so that none of the possible connected
> devices were able to overwrite our memory already).

Shall we forget the standalone sibling check and just make the 
pdev->untrusted check directly in tb_acpi_add_link() then? On reflection 
I guess the DMAR bit makes iommu_dma_protection functionally dependent 
on ACPI already, so we don't actually lose anything (and anyone can come 
back and revisit firmware-agnostic methods later if a need appears).

Robin.
Mika Westerberg March 18, 2022, 1:25 p.m. UTC | #7
Hi Robin,

On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> > This adds quite a lot code and complexity, and honestly I would like to
> > keep it as simple as possible (and this is not enough because we need to
> > make sure the DMAR bit is there so that none of the possible connected
> > devices were able to overwrite our memory already).
> 
> Shall we forget the standalone sibling check and just make the
> pdev->untrusted check directly in tb_acpi_add_link() then?

I think we should leave tb_acpi_add_link() untouched if possible ;-)
This is because it is used to add the device links from firmware
description that we need for proper power management of the tunneled
devices. It has little to do with the identification of the external
facing DMA-capable PCIe ports.

Furthermore these links only exists in USB4 software connection manager
systems so we do not have those in the existing Thunderbolt 3/4 systems
that use firmware based connection manager (pretty much all out there).

> On reflection I guess the DMAR bit makes iommu_dma_protection
> functionally dependent on ACPI already, so we don't actually lose
> anything (and anyone can come back and revisit firmware-agnostic
> methods later if a need appears).

I agree.
Robin Murphy March 18, 2022, 2:08 p.m. UTC | #8
On 2022-03-18 13:25, mika.westerberg@linux.intel.com wrote:
> Hi Robin,
> 
> On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
>>> This adds quite a lot code and complexity, and honestly I would like to
>>> keep it as simple as possible (and this is not enough because we need to
>>> make sure the DMAR bit is there so that none of the possible connected
>>> devices were able to overwrite our memory already).
>>
>> Shall we forget the standalone sibling check and just make the
>> pdev->untrusted check directly in tb_acpi_add_link() then?
> 
> I think we should leave tb_acpi_add_link() untouched if possible ;-)
> This is because it is used to add the device links from firmware
> description that we need for proper power management of the tunneled
> devices. It has little to do with the identification of the external
> facing DMA-capable PCIe ports.
> 
> Furthermore these links only exists in USB4 software connection manager
> systems so we do not have those in the existing Thunderbolt 3/4 systems
> that use firmware based connection manager (pretty much all out there).
> 
>> On reflection I guess the DMAR bit makes iommu_dma_protection
>> functionally dependent on ACPI already, so we don't actually lose
>> anything (and anyone can come back and revisit firmware-agnostic
>> methods later if a need appears).
> 
> I agree.

OK, so do we have any realistic options for identifying the correct PCI 
devices, if USB4 PCIe adapters might be anywhere relative to their 
associated NHI? Short of maintaining a list of known IDs, the only 
thought I have left is that if we walk the whole PCI segment looking 
specifically for hotplug-capable Gen1 ports, any system modern enough to 
have Thunderbolt is *probably* not going to have any real PCIe Gen1 
hotplug slots, so maybe false negatives might be tolerable, but it still 
feels like a bit of a sketchy heuristic.

I suppose we could just look to see if any device anywhere is marked as 
external-facing, and hope that if firmware's done that much then it's 
done everything right. That's still at least slightly better than what 
we have today, but AFAICS still carries significant risk of a false 
positive for an add-in card that firmware didn't recognise.

I'm satisfied that we've come round to the right conclusion on the DMAR 
opt-in - I'm in the middle or writing up patches for that now - but even 
Microsoft's spec gives that as a separate requirement from the flagging 
of external ports, with both being necessary for Kernel DMA Protection.

Robin.
Mario Limonciello March 18, 2022, 2:22 p.m. UTC | #9
[Public]

> -----Original Message-----
> From: Robin Murphy <robin.murphy@arm.com>
> Sent: Friday, March 18, 2022 09:08
> To: mika.westerberg@linux.intel.com
> Cc: Limonciello, Mario <Mario.Limonciello@amd.com>;
> andreas.noever@gmail.com; michael.jamet@intel.com;
> YehezkelShB@gmail.com; linux-usb@vger.kernel.org; linux-
> kernel@vger.kernel.org; iommu@lists.linux-foundation.org; linux-
> pci@vger.kernel.org
> Subject: Re: [PATCH] thunderbolt: Make iommu_dma_protection more
> accurate
> 
> On 2022-03-18 13:25, mika.westerberg@linux.intel.com wrote:
> > Hi Robin,
> >
> > On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> >>> This adds quite a lot code and complexity, and honestly I would like to
> >>> keep it as simple as possible (and this is not enough because we need to
> >>> make sure the DMAR bit is there so that none of the possible connected
> >>> devices were able to overwrite our memory already).
> >>
> >> Shall we forget the standalone sibling check and just make the
> >> pdev->untrusted check directly in tb_acpi_add_link() then?
> >
> > I think we should leave tb_acpi_add_link() untouched if possible ;-)
> > This is because it is used to add the device links from firmware
> > description that we need for proper power management of the tunneled
> > devices. It has little to do with the identification of the external
> > facing DMA-capable PCIe ports.
> >
> > Furthermore these links only exists in USB4 software connection manager
> > systems so we do not have those in the existing Thunderbolt 3/4 systems
> > that use firmware based connection manager (pretty much all out there).
> >
> >> On reflection I guess the DMAR bit makes iommu_dma_protection
> >> functionally dependent on ACPI already, so we don't actually lose
> >> anything (and anyone can come back and revisit firmware-agnostic
> >> methods later if a need appears).
> >
> > I agree.
> 
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only
> thought I have left is that if we walk the whole PCI segment looking
> specifically for hotplug-capable Gen1 ports, any system modern enough to
> have Thunderbolt is *probably* not going to have any real PCIe Gen1
> hotplug slots, so maybe false negatives might be tolerable, but it still
> feels like a bit of a sketchy heuristic.
> 
> I suppose we could just look to see if any device anywhere is marked as
> external-facing, and hope that if firmware's done that much then it's
> done everything right. That's still at least slightly better than what
> we have today, but AFAICS still carries significant risk of a false
> positive for an add-in card that firmware didn't recognise.
> 
> I'm satisfied that we've come round to the right conclusion on the DMAR
> opt-in - I'm in the middle or writing up patches for that now - but even
> Microsoft's spec gives that as a separate requirement from the flagging
> of external ports, with both being necessary for Kernel DMA Protection.
> 
> Robin.

The thunderbolt driver already has a good idea whether it's using software
CM or firmware CM.  How about if we use that to make a decision?

If running firmware CM presumably that is a fixed quantity of machines that will
dwindle over time as OEMs release new HW with SW CM designs. So maybe
leave things "as is" for those - opt in bit is sufficient for this check.

And then if running software CM then make it mandatory that any created links
are set untrusted AND some pre-boot bits are set to get iommu dma protection
set.  We effectively end up with the same requirements as Microsoft has in
their spec for new hardware then, and don't break any old hardware that
doesn't have the links made and relies on firmware for the CM.
Mika Westerberg March 18, 2022, 2:47 p.m. UTC | #10
On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> On 2022-03-18 13:25, mika.westerberg@linux.intel.com wrote:
> > Hi Robin,
> > 
> > On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> > > > This adds quite a lot code and complexity, and honestly I would like to
> > > > keep it as simple as possible (and this is not enough because we need to
> > > > make sure the DMAR bit is there so that none of the possible connected
> > > > devices were able to overwrite our memory already).
> > > 
> > > Shall we forget the standalone sibling check and just make the
> > > pdev->untrusted check directly in tb_acpi_add_link() then?
> > 
> > I think we should leave tb_acpi_add_link() untouched if possible ;-)
> > This is because it is used to add the device links from firmware
> > description that we need for proper power management of the tunneled
> > devices. It has little to do with the identification of the external
> > facing DMA-capable PCIe ports.
> > 
> > Furthermore these links only exists in USB4 software connection manager
> > systems so we do not have those in the existing Thunderbolt 3/4 systems
> > that use firmware based connection manager (pretty much all out there).
> > 
> > > On reflection I guess the DMAR bit makes iommu_dma_protection
> > > functionally dependent on ACPI already, so we don't actually lose
> > > anything (and anyone can come back and revisit firmware-agnostic
> > > methods later if a need appears).
> > 
> > I agree.
> 
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only thought I
> have left is that if we walk the whole PCI segment looking specifically for
> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> false negatives might be tolerable, but it still feels like a bit of a
> sketchy heuristic.

Indeed.

> I suppose we could just look to see if any device anywhere is marked as
> external-facing, and hope that if firmware's done that much then it's done
> everything right. That's still at least slightly better than what we have
> today, but AFAICS still carries significant risk of a false positive for an
> add-in card that firmware didn't recognise.

The port in this case, that is marked as external facing, is the PCIe
root port that the add-in-card is connected to and that is known for the
firmware in advance. 

> I'm satisfied that we've come round to the right conclusion on the DMAR
> opt-in - I'm in the middle or writing up patches for that now - but even
> Microsoft's spec gives that as a separate requirement from the flagging of
> external ports, with both being necessary for Kernel DMA Protection.

Is the problem that we are here trying to solve the fact that user can
disable the IOMMU protection from the command line? Or the fact that the
firmware might not declare all the ports properly so we may end up in a
situation that some of the ports do not get the full IOMMU protection.

These are Microsoft requirements for the OEMs in order to pass their
firmware test suite so here I would not expect to have issues. Otherwise
they simply cannot ship the thing with Windows installed.

IMHO we should just trust the firmare provided information here
(otherwise we are screwed anyway as there is no way to tell if the
devices connected prior the OS can still do DMA), and use the external
facing port indicator to idenfity the ports that need DMA protection.
Lukas Wunner March 18, 2022, 2:51 p.m. UTC | #11
On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only thought I
> have left is that if we walk the whole PCI segment looking specifically for
> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> false negatives might be tolerable, but it still feels like a bit of a
> sketchy heuristic.

The Thunderbolt Device ROM contains the PCI slot number, so you can
correlate the Thunderbolt switch ports with PCIe downstream ports
and know exactly where PCIe tunnels are terminated.

Code is here:
* thunderbolt: Obtain PCI slot number from DROM
  https://github.com/l1k/linux/commit/756f7148bc10
* thunderbolt: Move upstream_port to struct tb
  https://github.com/l1k/linux/commit/58f16e7dd431
* thunderbolt: Correlate PCI devices with Thunderbolt ports
  https://github.com/l1k/linux/commit/f53ea40a7487

I implemented that in 2018, so it won't apply cleanly to current
mainline.  But I kept forward-porting it on my private branch and
could push that to GitHub if anyone is interested.

I don't know if this will work out-of-the-box for SoC-integrated
Thunderbolt controllers.  It was developed with the discrete
controllers in mind, which was the only thing available back then.

Thanks,

Lukas
Lukas Wunner March 18, 2022, 3:10 p.m. UTC | #12
On Fri, Mar 18, 2022 at 03:51:21PM +0100, Lukas Wunner wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> > OK, so do we have any realistic options for identifying the correct PCI
> > devices, if USB4 PCIe adapters might be anywhere relative to their
> > associated NHI? Short of maintaining a list of known IDs, the only thought I
> > have left is that if we walk the whole PCI segment looking specifically for
> > hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> > *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> > false negatives might be tolerable, but it still feels like a bit of a
> > sketchy heuristic.
> 
> The Thunderbolt Device ROM contains the PCI slot number, so you can
> correlate the Thunderbolt switch ports with PCIe downstream ports
> and know exactly where PCIe tunnels are terminated.
[...]
> I implemented that in 2018, so it won't apply cleanly to current
> mainline.  But I kept forward-porting it on my private branch and
> could push that to GitHub if anyone is interested.

FWIW, here's the most recent forward-port I've done:

https://github.com/l1k/linux/commits/thunderbolt_correlate_5.13
Mika Westerberg March 18, 2022, 3:11 p.m. UTC | #13
Hi Lukas,

On Fri, Mar 18, 2022 at 03:51:21PM +0100, Lukas Wunner wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> > OK, so do we have any realistic options for identifying the correct PCI
> > devices, if USB4 PCIe adapters might be anywhere relative to their
> > associated NHI? Short of maintaining a list of known IDs, the only thought I
> > have left is that if we walk the whole PCI segment looking specifically for
> > hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> > *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> > false negatives might be tolerable, but it still feels like a bit of a
> > sketchy heuristic.
> 
> The Thunderbolt Device ROM contains the PCI slot number, so you can
> correlate the Thunderbolt switch ports with PCIe downstream ports
> and know exactly where PCIe tunnels are terminated.
> 
> Code is here:
> * thunderbolt: Obtain PCI slot number from DROM
>   https://github.com/l1k/linux/commit/756f7148bc10
> * thunderbolt: Move upstream_port to struct tb
>   https://github.com/l1k/linux/commit/58f16e7dd431
> * thunderbolt: Correlate PCI devices with Thunderbolt ports
>   https://github.com/l1k/linux/commit/f53ea40a7487
> 
> I implemented that in 2018, so it won't apply cleanly to current
> mainline.  But I kept forward-porting it on my private branch and
> could push that to GitHub if anyone is interested.
> 
> I don't know if this will work out-of-the-box for SoC-integrated
> Thunderbolt controllers.  It was developed with the discrete
> controllers in mind, which was the only thing available back then.

That DROM entry is completely optional and so is the whole DROM for the
host routers (this is the root of the USB4/TBT topology) so
unfortunately we cannot use it here.
Robin Murphy March 18, 2022, 3:15 p.m. UTC | #14
On 2022-03-18 14:47, mika.westerberg@linux.intel.com wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
>> On 2022-03-18 13:25, mika.westerberg@linux.intel.com wrote:
>>> Hi Robin,
>>>
>>> On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
>>>>> This adds quite a lot code and complexity, and honestly I would like to
>>>>> keep it as simple as possible (and this is not enough because we need to
>>>>> make sure the DMAR bit is there so that none of the possible connected
>>>>> devices were able to overwrite our memory already).
>>>>
>>>> Shall we forget the standalone sibling check and just make the
>>>> pdev->untrusted check directly in tb_acpi_add_link() then?
>>>
>>> I think we should leave tb_acpi_add_link() untouched if possible ;-)
>>> This is because it is used to add the device links from firmware
>>> description that we need for proper power management of the tunneled
>>> devices. It has little to do with the identification of the external
>>> facing DMA-capable PCIe ports.
>>>
>>> Furthermore these links only exists in USB4 software connection manager
>>> systems so we do not have those in the existing Thunderbolt 3/4 systems
>>> that use firmware based connection manager (pretty much all out there).
>>>
>>>> On reflection I guess the DMAR bit makes iommu_dma_protection
>>>> functionally dependent on ACPI already, so we don't actually lose
>>>> anything (and anyone can come back and revisit firmware-agnostic
>>>> methods later if a need appears).
>>>
>>> I agree.
>>
>> OK, so do we have any realistic options for identifying the correct PCI
>> devices, if USB4 PCIe adapters might be anywhere relative to their
>> associated NHI? Short of maintaining a list of known IDs, the only thought I
>> have left is that if we walk the whole PCI segment looking specifically for
>> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
>> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
>> false negatives might be tolerable, but it still feels like a bit of a
>> sketchy heuristic.
> 
> Indeed.
> 
>> I suppose we could just look to see if any device anywhere is marked as
>> external-facing, and hope that if firmware's done that much then it's done
>> everything right. That's still at least slightly better than what we have
>> today, but AFAICS still carries significant risk of a false positive for an
>> add-in card that firmware didn't recognise.
> 
> The port in this case, that is marked as external facing, is the PCIe
> root port that the add-in-card is connected to and that is known for the
> firmware in advance.
> 
>> I'm satisfied that we've come round to the right conclusion on the DMAR
>> opt-in - I'm in the middle or writing up patches for that now - but even
>> Microsoft's spec gives that as a separate requirement from the flagging of
>> external ports, with both being necessary for Kernel DMA Protection.
> 
> Is the problem that we are here trying to solve the fact that user can
> disable the IOMMU protection from the command line? Or the fact that the
> firmware might not declare all the ports properly so we may end up in a
> situation that some of the ports do not get the full IOMMU protection.

It's about knowing whether or not firmware has declared the ports at 
all. If it hasn't then the system is vulnerable to *some* degree of DMA 
attacks regardless of anything else (the exact degree depending on 
kernel config and user overrides). Complete mitigation is simply too 
expensive to apply by default to every device the IOMMU layer is unsure 
about. The Thunderbolt driver cannot be confident that protection is in 
place unless it can somehow know that the IOMMU layer has seen that 
untrusted property on the relevant ports.

> These are Microsoft requirements for the OEMs in order to pass their
> firmware test suite so here I would not expect to have issues. Otherwise
> they simply cannot ship the thing with Windows installed.
> 
> IMHO we should just trust the firmare provided information here
> (otherwise we are screwed anyway as there is no way to tell if the
> devices connected prior the OS can still do DMA), and use the external
> facing port indicator to idenfity the ports that need DMA protection.

Indeed that's exactly what I want to do, but it begs the question of how 
we *find* the firmware-provided information in the first place!

I seem to have already started writing the dumb version that will walk 
the whole PCI segment and assume the presence of any external-facing 
port implies that we're good. Let me know if I should stop ;)

Cheers,
Robin.
Mika Westerberg March 18, 2022, 3:23 p.m. UTC | #15
Hi Robin,

On Fri, Mar 18, 2022 at 03:15:19PM +0000, Robin Murphy wrote:
> > IMHO we should just trust the firmare provided information here
> > (otherwise we are screwed anyway as there is no way to tell if the
> > devices connected prior the OS can still do DMA), and use the external
> > facing port indicator to idenfity the ports that need DMA protection.
> 
> Indeed that's exactly what I want to do, but it begs the question of how we
> *find* the firmware-provided information in the first place!

Oh, right :) Its the combination of ACPI _DSD "ExternalFacingPort"
(which we already set, dev->external_facing, dev->untrusted for the
devices behind these ports IIRC) and the DMAR opt-in bit. All these are
already read by the kernel.

> I seem to have already started writing the dumb version that will walk the
> whole PCI segment and assume the presence of any external-facing port
> implies that we're good. Let me know if I should stop ;)

That sounds good to me, so don't stop just yet ;-)
diff mbox series

Patch

diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
index 7018d959f775..d5c825e84ac8 100644
--- a/drivers/thunderbolt/domain.c
+++ b/drivers/thunderbolt/domain.c
@@ -7,9 +7,7 @@ 
  */
 
 #include <linux/device.h>
-#include <linux/dmar.h>
 #include <linux/idr.h>
-#include <linux/iommu.h>
 #include <linux/module.h>
 #include <linux/pm_runtime.h>
 #include <linux/slab.h>
@@ -257,13 +255,9 @@  static ssize_t iommu_dma_protection_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
 {
-	/*
-	 * Kernel DMA protection is a feature where Thunderbolt security is
-	 * handled natively using IOMMU. It is enabled when IOMMU is
-	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
-	 */
-	return sprintf(buf, "%d\n",
-		       iommu_present(&pci_bus_type) && dmar_platform_optin());
+	struct tb *tb = container_of(dev, struct tb, dev);
+
+	return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
 }
 static DEVICE_ATTR_RO(iommu_dma_protection);
 
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index c73da0532be4..e12c2e266741 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -14,6 +14,7 @@ 
 #include <linux/errno.h>
 #include <linux/pci.h>
 #include <linux/interrupt.h>
+#include <linux/iommu.h>
 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/property.h>
@@ -1102,6 +1103,39 @@  static void nhi_check_quirks(struct tb_nhi *nhi)
 		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
 }
 
+static void nhi_check_iommu(struct tb_nhi *nhi)
+{
+	struct pci_dev *pdev;
+	bool port_ok = false;
+
+	/*
+	 * Check for sibling devices that look like they should be our
+	 * tunnelled ports. We can reasonably assume that if an IOMMU is
+	 * managing the bridge it will manage any future devices beyond it
+	 * too. If firmware has described a port as external-facing as
+	 * expected then we can trust the IOMMU layer to enforce isolation;
+	 * otherwise even if translation is enabled for existing devices it
+	 * may potentially be overridden for a future tunnelled endpoint.
+	 */
+	for_each_pci_bridge(pdev, nhi->pdev->bus) {
+		if (!pci_is_pcie(pdev) ||
+		    !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
+		      pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
+			continue;
+
+		if (!device_iommu_mapped(&pdev->dev))
+			return;
+
+		if (!pdev->untrusted) {
+			dev_info(&nhi->pdev->dev,
+				 "Assuming unreliable Kernel DMA protection\n");
+			return;
+		}
+		port_ok = true;
+	}
+	nhi->iommu_dma_protection = port_ok;
+}
+
 static int nhi_init_msi(struct tb_nhi *nhi)
 {
 	struct pci_dev *pdev = nhi->pdev;
@@ -1219,6 +1253,7 @@  static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return -ENOMEM;
 
 	nhi_check_quirks(nhi);
+	nhi_check_iommu(nhi);
 
 	res = nhi_init_msi(nhi);
 	if (res) {
diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
index 124e13cb1469..7a8ad984e651 100644
--- a/include/linux/thunderbolt.h
+++ b/include/linux/thunderbolt.h
@@ -465,6 +465,7 @@  static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
  * @msix_ida: Used to allocate MSI-X vectors for rings
  * @going_away: The host controller device is about to disappear so when
  *		this flag is set, avoid touching the hardware anymore.
+ * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
  * @interrupt_work: Work scheduled to handle ring interrupt when no
  *		    MSI-X is used.
  * @hop_count: Number of rings (end point hops) supported by NHI.
@@ -479,6 +480,7 @@  struct tb_nhi {
 	struct tb_ring **rx_rings;
 	struct ida msix_ida;
 	bool going_away;
+	bool iommu_dma_protection;
 	struct work_struct interrupt_work;
 	u32 hop_count;
 	unsigned long quirks;