Message ID | 1624271242-111890-5-git-send-email-liudongdong3@huawei.com |
---|---|
State | New |
Headers | show |
Series | PCI: Enable 10-Bit tag support for PCIe devices | expand |
[+cc Logan] On Mon, Jun 21, 2021 at 06:27:20PM +0800, Dongdong Liu wrote: > 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag > field size from 8 bits to 10 bits. > > For platforms where the RC supports 10-Bit Tag Completer capability, > it is highly recommended for platform firmware or operating software Recommended by whom? If the spec recommends it, we should provide the citation. > that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable > bit automatically in Endpoints with 10-Bit Tag Requester capability. This > enables the important class of 10-Bit Tag capable adapters that send > Memory Read Requests only to host memory. What is the implication for P2PDMA? What happens if we enable 10-bit tags for device A, and A generates Mem Read Requests to device B, which does not support 10-bit tags? > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > --- > drivers/pci/probe.c | 33 +++++++++++++++++++++++++++++++++ > include/linux/pci.h | 2 ++ > 2 files changed, 35 insertions(+) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 0208865..33241fb 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2048,6 +2048,38 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign) > return 0; > } > > +static void pci_configure_10bit_tags(struct pci_dev *dev) > +{ > + struct pci_dev *bridge; > + > + if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) > + return; > + > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { > + dev->ext_10bit_tag = 1; > + return; > + } > + > + bridge = pci_upstream_bridge(dev); > + if (bridge && bridge->ext_10bit_tag) > + dev->ext_10bit_tag = 1; > + > + /* > + * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP > + * for VF. > + */ > + if (dev->is_virtfn) > + return; > + > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT && > + dev->ext_10bit_tag == 1 && > + (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) { > + pci_dbg(dev, "enabling 10-Bit Tag Requester\n"); > + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, > + PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN); > + } > +} > + > /** > * pcie_relaxed_ordering_enabled - Probe for PCIe relaxed ordering enable > * @dev: PCI device to query > @@ -2184,6 +2216,7 @@ static void pci_configure_device(struct pci_dev *dev) > { > pci_configure_mps(dev); > pci_configure_extended_tags(dev, NULL); > + pci_configure_10bit_tags(dev); I think 10-bit tag support should be integrated with extended (8-bit) tag support instead of having two separate functions. If we have "no_ext_tags" set because some device doesn't support 8-bit tags correctly, we probably shouldn't try to enable 10-bit tags either. > pci_configure_relaxed_ordering(dev); > pci_configure_ltr(dev); > pci_configure_eetlp_prefix(dev); > diff --git a/include/linux/pci.h b/include/linux/pci.h > index de1fc24..445d102 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -393,6 +393,8 @@ struct pci_dev { > #endif > unsigned int eetlp_prefix_path:1; /* End-to-End TLP Prefix */ > > + unsigned int ext_10bit_tag:1; /* 10-Bit Tag Completer Supported > + from root to here */ > pci_channel_state_t error_state; /* Current connectivity state */ > struct device dev; /* Generic device interface */ > > -- > 2.7.4 >
Hi Bjorn Many thanks for your review. On 2021/7/16 1:23, Bjorn Helgaas wrote: > [+cc Logan] > > On Mon, Jun 21, 2021 at 06:27:20PM +0800, Dongdong Liu wrote: >> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag >> field size from 8 bits to 10 bits. >> >> For platforms where the RC supports 10-Bit Tag Completer capability, >> it is highly recommended for platform firmware or operating software > > Recommended by whom? If the spec recommends it, we should provide the > citation. PCIe spec 5.0 r1.0 section 2.2.6.2 IMPLEMENTATION NOTE says that. Will fix. > >> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable >> bit automatically in Endpoints with 10-Bit Tag Requester capability. This >> enables the important class of 10-Bit Tag capable adapters that send >> Memory Read Requests only to host memory. > > What is the implication for P2PDMA? What happens if we enable 10-bit > tags for device A, and A generates Mem Read Requests to device B, > which does not support 10-bit tags? PCIe spec 5.0 r1.0 section 2.2.6.2 says If an Endpoint supports sending Requests to other Endpoints (as opposed to host memory), the Endpoint must not send 10-Bit Tag Requests to another given Endpoint unless an implementation-specific mechanism determines that the Endpoint supports 10-Bit Tag Completer capability. Not sending 10-Bit Tag Requests to other Endpoints at all may be acceptable for some implementations. More sophisticated mechanisms are outside the scope of this specification. Not sending 10-Bit Tag Requests to other Endpoints at all seems simple. Add kernel parameter pci=pcie_bus_peer2peer when boot kernel with P2PDMA, then do not config 10-BIT Tag. if (pcie_bus_config != PCIE_BUS_PEER2PEER) pci_configure_10bit_tags(dev); Bjorn and Logan, any suggestion? > >> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> >> Reviewed-by: Christoph Hellwig <hch@lst.de> >> --- >> drivers/pci/probe.c | 33 +++++++++++++++++++++++++++++++++ >> include/linux/pci.h | 2 ++ >> 2 files changed, 35 insertions(+) >> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >> index 0208865..33241fb 100644 >> --- a/drivers/pci/probe.c >> +++ b/drivers/pci/probe.c >> @@ -2048,6 +2048,38 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign) >> return 0; >> } >> >> +static void pci_configure_10bit_tags(struct pci_dev *dev) >> +{ >> + struct pci_dev *bridge; >> + >> + if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) >> + return; >> + >> + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { >> + dev->ext_10bit_tag = 1; >> + return; >> + } >> + >> + bridge = pci_upstream_bridge(dev); >> + if (bridge && bridge->ext_10bit_tag) >> + dev->ext_10bit_tag = 1; >> + >> + /* >> + * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP >> + * for VF. >> + */ >> + if (dev->is_virtfn) >> + return; >> + >> + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT && >> + dev->ext_10bit_tag == 1 && >> + (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) { >> + pci_dbg(dev, "enabling 10-Bit Tag Requester\n"); >> + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, >> + PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN); >> + } >> +} >> + >> /** >> * pcie_relaxed_ordering_enabled - Probe for PCIe relaxed ordering enable >> * @dev: PCI device to query >> @@ -2184,6 +2216,7 @@ static void pci_configure_device(struct pci_dev *dev) >> { >> pci_configure_mps(dev); >> pci_configure_extended_tags(dev, NULL); >> + pci_configure_10bit_tags(dev); > > I think 10-bit tag support should be integrated with extended (8-bit) > tag support instead of having two separate functions. > > If we have "no_ext_tags" set because some device doesn't support 8-bit > tags correctly, we probably shouldn't try to enable 10-bit tags > either. Looks good, will fix. Thanks Dongdong > >> pci_configure_relaxed_ordering(dev); >> pci_configure_ltr(dev); >> pci_configure_eetlp_prefix(dev); >> diff --git a/include/linux/pci.h b/include/linux/pci.h >> index de1fc24..445d102 100644 >> --- a/include/linux/pci.h >> +++ b/include/linux/pci.h >> @@ -393,6 +393,8 @@ struct pci_dev { >> #endif >> unsigned int eetlp_prefix_path:1; /* End-to-End TLP Prefix */ >> >> + unsigned int ext_10bit_tag:1; /* 10-Bit Tag Completer Supported >> + from root to here */ >> pci_channel_state_t error_state; /* Current connectivity state */ >> struct device dev; /* Generic device interface */ >> >> -- >> 2.7.4 >> > . >
On 2021-07-16 5:12 a.m., Dongdong Liu wrote: > Hi Bjorn > > Many thanks for your review. > > On 2021/7/16 1:23, Bjorn Helgaas wrote: >> [+cc Logan] >> >> On Mon, Jun 21, 2021 at 06:27:20PM +0800, Dongdong Liu wrote: >>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag >>> field size from 8 bits to 10 bits. >>> >>> For platforms where the RC supports 10-Bit Tag Completer capability, >>> it is highly recommended for platform firmware or operating software >> >> Recommended by whom? If the spec recommends it, we should provide the >> citation. > PCIe spec 5.0 r1.0 section 2.2.6.2 IMPLEMENTATION NOTE says that. > Will fix. >> >>> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable >>> bit automatically in Endpoints with 10-Bit Tag Requester capability. This >>> enables the important class of 10-Bit Tag capable adapters that send >>> Memory Read Requests only to host memory. >> >> What is the implication for P2PDMA? What happens if we enable 10-bit >> tags for device A, and A generates Mem Read Requests to device B, >> which does not support 10-bit tags? > PCIe spec 5.0 r1.0 section 2.2.6.2 says > If an Endpoint supports sending Requests to other Endpoints (as opposed > to host memory), the Endpoint must not send 10-Bit Tag Requests to > another given Endpoint unless an implementation-specific mechanism > determines that the Endpoint supports 10-Bit Tag Completer capability. > Not sending 10-Bit Tag Requests to other Endpoints at all > may be acceptable for some implementations. More sophisticated > mechanisms are outside the scope of this specification. > > Not sending 10-Bit Tag Requests to other Endpoints at all seems simple. > Add kernel parameter pci=pcie_bus_peer2peer when boot kernel with > P2PDMA, then do not config 10-BIT Tag. > > if (pcie_bus_config != PCIE_BUS_PEER2PEER) > pci_configure_10bit_tags(dev); > > Bjorn and Logan, any suggestion? I think we need a check in the P2PDMA code to ensure that a device with 10bit tags doesn't interact with a device that has no 10bit tags. Before that happens, the kernel should emit a warning saying to enable a specific kernel parameter. Though a parameter with a bit more granularity might be appropriate. See what was done for disable_acs_redir where it affects only the devices specified in the list. Thanks, Logan
On 2021/7/16 23:51, Logan Gunthorpe wrote: > > > On 2021-07-16 5:12 a.m., Dongdong Liu wrote: >> Hi Bjorn >> >> Many thanks for your review. >> >> On 2021/7/16 1:23, Bjorn Helgaas wrote: >>> [+cc Logan] >>> >>> On Mon, Jun 21, 2021 at 06:27:20PM +0800, Dongdong Liu wrote: >>>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag >>>> field size from 8 bits to 10 bits. >>>> >>>> For platforms where the RC supports 10-Bit Tag Completer capability, >>>> it is highly recommended for platform firmware or operating software >>> >>> Recommended by whom? If the spec recommends it, we should provide the >>> citation. >> PCIe spec 5.0 r1.0 section 2.2.6.2 IMPLEMENTATION NOTE says that. >> Will fix. >>> >>>> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable >>>> bit automatically in Endpoints with 10-Bit Tag Requester capability. This >>>> enables the important class of 10-Bit Tag capable adapters that send >>>> Memory Read Requests only to host memory. >>> >>> What is the implication for P2PDMA? What happens if we enable 10-bit >>> tags for device A, and A generates Mem Read Requests to device B, >>> which does not support 10-bit tags? >> PCIe spec 5.0 r1.0 section 2.2.6.2 says >> If an Endpoint supports sending Requests to other Endpoints (as opposed >> to host memory), the Endpoint must not send 10-Bit Tag Requests to >> another given Endpoint unless an implementation-specific mechanism >> determines that the Endpoint supports 10-Bit Tag Completer capability. >> Not sending 10-Bit Tag Requests to other Endpoints at all >> may be acceptable for some implementations. More sophisticated >> mechanisms are outside the scope of this specification. >> >> Not sending 10-Bit Tag Requests to other Endpoints at all seems simple. >> Add kernel parameter pci=pcie_bus_peer2peer when boot kernel with >> P2PDMA, then do not config 10-BIT Tag. >> >> if (pcie_bus_config != PCIE_BUS_PEER2PEER) >> pci_configure_10bit_tags(dev); >> >> Bjorn and Logan, any suggestion? > > I think we need a check in the P2PDMA code to ensure that a device with > 10bit tags doesn't interact with a device that has no 10bit tags. Before > that happens, the kernel should emit a warning saying to enable a > specific kernel parameter. Seems reasonable. > > Though a parameter with a bit more granularity might be appropriate. See > what was done for disable_acs_redir where it affects only the devices > specified in the list. Many Thanks for your suggestion. I will investigate more about this. It seems P2PDMA also does not consider MPS safe issue if not use "pci=pcie_bus_peer2peer". Thanks, Dongdong > > Thanks, > > Logan > . >
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 0208865..33241fb 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2048,6 +2048,38 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign) return 0; } +static void pci_configure_10bit_tags(struct pci_dev *dev) +{ + struct pci_dev *bridge; + + if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) + return; + + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { + dev->ext_10bit_tag = 1; + return; + } + + bridge = pci_upstream_bridge(dev); + if (bridge && bridge->ext_10bit_tag) + dev->ext_10bit_tag = 1; + + /* + * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP + * for VF. + */ + if (dev->is_virtfn) + return; + + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT && + dev->ext_10bit_tag == 1 && + (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) { + pci_dbg(dev, "enabling 10-Bit Tag Requester\n"); + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN); + } +} + /** * pcie_relaxed_ordering_enabled - Probe for PCIe relaxed ordering enable * @dev: PCI device to query @@ -2184,6 +2216,7 @@ static void pci_configure_device(struct pci_dev *dev) { pci_configure_mps(dev); pci_configure_extended_tags(dev, NULL); + pci_configure_10bit_tags(dev); pci_configure_relaxed_ordering(dev); pci_configure_ltr(dev); pci_configure_eetlp_prefix(dev); diff --git a/include/linux/pci.h b/include/linux/pci.h index de1fc24..445d102 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -393,6 +393,8 @@ struct pci_dev { #endif unsigned int eetlp_prefix_path:1; /* End-to-End TLP Prefix */ + unsigned int ext_10bit_tag:1; /* 10-Bit Tag Completer Supported + from root to here */ pci_channel_state_t error_state; /* Current connectivity state */ struct device dev; /* Generic device interface */