Message ID | 20201007164851.1218-1-david.e.box@linux.intel.com |
---|---|
State | New |
Series | PCI: Disable PTM during suspend on Intel PCI bridges |
On Wed, Oct 7, 2020 at 6:49 PM David E. Box <david.e.box@linux.intel.com> wrote:
>
> On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> Time Measurement (PTM) capability can prevent PCIe root ports from power
> gating during suspend-to-idle, causing increased power consumption on
> systems that suspend using Low Power S0 Idle [1]. The issue is yet to be
> root caused but believed to be coming from a race condition in the suspend
> flow as the incidence rate varies for different platforms on Linux but the
> issue does not occur at all in other operating systems. For now, disable
> the feature on suspend on all Intel root ports and enable again on resume.

IMV it should also be noted that there is no particular reason why PTM
would need to be enabled while the whole system is suspended. At
least it doesn't seem to be particularly useful in that state.

> Link: https://www.uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=209361
> Tested-by: Len Brown <len.brown@intel.com>
> Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> ---
>  drivers/pci/quirks.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index bdf9b52567e0..e82b1f60c7a1 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5632,3 +5632,60 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +#ifdef CONFIG_PCIE_PTM
> +/*
> + * On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> + * Time Measurement (PTM) capability can prevent the PCIe root port from
> + * power gating during suspend-to-idle, causing increased power consumption.
> + * So disable the feature on suspend on all Intel root ports and enable
> + * again on resume.
> + */
> +static void quirk_intel_ptm_disable_suspend(struct pci_dev *dev)
> +{
> +	int pos;
> +	u32 ctrl;
> +
> +	if (!(dev->ptm_enabled && dev->ptm_root))
> +		return;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> +	if (!pos)
> +		return;
> +
> +	pci_dbg(dev, "quirk: disabling PTM\n");
> +
> +	dev->ptm_enabled = 0;
> +	dev->ptm_root = 0;
> +
> +	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
> +	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> +	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
> +}
> +
> +static void quirk_intel_ptm_enable_resume(struct pci_dev *dev)
> +{
> +	int pos;
> +	u32 ctrl;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> +	if (!pos)
> +		return;
> +
> +	pci_dbg(dev, "quirk: re-enabling PTM\n");
> +
> +	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
> +	ctrl |= PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT;
> +	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
> +
> +	dev->ptm_enabled = 1;
> +	dev->ptm_root = 1;
> +}
> +
> +DECLARE_PCI_FIXUP_CLASS_SUSPEND(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> +				PCI_CLASS_BRIDGE_PCI, 8,
> +				quirk_intel_ptm_disable_suspend)
> +DECLARE_PCI_FIXUP_CLASS_RESUME(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> +			       PCI_CLASS_BRIDGE_PCI, 8,
> +			       quirk_intel_ptm_enable_resume)
> +#endif
> --
> 2.20.1
>
On Wed, Oct 07, 2020 at 06:53:16PM +0200, Rafael J. Wysocki wrote:
> On Wed, Oct 7, 2020 at 6:49 PM David E. Box <david.e.box@linux.intel.com> wrote:
> >
> > On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> > Time Measurement (PTM) capability can prevent PCIe root ports from power
> > gating during suspend-to-idle, causing increased power consumption on
> > systems that suspend using Low Power S0 Idle [1]. The issue is yet to be
> > root caused but believed to be coming from a race condition in the suspend
> > flow as the incidence rate varies for different platforms on Linux but the
> > issue does not occur at all in other operating systems. For now, disable
> > the feature on suspend on all Intel root ports and enable again on resume.
>
> IMV it should also be noted that there is no particular reason why PTM
> would need to be enabled while the whole system is suspended. At
> least it doesn't seem to be particularly useful in that state.

Is this a hardware erratum? If not, and this is working as designed,
it sounds like we'd need to apply this quirk to every device that
supports PTM. That's not really practical.

The bugzilla says "there is no erratum as this does not affect
Windows," but that doesn't answer the question. What I want to know
is whether this is a *hardware* defect and whether it will be fixed in
future hardware.

If it's a "wont-fix" hardware issue, we can just disable PTM
completely on Intel hardware and we won't have to worry about it
during suspend.

> > Link: https://www.uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf
> > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=209361
> > Tested-by: Len Brown <len.brown@intel.com>
> > Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> > ---
> >  drivers/pci/quirks.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 57 insertions(+)
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index bdf9b52567e0..e82b1f60c7a1 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5632,3 +5632,60 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
> >  }
> >  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
> >  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> > +
> > +#ifdef CONFIG_PCIE_PTM
> > +/*
> > + * On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> > + * Time Measurement (PTM) capability can prevent the PCIe root port from
> > + * power gating during suspend-to-idle, causing increased power consumption.
> > + * So disable the feature on suspend on all Intel root ports and enable
> > + * again on resume.
> > + */
> > +static void quirk_intel_ptm_disable_suspend(struct pci_dev *dev)
> > +{
> > +	int pos;
> > +	u32 ctrl;
> > +
> > +	if (!(dev->ptm_enabled && dev->ptm_root))
> > +		return;
> > +
> > +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> > +	if (!pos)
> > +		return;
> > +
> > +	pci_dbg(dev, "quirk: disabling PTM\n");
> > +
> > +	dev->ptm_enabled = 0;
> > +	dev->ptm_root = 0;
> > +
> > +	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
> > +	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> > +	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
> > +}
> > +
> > +static void quirk_intel_ptm_enable_resume(struct pci_dev *dev)
> > +{
> > +	int pos;
> > +	u32 ctrl;
> > +
> > +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> > +	if (!pos)
> > +		return;
> > +
> > +	pci_dbg(dev, "quirk: re-enabling PTM\n");
> > +
> > +	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
> > +	ctrl |= PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT;
> > +	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
> > +
> > +	dev->ptm_enabled = 1;
> > +	dev->ptm_root = 1;
> > +}
> > +
> > +DECLARE_PCI_FIXUP_CLASS_SUSPEND(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> > +				PCI_CLASS_BRIDGE_PCI, 8,
> > +				quirk_intel_ptm_disable_suspend)
> > +DECLARE_PCI_FIXUP_CLASS_RESUME(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
> > +			       PCI_CLASS_BRIDGE_PCI, 8,
> > +			       quirk_intel_ptm_enable_resume)
> > +#endif
> > --
> > 2.20.1
> >
On Wed, Oct 7, 2020 at 7:10 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Oct 07, 2020 at 06:53:16PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Oct 7, 2020 at 6:49 PM David E. Box <david.e.box@linux.intel.com> wrote:
> > >
> > > On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> > > Time Measurement (PTM) capability can prevent PCIe root ports from power
> > > gating during suspend-to-idle, causing increased power consumption on
> > > systems that suspend using Low Power S0 Idle [1]. The issue is yet to be
> > > root caused but believed to be coming from a race condition in the suspend
> > > flow as the incidence rate varies for different platforms on Linux but the
> > > issue does not occur at all in other operating systems. For now, disable
> > > the feature on suspend on all Intel root ports and enable again on resume.
> >
> > IMV it should also be noted that there is no particular reason why PTM
> > would need to be enabled while the whole system is suspended. At
> > least it doesn't seem to be particularly useful in that state.
>
> Is this a hardware erratum? If not, and this is working as designed,
> it sounds like we'd need to apply this quirk to every device that
> supports PTM. That's not really practical.

Why not?

It looks like the capability should be saved by pci_save_state() (it
isn't ATM, which appears to be a mistake) and restored by
pci_restore_state(), so if that is implemented, the saving can be
combined with the disabling in principle.

> The bugzilla says "there is no erratum as this does not affect
> Windows," but that doesn't answer the question. What I want to know
> is whether this is a *hardware* defect and whether it will be fixed in
> future hardware.

I cannot answer this question, sorry.

ATM we only know that certain SoCs may not enter the deepest idle
state if PTM is enabled on some PCIe root ports during suspend.

Disabling PTM on those ports while suspending helps and hence the patch.

It doesn't appear to qualify as a "hardware defect".

> If it's a "wont-fix" hardware issue, we can just disable PTM
> completely on Intel hardware and we won't have to worry about it
> during suspend.

I'm not following the logic here, sorry again.

First of all, there are systems that never suspend, so why would they
be affected by the remedy (whatever it is)?

Second, it is not about the suspend failing entirely. It's about
being able to make the system draw less power while suspended.

Generally, if someone said "I can make the system draw less power
while suspended if I disable PCIe feature X during suspend", would you
disregard that?
On Mon, Nov 16, 2020 at 06:53:09PM +0100, Rafael J. Wysocki wrote:
> On Wed, Oct 7, 2020 at 7:10 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Oct 07, 2020 at 06:53:16PM +0200, Rafael J. Wysocki wrote:
> > > On Wed, Oct 7, 2020 at 6:49 PM David E. Box <david.e.box@linux.intel.com> wrote:
> > > >
> > > > On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
> > > > Time Measurement (PTM) capability can prevent PCIe root ports from power
> > > > gating during suspend-to-idle, causing increased power consumption on
> > > > systems that suspend using Low Power S0 Idle [1]. The issue is yet to be
> > > > root caused but believed to be coming from a race condition in the suspend
> > > > flow as the incidence rate varies for different platforms on Linux but the
> > > > issue does not occur at all in other operating systems. For now, disable
> > > > the feature on suspend on all Intel root ports and enable again on resume.
> > >
> > > IMV it should also be noted that there is no particular reason why PTM
> > > would need to be enabled while the whole system is suspended. At
> > > least it doesn't seem to be particularly useful in that state.
> >
> > Is this a hardware erratum? If not, and this is working as designed,
> > it sounds like we'd need to apply this quirk to every device that
> > supports PTM. That's not really practical.
>
> Why not?

My objection was that the original patch is a quirk that applies only
to Intel devices.

If this is a generic thing that should be done for *all* devices that
support PTM, that's fine, but it should not be a quirk, and it should
not involve a list of Vendor or Device IDs.

> It looks like the capability should be saved by pci_save_state() (it
> isn't ATM, which appears to be a mistake) and restored by
> pci_restore_state(), so if that is implemented, the saving can be
> combined with the disabling in principle.

Yup, looks like a mistake. Maybe David can fix that at the same time
(probably a separate patch, though). I don't have a way to test it,
but he probably does.

> > The bugzilla says "there is no erratum as this does not affect
> > Windows," but that doesn't answer the question. What I want to know
> > is whether this is a *hardware* defect and whether it will be fixed in
> > future hardware.
>
> I cannot answer this question, sorry.
>
> ATM we only know that certain SoCs may not enter the deepest idle
> state if PTM is enabled on some PCIe root ports during suspend.
>
> Disabling PTM on those ports while suspending helps and hence the patch.
>
> It doesn't appear to qualify as a "hardware defect".
>
> > If it's a "wont-fix" hardware issue, we can just disable PTM
> > completely on Intel hardware and we won't have to worry about it
> > during suspend.
>
> I'm not following the logic here, sorry again.
>
> First of all, there are systems that never suspend, so why would they
> be affected by the remedy (whatever it is)?
>
> Second, it is not about the suspend failing entirely. It's about
> being able to make the system draw less power while suspended.
>
> Generally, if someone said "I can make the system draw less power
> while suspended if I disable PCIe feature X during suspend", would you
> disregard that?

My questions were all prompted by the Intel-specific nature of the
original patch, which suggests an ongoing maintenance burden. If it
can be done generically, I have no problem with it.

Bjorn
On Mon, 2020-11-16 at 13:23 -0600, Bjorn Helgaas wrote:
> On Mon, Nov 16, 2020 at 06:53:09PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Oct 7, 2020 at 7:10 PM Bjorn Helgaas <helgaas@kernel.org>
> > wrote:
> > > On Wed, Oct 07, 2020 at 06:53:16PM +0200, Rafael J. Wysocki
> > > wrote:
> > > > On Wed, Oct 7, 2020 at 6:49 PM David E. Box <
> > > > david.e.box@linux.intel.com> wrote:
> > > > > On Intel Platform Controller Hubs (PCH) since Cannon Lake,
> > > > > the Precision
> > > > > Time Measurement (PTM) capability can prevent PCIe root ports
> > > > > from power
> > > > > gating during suspend-to-idle, causing increased power
> > > > > consumption on
> > > > > systems that suspend using Low Power S0 Idle [1]. The issue
> > > > > is yet to be
> > > > > root caused but believed to be coming from a race condition
> > > > > in the suspend
> > > > > flow as the incidence rate varies for different platforms on
> > > > > Linux but the
> > > > > issue does not occur at all in other operating systems. For
> > > > > now, disable
> > > > > the feature on suspend on all Intel root ports and enable
> > > > > again on resume.
> > > >
> > > > IMV it should also be noted that there is no particular reason
> > > > why PTM
> > > > would need to be enabled while the whole system is
> > > > suspended. At
> > > > least it doesn't seem to be particularly useful in that state.
> > >
> > > Is this a hardware erratum? If not, and this is working as
> > > designed,
> > > it sounds like we'd need to apply this quirk to every device that
> > > supports PTM. That's not really practical.
> >
> > Why not?
>
> My objection was that the original patch is a quirk that applies only
> to Intel devices.
>
> If this is a generic thing that should be done for *all* devices that
> support PTM, that's fine, but it should not be a quirk, and it should
> not involve a list of Vendor or Device IDs.
>
> > It looks like the capability should be saved by pci_save_state()
> > (it
> > isn't ATM, which appears to be a mistake) and restored by
> > pci_restore_state(), so if that is implemented, the saving can be
> > combined with the disabling in principle.
>
> Yup, looks like a mistake. Maybe David can fix that at the same time
> (probably a separate patch, though). I don't have a way to test it,
> but he probably does.

Yes, I can test save/restore of the PTM capability and submit a patch.

>
> > > The bugzilla says "there is no erratum as this does not affect
> > > Windows," but that doesn't answer the question. What I want to
> > > know
> > > is whether this is a *hardware* defect and whether it will be
> > > fixed in
> > > future hardware.
> >
> > I cannot answer this question, sorry.
> >
> > ATM we only know that certain SoCs may not enter the deepest idle
> > state if PTM is enabled on some PCIe root ports during suspend.
> >
> > Disabling PTM on those ports while suspending helps and hence the
> > patch.
> >
> > It doesn't appear to qualify as a "hardware defect".
> >
> > > If it's a "wont-fix" hardware issue, we can just disable PTM
> > > completely on Intel hardware and we won't have to worry about it
> > > during suspend.
> >
> > I'm not following the logic here, sorry again.
> >
> > First of all, there are systems that never suspend, so why would
> > they
> > be affected by the remedy (whatever it is)?
> >
> > Second, it is not about the suspend failing entirely. It's about
> > being able to make the system draw less power while suspended.
> >
> > Generally, if someone said "I can make the system draw less power
> > while suspended if I disable PCIe feature X during suspend", would
> > you
> > disregard that?
>
> My questions were all prompted by the Intel-specific nature of the
> original patch, which suggests an ongoing maintenance burden. If it
> can be done generically, I have no problem with it.

Okay. I'll add this to the save/restore patch then with the comment
that it saves power on some Intel platforms.

David
On Mon, Nov 16, 2020 at 9:58 PM David E. Box <david.e.box@linux.intel.com> wrote:
>
> On Mon, 2020-11-16 at 13:23 -0600, Bjorn Helgaas wrote:
> > On Mon, Nov 16, 2020 at 06:53:09PM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Oct 7, 2020 at 7:10 PM Bjorn Helgaas <helgaas@kernel.org>
> > > wrote:
> > > > On Wed, Oct 07, 2020 at 06:53:16PM +0200, Rafael J. Wysocki
> > > > wrote:
> > > > > On Wed, Oct 7, 2020 at 6:49 PM David E. Box <
> > > > > david.e.box@linux.intel.com> wrote:
> > > > > > On Intel Platform Controller Hubs (PCH) since Cannon Lake,
> > > > > > the Precision
> > > > > > Time Measurement (PTM) capability can prevent PCIe root ports
> > > > > > from power
> > > > > > gating during suspend-to-idle, causing increased power
> > > > > > consumption on
> > > > > > systems that suspend using Low Power S0 Idle [1]. The issue
> > > > > > is yet to be
> > > > > > root caused but believed to be coming from a race condition
> > > > > > in the suspend
> > > > > > flow as the incidence rate varies for different platforms on
> > > > > > Linux but the
> > > > > > issue does not occur at all in other operating systems. For
> > > > > > now, disable
> > > > > > the feature on suspend on all Intel root ports and enable
> > > > > > again on resume.
> > > > >
> > > > > IMV it should also be noted that there is no particular reason
> > > > > why PTM
> > > > > would need to be enabled while the whole system is
> > > > > suspended. At
> > > > > least it doesn't seem to be particularly useful in that state.
> > > >
> > > > Is this a hardware erratum? If not, and this is working as
> > > > designed,
> > > > it sounds like we'd need to apply this quirk to every device that
> > > > supports PTM. That's not really practical.
> > >
> > > Why not?
> >
> > My objection was that the original patch is a quirk that applies only
> > to Intel devices.
> >
> > If this is a generic thing that should be done for *all* devices that
> > support PTM, that's fine, but it should not be a quirk, and it should
> > not involve a list of Vendor or Device IDs.
> >
> > > It looks like the capability should be saved by pci_save_state()
> > > (it
> > > isn't ATM, which appears to be a mistake) and restored by
> > > pci_restore_state(), so if that is implemented, the saving can be
> > > combined with the disabling in principle.
> >
> > Yup, looks like a mistake. Maybe David can fix that at the same time
> > (probably a separate patch, though). I don't have a way to test it,
> > but he probably does.
>
> Yes, I can test save/restore of the PTM capability and submit a patch.
>
> >
> > > > The bugzilla says "there is no erratum as this does not affect
> > > > Windows," but that doesn't answer the question. What I want to
> > > > know
> > > > is whether this is a *hardware* defect and whether it will be
> > > > fixed in
> > > > future hardware.
> > >
> > > I cannot answer this question, sorry.
> > >
> > > ATM we only know that certain SoCs may not enter the deepest idle
> > > state if PTM is enabled on some PCIe root ports during suspend.
> > >
> > > Disabling PTM on those ports while suspending helps and hence the
> > > patch.
> > >
> > > It doesn't appear to qualify as a "hardware defect".
> > >
> > > > If it's a "wont-fix" hardware issue, we can just disable PTM
> > > > completely on Intel hardware and we won't have to worry about it
> > > > during suspend.
> > >
> > > I'm not following the logic here, sorry again.
> > >
> > > First of all, there are systems that never suspend, so why would
> > > they
> > > be affected by the remedy (whatever it is)?
> > >
> > > Second, it is not about the suspend failing entirely. It's about
> > > being able to make the system draw less power while suspended.
> > >
> > > Generally, if someone said "I can make the system draw less power
> > > while suspended if I disable PCIe feature X during suspend", would
> > > you
> > > disregard that?
> >
> > My questions were all prompted by the Intel-specific nature of the
> > original patch, which suggests an ongoing maintenance burden. If it
> > can be done generically, I have no problem with it.
>
> Okay. I'll add this to the save/restore patch then with the comment
> that it saves power on some Intel platforms.

I'd suggest doing two patches, then, one to save/restore the PTM
capability and the other to add disabling it to the "save" path (with
a comment as appropriate).
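The two-step split suggested in the thread (first save/restore the PTM control register, then disable PTM on the save path) might look roughly like the user-space sketch below. It is an illustration under stated assumptions, not the kernel implementation: `struct mock_dev`, `cfg_read32()`, `cfg_write32()` and the `ptm_*` helpers are hypothetical stand-ins for `struct pci_dev`, the config accessors and whatever hooks `pci_save_state()`/`pci_restore_state()` would call, and the `PTM_CTRL*` values are assumed to mirror `PCI_PTM_CTRL`, `PCI_PTM_CTRL_ENABLE` and `PCI_PTM_CTRL_ROOT`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror PCI_PTM_CTRL and its bits; offset is relative to the capability. */
#define PTM_CTRL        0x08
#define PTM_CTRL_ENABLE 0x00000001u
#define PTM_CTRL_ROOT   0x00000002u

/* Hypothetical stand-in for struct pci_dev: a tiny config space plus saved state. */
struct mock_dev {
	uint8_t  cfg[64];    /* mock config space */
	int      ptm_pos;    /* offset of the PTM capability, 0 if absent */
	uint32_t saved_ctrl; /* PTM control register captured at suspend */
	bool     ctrl_saved; /* true once saved_ctrl holds a valid value */
};

static uint32_t cfg_read32(const struct mock_dev *d, int off)
{
	uint32_t v;

	assert(off >= 0 && off % 4 == 0 && (size_t)off + 4 <= sizeof(d->cfg));
	memcpy(&v, &d->cfg[off], sizeof(v));
	return v;
}

static void cfg_write32(struct mock_dev *d, int off, uint32_t v)
{
	assert(off >= 0 && off % 4 == 0 && (size_t)off + 4 <= sizeof(d->cfg));
	memcpy(&d->cfg[off], &v, sizeof(v));
}

/* Step 1 (save/restore patch): remember the control register as found. */
static void ptm_save_state(struct mock_dev *d)
{
	if (!d->ptm_pos)
		return;
	d->saved_ctrl = cfg_read32(d, d->ptm_pos + PTM_CTRL);
	d->ctrl_saved = true;
}

/* Step 2 (second patch): turn PTM off for the duration of the suspend. */
static void ptm_disable(struct mock_dev *d)
{
	uint32_t ctrl;

	if (!d->ptm_pos)
		return;
	ctrl = cfg_read32(d, d->ptm_pos + PTM_CTRL);
	ctrl &= ~(PTM_CTRL_ENABLE | PTM_CTRL_ROOT);
	cfg_write32(d, d->ptm_pos + PTM_CTRL, ctrl);
}

/* Resume: write back exactly what was saved, whatever it was. */
static void ptm_restore_state(struct mock_dev *d)
{
	if (!d->ptm_pos || !d->ctrl_saved)
		return;
	cfg_write32(d, d->ptm_pos + PTM_CTRL, d->saved_ctrl);
	d->ctrl_saved = false;
}
```

Because the restore step writes back the saved value instead of setting the enable bits unconditionally, a port that never had PTM enabled stays disabled after resume, and nothing in the flow depends on a vendor or device ID.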
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index bdf9b52567e0..e82b1f60c7a1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5632,3 +5632,60 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
 			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
+
+#ifdef CONFIG_PCIE_PTM
+/*
+ * On Intel Platform Controller Hubs (PCH) since Cannon Lake, the Precision
+ * Time Measurement (PTM) capability can prevent the PCIe root port from
+ * power gating during suspend-to-idle, causing increased power consumption.
+ * So disable the feature on suspend on all Intel root ports and enable
+ * again on resume.
+ */
+static void quirk_intel_ptm_disable_suspend(struct pci_dev *dev)
+{
+	int pos;
+	u32 ctrl;
+
+	if (!(dev->ptm_enabled && dev->ptm_root))
+		return;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
+	if (!pos)
+		return;
+
+	pci_dbg(dev, "quirk: disabling PTM\n");
+
+	dev->ptm_enabled = 0;
+	dev->ptm_root = 0;
+
+	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
+	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
+	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
+}
+
+static void quirk_intel_ptm_enable_resume(struct pci_dev *dev)
+{
+	int pos;
+	u32 ctrl;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
+	if (!pos)
+		return;
+
+	pci_dbg(dev, "quirk: re-enabling PTM\n");
+
+	pci_read_config_dword(dev, pos + PCI_PTM_CTRL, &ctrl);
+	ctrl |= PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT;
+	pci_write_config_dword(dev, pos + PCI_PTM_CTRL, ctrl);
+
+	dev->ptm_enabled = 1;
+	dev->ptm_root = 1;
+}
+
+DECLARE_PCI_FIXUP_CLASS_SUSPEND(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
+				PCI_CLASS_BRIDGE_PCI, 8,
+				quirk_intel_ptm_disable_suspend)
+DECLARE_PCI_FIXUP_CLASS_RESUME(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
+			       PCI_CLASS_BRIDGE_PCI, 8,
+			       quirk_intel_ptm_enable_resume)
+#endif