Message ID | 1626202242-14984-1-git-send-email-loberman@redhat.com |
---|---|
State | New |
Headers | show |
Series | usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices | expand |
On Tue, 2021-07-13 at 15:15 -0400, Alan Stern wrote: > On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote: > > Customers have been reporting that the I/O is radically being > > slowed down to HPE virtual USB ILO served DVD images during > > installation. > > > > Lots of investigation by the Red Hat lab has found that the issue > > is > > because MSI edge interrupts do not work properly for these > > ILO USB devices. > > We start fast and then drop to polling mode and its unusable. > > > > The issue exists currently upstream on 5.13 as tested by Red Hat, > > and reverting the mentioned patch corrects this upstream. > > > > David Jeffery has this explanation: > > > > The problem with the patch turning on MSI appears to be that the > > ehci > > driver (and possibly other usb controller types too) wasn't written > > to > > support edge-triggered interrupts. > > The ehci_irq routine appears to be written in such a way that it > > will > > be racy with multiple interrupt source bits. > > With a level-triggered interrupt, it gets called another time and > > cleans > > up other interrupt sources. > > But with MSI edge, the interrupt state staying high results in no > > new interrupt and ehci has to run based on polling. > > > > static irqreturn_t ehci_irq (struct usb_hcd *hcd) > > { > > ... > > status = ehci_readl(ehci, &ehci->regs->status); > > > > /* e.g. cardbus physical eject */ > > if (status == ~(u32) 0) { > > ehci_dbg (ehci, "device removed\n"); > > goto dead; > > } > > > > /* > > * We don't use STS_FLR, but some controllers don't like it > > to > > * remain on, so mask it out along with the other status > > bits. > > */ > > masked_status = status & (INTR_MASK | STS_FLR); > > > > /* Shared IRQ? */ > > if (!masked_status || unlikely(ehci->rh_state == > > EHCI_RH_HALTED)) { > > spin_unlock_irqrestore(&ehci->lock, flags); > > return IRQ_NONE; > > } > > > > /* clear (just) interrupts */ > > ehci_writel(ehci, masked_status, &ehci->regs->status); > > ... > > > > ehci_irq() reads the interrupt status register and then writes the > > active > > interrupt-related bits back out to ack the interrupt cause. > > But with an edge interrupt, this is racy as another source of > > interrupt > > could be raised by ehci between the read and the write reaching > > the > > hardware. > > e.g. If STS_IAA was set during the initial read, but some other > > bit like > > STS_INT gets raised by the hardware between the read and the write > > to the > > interrupt status register, the interrupt signal state won't drop. > > The interrupt state says high, and since it is now edged triggered > > with > > MSI, no new invocation of the interrupt handler gets triggered. > > Wouldn't it be better to change these other PCI drivers by adding > proper MSI support? I don't know what would be involved, but > presumably it wouldn't be very hard. (Just run the handler in a > loop > until all the interrupt status bits are off?) > > Alan Stern > Hello Agree with you that is a big hammer approach, but it's such a key piece of the massive number of HPE servers out there and we have many affected customers. While I did all the test work and discovery etc, I am definitely not a USB kernel guy very often, I spend most of my time in storage. I will listen for the other replies to see how the folks who know the subsystem better than I would want this reolved. Thanks Laurence
Hi Laurence,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on usb/usb-testing]
[also build test WARNING on peter.chen-usb/for-usb-next balbi-usb/testing/next v5.14-rc1 next-20210713]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
config: arc-allyesconfig (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/3ea2a748176f21120e150f0645bc3c22e1cea48f
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
git checkout 3ea2a748176f21120e150f0645bc3c22e1cea48f
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
drivers/usb/core/hcd-pci.c: In function 'usb_hcd_pci_remove':
>> drivers/usb/core/hcd-pci.c:316:8: warning: variable 'hcd_driver_flags' set but not used [-Wunused-but-set-variable]
316 | int hcd_driver_flags;
| ^~~~~~~~~~~~~~~~
vim +/hcd_driver_flags +316 drivers/usb/core/hcd-pci.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 300
^1da177e4c3f41 Linus Torvalds 2005-04-16 301 /**
^1da177e4c3f41 Linus Torvalds 2005-04-16 302 * usb_hcd_pci_remove - shutdown processing for PCI-based HCDs
^1da177e4c3f41 Linus Torvalds 2005-04-16 303 * @dev: USB Host Controller being removed
41631d3616c363 Ahmed S. Darwish 2020-10-19 304 *
41631d3616c363 Ahmed S. Darwish 2020-10-19 305 * Context: task context, might sleep
^1da177e4c3f41 Linus Torvalds 2005-04-16 306 *
^1da177e4c3f41 Linus Torvalds 2005-04-16 307 * Reverses the effect of usb_hcd_pci_probe(), first invoking
^1da177e4c3f41 Linus Torvalds 2005-04-16 308 * the HCD's stop() method. It is always called from a thread
^1da177e4c3f41 Linus Torvalds 2005-04-16 309 * context, normally "rmmod", "apmd", or something similar.
^1da177e4c3f41 Linus Torvalds 2005-04-16 310 *
^1da177e4c3f41 Linus Torvalds 2005-04-16 311 * Store this function in the HCD's struct pci_driver as remove().
^1da177e4c3f41 Linus Torvalds 2005-04-16 312 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 313 void usb_hcd_pci_remove(struct pci_dev *dev)
^1da177e4c3f41 Linus Torvalds 2005-04-16 314 {
^1da177e4c3f41 Linus Torvalds 2005-04-16 315 struct usb_hcd *hcd;
7b2816dd293031 Andy Shevchenko 2020-08-14 @316 int hcd_driver_flags;
^1da177e4c3f41 Linus Torvalds 2005-04-16 317
^1da177e4c3f41 Linus Torvalds 2005-04-16 318 hcd = pci_get_drvdata(dev);
^1da177e4c3f41 Linus Torvalds 2005-04-16 319 if (!hcd)
^1da177e4c3f41 Linus Torvalds 2005-04-16 320 return;
^1da177e4c3f41 Linus Torvalds 2005-04-16 321
7b2816dd293031 Andy Shevchenko 2020-08-14 322 hcd_driver_flags = hcd->driver->flags;
7b2816dd293031 Andy Shevchenko 2020-08-14 323
3da7cff4e79e4a Alan Stern 2010-06-25 324 if (pci_dev_run_wake(dev))
3da7cff4e79e4a Alan Stern 2010-06-25 325 pm_runtime_get_noresume(&dev->dev);
3da7cff4e79e4a Alan Stern 2010-06-25 326
c548795abe0d35 Alan Stern 2010-06-09 327 /* Fake an interrupt request in order to give the driver a chance
c548795abe0d35 Alan Stern 2010-06-09 328 * to test whether the controller hardware has been removed (e.g.,
c548795abe0d35 Alan Stern 2010-06-09 329 * cardbus physical eject).
c548795abe0d35 Alan Stern 2010-06-09 330 */
c548795abe0d35 Alan Stern 2010-06-09 331 local_irq_disable();
c548795abe0d35 Alan Stern 2010-06-09 332 usb_hcd_irq(0, hcd);
c548795abe0d35 Alan Stern 2010-06-09 333 local_irq_enable();
c548795abe0d35 Alan Stern 2010-06-09 334
05768918b9a122 Alan Stern 2013-03-28 335 /* Note: dev_set_drvdata must be called while holding the rwsem */
05768918b9a122 Alan Stern 2013-03-28 336 if (dev->class == CL_EHCI) {
05768918b9a122 Alan Stern 2013-03-28 337 down_write(&companions_rwsem);
05768918b9a122 Alan Stern 2013-03-28 338 for_each_companion(dev, hcd, ehci_remove);
05768918b9a122 Alan Stern 2013-03-28 339 usb_remove_hcd(hcd);
05768918b9a122 Alan Stern 2013-03-28 340 dev_set_drvdata(&dev->dev, NULL);
05768918b9a122 Alan Stern 2013-03-28 341 up_write(&companions_rwsem);
05768918b9a122 Alan Stern 2013-03-28 342 } else {
05768918b9a122 Alan Stern 2013-03-28 343 /* Not EHCI; just clear the companion pointer */
05768918b9a122 Alan Stern 2013-03-28 344 down_read(&companions_rwsem);
05768918b9a122 Alan Stern 2013-03-28 345 hcd->self.hs_companion = NULL;
^1da177e4c3f41 Linus Torvalds 2005-04-16 346 usb_remove_hcd(hcd);
05768918b9a122 Alan Stern 2013-03-28 347 dev_set_drvdata(&dev->dev, NULL);
05768918b9a122 Alan Stern 2013-03-28 348 up_read(&companions_rwsem);
05768918b9a122 Alan Stern 2013-03-28 349 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 350 usb_put_hcd(hcd);
^1da177e4c3f41 Linus Torvalds 2005-04-16 351 pci_disable_device(dev);
^1da177e4c3f41 Linus Torvalds 2005-04-16 352 }
782e70c6fc2290 Greg Kroah-Hartman 2008-01-25 353 EXPORT_SYMBOL_GPL(usb_hcd_pci_remove);
^1da177e4c3f41 Linus Torvalds 2005-04-16 354
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote: > Customers have been reporting that the I/O is radically being > slowed down to HPE virtual USB ILO served DVD images during installation. Side note: Simple changing - retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY | PCI_IRQ_MSI); + retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY); should have the same effect without touching tons of a good code.
On Tue, 2021-07-13 at 23:33 +0300, Andy Shevchenko wrote: > On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote: > > Customers have been reporting that the I/O is radically being > > slowed down to HPE virtual USB ILO served DVD images during > > installation. > > Side note: > > Simple changing > > - retval = pci_alloc_irq_vectors(dev, 1, 1, > PCI_IRQ_LEGACY | PCI_IRQ_MSI); > + retval = pci_alloc_irq_vectors(dev, 1, 1, > PCI_IRQ_LEGACY); > > should have the same effect without touching tons of a good code. > Andy and Alan, I tested this and as expected it worked as well. However David has given me a possible ehci fix which would be the best outcome you wanted, I am testing that now and will follow up tomorrow. Regards Laurence
diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c index d630ccc..522a179 100644 --- a/drivers/usb/core/hcd-pci.c +++ b/drivers/usb/core/hcd-pci.c @@ -195,21 +195,20 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id, * make sure irq setup is not touched for xhci in generic hcd code */ if ((driver->flags & HCD_MASK) < HCD_USB3) { - retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY | PCI_IRQ_MSI); - if (retval < 0) { + if (!dev->irq) { dev_err(&dev->dev, "Found HC with no IRQ. Check BIOS/PCI %s setup!\n", pci_name(dev)); retval = -ENODEV; goto disable_pci; } - hcd_irq = pci_irq_vector(dev, 0); + hcd_irq = dev->irq; } hcd = usb_create_hcd(driver, &dev->dev, pci_name(dev)); if (!hcd) { retval = -ENOMEM; - goto free_irq_vectors; + goto disable_pci; } hcd->amd_resume_bug = (usb_hcd_amd_remote_wakeup_quirk(dev) && @@ -288,9 +287,6 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id, put_hcd: usb_put_hcd(hcd); -free_irq_vectors: - if ((driver->flags & HCD_MASK) < HCD_USB3) - pci_free_irq_vectors(dev); disable_pci: pci_disable_device(dev); dev_err(&dev->dev, "init %s fail, %d\n", pci_name(dev), retval); @@ -352,8 +348,6 @@ void usb_hcd_pci_remove(struct pci_dev *dev) up_read(&companions_rwsem); } usb_put_hcd(hcd); - if ((hcd_driver_flags & HCD_MASK) < HCD_USB3) - pci_free_irq_vectors(dev); pci_disable_device(dev); } EXPORT_SYMBOL_GPL(usb_hcd_pci_remove); @@ -465,7 +459,7 @@ static int suspend_common(struct device *dev, bool do_wakeup) * synchronized here. */ if (!hcd->msix_enabled) - synchronize_irq(pci_irq_vector(pci_dev, 0)); + synchronize_irq(pci_dev->irq); /* Downstream ports from this root hub should already be quiesced, so * there will be no DMA activity. Now we can shut down the upstream