Message ID | 20240424195951.3749388-1-linux@roeck-us.net |
---|---|
State | Superseded |
Series | [v2] usb: ohci: Prevent missed ohci interrupts |
On Wed, Apr 24, 2024 at 06:30:06PM -0400, Alan Stern wrote:
> On Wed, Apr 24, 2024 at 12:59:51PM -0700, Guenter Roeck wrote:
> > Testing ohci functionality with qemu's pci-ohci emulation often results
> > in ohci interface stalls, resulting in hung task timeouts.
> >
> > The problem is caused by lost interrupts between the emulation and the
> > Linux kernel code. Additional interrupts raised while the ohci interrupt
> > handler in Linux is running and before the handler clears the interrupt
> > status are not handled. The fix for a similar problem in ehci suggests
> > that the problem is likely caused by edge-triggered MSI interrupts. See
> > commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
> > edge-triggered MSI") for details.
> >
> > Ensure that the ohci interrupt code handles all pending interrupts before
> > returning to solve the problem.
> >
> > Cc: Gerd Hoffmann <kraxel@redhat.com>
> > Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> > v2: Only repeat if the interface is still active
>
> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
>
> Greg might insist that the patch be CC'ed to stable@vger.kernel.org since
> it contains a Fixes: tag.

I'll add that by hand, no worries.

On 4/24/24 15:49, Greg Kroah-Hartman wrote:
> On Wed, Apr 24, 2024 at 06:30:06PM -0400, Alan Stern wrote:
>> On Wed, Apr 24, 2024 at 12:59:51PM -0700, Guenter Roeck wrote:
>>> Testing ohci functionality with qemu's pci-ohci emulation often results
>>> in ohci interface stalls, resulting in hung task timeouts.
>>>
>>> The problem is caused by lost interrupts between the emulation and the
>>> Linux kernel code. Additional interrupts raised while the ohci interrupt
>>> handler in Linux is running and before the handler clears the interrupt
>>> status are not handled. The fix for a similar problem in ehci suggests
>>> that the problem is likely caused by edge-triggered MSI interrupts. See
>>> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
>>> edge-triggered MSI") for details.
>>>
>>> Ensure that the ohci interrupt code handles all pending interrupts before
>>> returning to solve the problem.
>>>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>> v2: Only repeat if the interface is still active
>>
>> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
>>
>> Greg might insist that the patch be CC'ed to stable@vger.kernel.org since
>> it contains a Fixes: tag.
>
> I'll add that by hand, no worries.

Thanks!
Guenter

From: Guenter Roeck
> Sent: 24 April 2024 21:00
>
> Testing ohci functionality with qemu's pci-ohci emulation often results
> in ohci interface stalls, resulting in hung task timeouts.
>
> The problem is caused by lost interrupts between the emulation and the
> Linux kernel code. Additional interrupts raised while the ohci interrupt
> handler in Linux is running and before the handler clears the interrupt
> status are not handled. The fix for a similar problem in ehci suggests
> that the problem is likely caused by edge-triggered MSI interrupts. See
> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
> edge-triggered MSI") for details.
>
> Ensure that the ohci interrupt code handles all pending interrupts before
> returning to solve the problem.
>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> v2: Only repeat if the interface is still active
>
>  drivers/usb/host/ohci-hcd.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
> index 4f9982ecfb58..bb6b50b4a356 100644
> --- a/drivers/usb/host/ohci-hcd.c
> +++ b/drivers/usb/host/ohci-hcd.c
> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>  	/* Check for an all 1's result which is a typical consequence
>  	 * of dead, unclocked, or unplugged (CardBus...) devices
>  	 */
> +again:
>  	if (ints == ~(u32)0) {
>  		ohci->rh_state = OHCI_RH_HALTED;
>  		ohci_dbg (ohci, "device removed!\n");
> @@ -982,6 +983,13 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>  	}
>  	spin_unlock(&ohci->lock);
>
> +	/* repeat until all enabled interrupts are handled */
> +	if (ohci->rh_state != OHCI_RH_HALTED) {
> +		ints = ohci_readl(ohci, &regs->intrstatus);
> +		if (ints & ohci_readl(ohci, &regs->intrenable))

Doesn't the driver know which interrupts are enabled?
So it should be able to avoid doing two (likely) slow io reads?
(PCIe reads are pretty much guaranteed to be high latency.)

	David

> +			goto again;
> +	}
> +
>  	return IRQ_HANDLED;
>  }
>
> --
> 2.39.2
>

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

On 4/27/24 14:00, David Laight wrote:
> From: Guenter Roeck
>> Sent: 24 April 2024 21:00
>>
>> Testing ohci functionality with qemu's pci-ohci emulation often results
>> in ohci interface stalls, resulting in hung task timeouts.
>>
>> The problem is caused by lost interrupts between the emulation and the
>> Linux kernel code. Additional interrupts raised while the ohci interrupt
>> handler in Linux is running and before the handler clears the interrupt
>> status are not handled. The fix for a similar problem in ehci suggests
>> that the problem is likely caused by edge-triggered MSI interrupts. See
>> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
>> edge-triggered MSI") for details.
>>
>> Ensure that the ohci interrupt code handles all pending interrupts before
>> returning to solve the problem.
>>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>> ---
>> v2: Only repeat if the interface is still active
>>
>>  drivers/usb/host/ohci-hcd.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
>> index 4f9982ecfb58..bb6b50b4a356 100644
>> --- a/drivers/usb/host/ohci-hcd.c
>> +++ b/drivers/usb/host/ohci-hcd.c
>> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>>  	/* Check for an all 1's result which is a typical consequence
>>  	 * of dead, unclocked, or unplugged (CardBus...) devices
>>  	 */
>> +again:
>>  	if (ints == ~(u32)0) {
>>  		ohci->rh_state = OHCI_RH_HALTED;
>>  		ohci_dbg (ohci, "device removed!\n");
>> @@ -982,6 +983,13 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>>  	}
>>  	spin_unlock(&ohci->lock);
>>
>> +	/* repeat until all enabled interrupts are handled */
>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
>
> Doesn't the driver know which interrupts are enabled?
> So it should be able to avoid doing two (likely) slow io reads?
> (PCIe reads are pretty much guaranteed to be high latency.)
>

No, the driver does not cache intrenable.

Guenter

  Hi,

> > > +	/* repeat until all enabled interrupts are handled */
> > > +	if (ohci->rh_state != OHCI_RH_HALTED) {
> > > +		ints = ohci_readl(ohci, &regs->intrstatus);
> > > +		if (ints & ohci_readl(ohci, &regs->intrenable))
> >
> > Doesn't the driver know which interrupts are enabled?
> > So it should be able to avoid doing two (likely) slow io reads?
> > (PCIe reads are pretty much guaranteed to be high latency.)
>
> No, the driver does not cache intrenable.

Does the driver ever change intrenable after initialization?

PCIe reads are expensive, especially in virtual machines where this
goes vmexit to qemu, so doing that for a piece of information the
driver should have (or is able to calculate) should indeed better
be avoided.

take care,
  Gerd

On 4/28/24 23:49, Gerd Hoffmann wrote:
>    Hi,
>
>>>> +	/* repeat until all enabled interrupts are handled */
>>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
>>>
>>> Doesn't the driver know which interrupts are enabled?
>>> So it should be able to avoid doing two (likely) slow io reads?
>>> (PCIe reads are pretty much guaranteed to be high latency.)
>>
>> No, the driver does not cache intrenable.
>
> Does the driver ever change intrenable after initialization?
>

$ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, mask, &ohci->regs->intrenable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c: ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
drivers/usb/host/ohci-q.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);

> PCIe reads are expensive, especially in virtual machines where this
> goes vmexit to qemu, so doing that for a piece of information the
> driver should have (or is able to calculate) should indeed better
> be avoided.
>

I would agree, but I really think that should be a separate patch
if implemented.

Guenter

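For readers following the caching question: the grep output shows the enable mask being changed from many call sites through separate enable and disable registers. A cached copy would therefore have to be updated at every one of those writes. The block below is only a user-space sketch of that idea, not code from the driver or from any posted patch. The names `struct shadow_hc`, `shadow_enable()`, `shadow_disable()` and `shadow_read_enable()` are hypothetical, and a real implementation would also have to keep the shadow in sync across controller resets and respect the driver's locking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller; not the driver's struct ohci_hcd. */
struct shadow_hc {
	uint32_t hw_enable;      /* models the enable mask held by the hardware */
	uint32_t cached_enable;  /* software shadow of hw_enable */
	unsigned int mmio_reads; /* counts simulated (slow) register reads */
};

/* Models a write to intrenable: every bit set in mask becomes enabled. */
static void shadow_enable(struct shadow_hc *hc, uint32_t mask)
{
	hc->hw_enable |= mask;
	hc->cached_enable |= mask;
}

/* Models a write to intrdisable: every bit set in mask becomes disabled. */
static void shadow_disable(struct shadow_hc *hc, uint32_t mask)
{
	hc->hw_enable &= ~mask;
	hc->cached_enable &= ~mask;
}

/* Models reading the enable mask back from the hardware (the slow path). */
static uint32_t shadow_read_enable(struct shadow_hc *hc)
{
	hc->mmio_reads++;
	return hc->hw_enable;
}
```

As long as every write goes through the two helpers, `cached_enable` matches what a register read would return, so the interrupt handler could test pending bits against it without a second bus read. Touching all call sites listed above is the kind of change Guenter suggests belongs in a separate patch.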
From: Guenter Roeck
> Sent: 29 April 2024 14:34
>
> On 4/28/24 23:49, Gerd Hoffmann wrote:
> >    Hi,
> >
> >>>> +	/* repeat until all enabled interrupts are handled */
> >>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
> >>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
> >>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
> >>>
> >>> Doesn't the driver know which interrupts are enabled?
> >>> So it should be able to avoid doing two (likely) slow io reads?
> >>> (PCIe reads are pretty much guaranteed to be high latency.)
> >>
> >> No, the driver does not cache intrenable.
> >
> > Does the driver ever change intrenable after initialization?
> >
>
> $ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, mask, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
> drivers/usb/host/ohci-q.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);

At least the hardware has separate enable/disable registers
so the driver isn't doing RMW cycles.

I'd guess that the normal condition is that no interrupts are pending.
So it can be held to one (slow) read by checking:
	if (ints && (ints & ohci_readl(ohci, &regs->intrenable)))

Although maybe there are some 'never enabled' interrupts that might
need masking?

	David

On 4/29/24 07:01, David Laight wrote:
> From: Guenter Roeck
>> Sent: 29 April 2024 14:34
>>
>> On 4/28/24 23:49, Gerd Hoffmann wrote:
>>>     Hi,
>>>
>>>>>> +	/* repeat until all enabled interrupts are handled */
>>>>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>>>>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>>>>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
>>>>>
>>>>> Doesn't the driver know which interrupts are enabled?
>>>>> So it should be able to avoid doing two (likely) slow io reads?
>>>>> (PCIe reads are pretty much guaranteed to be high latency.)
>>>>
>>>> No, the driver does not cache intrenable.
>>>
>>> Does the driver ever change intrenable after initialization?
>>>
>>
>> $ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
>> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, mask, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c: ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c: ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-q.c: ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
>
> At least the hardware has separate enable/disable registers
> so the driver isn't doing RMW cycles.
>
> I'd guess that the normal condition is that no interrupts are pending.
> So it can be held to one (slow) read by checking:
> 	if (ints && (ints & ohci_readl(ohci, &regs->intrenable)))

Guess that can't hurt. I'll send v3.

Thanks,
Guenter

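The effect of the short-circuit check discussed above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the v3 patch: `struct fake_regs`, `fake_readl()` and `more_work()` are hypothetical names, and the read counter merely stands in for the cost of a real MMIO read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; the real driver reads memory-mapped registers. */
struct fake_regs {
	uint32_t intrstatus;  /* pending interrupt bits */
	uint32_t intrenable;  /* enabled interrupt bits */
	unsigned int reads;   /* number of simulated register reads */
};

/* Stand-in for ohci_readl(): returns the register value and counts the read. */
static uint32_t fake_readl(struct fake_regs *r, const uint32_t *reg)
{
	r->reads++;
	return *reg;
}

/* Returns nonzero if an enabled interrupt is still pending. */
static int more_work(struct fake_regs *r)
{
	uint32_t ints = fake_readl(r, &r->intrstatus);

	/* && short-circuits: intrenable is read only when ints is nonzero. */
	return ints && (ints & fake_readl(r, &r->intrenable));
}
```

With nothing pending, `more_work()` performs a single read. With any status bit set it performs two, whether or not that bit is enabled, so the saving applies to the idle case David expects to be the common one.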
diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
index 4f9982ecfb58..bb6b50b4a356 100644
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
 	/* Check for an all 1's result which is a typical consequence
 	 * of dead, unclocked, or unplugged (CardBus...) devices
 	 */
+again:
 	if (ints == ~(u32)0) {
 		ohci->rh_state = OHCI_RH_HALTED;
 		ohci_dbg (ohci, "device removed!\n");
@@ -982,6 +983,13 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
 	}
 	spin_unlock(&ohci->lock);
 
+	/* repeat until all enabled interrupts are handled */
+	if (ohci->rh_state != OHCI_RH_HALTED) {
+		ints = ohci_readl(ohci, &regs->intrstatus);
+		if (ints & ohci_readl(ohci, &regs->intrenable))
+			goto again;
+	}
+
 	return IRQ_HANDLED;
 }

Testing ohci functionality with qemu's pci-ohci emulation often results
in ohci interface stalls, resulting in hung task timeouts.

The problem is caused by lost interrupts between the emulation and the
Linux kernel code. Additional interrupts raised while the ohci interrupt
handler in Linux is running and before the handler clears the interrupt
status are not handled. The fix for a similar problem in ehci suggests
that the problem is likely caused by edge-triggered MSI interrupts. See
commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
edge-triggered MSI") for details.

Ensure that the ohci interrupt code handles all pending interrupts before
returning to solve the problem.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: Only repeat if the interface is still active

 drivers/usb/host/ohci-hcd.c | 8 ++++++++
 1 file changed, 8 insertions(+)
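
The failure mode described in the commit message can be reproduced in a small user-space model. This is a sketch under stated assumptions, not the driver code: `struct fake_hc`, `service()` and `fake_irq()` are hypothetical, the status register is modeled as write-1-to-clear, and the model assumes that an event raised while another bit is still pending produces no new edge and therefore no further interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical controller model; not the real OHCI register layout. */
struct fake_hc {
	uint32_t intrstatus;  /* pending event bits */
	uint32_t intrenable;  /* enabled event bits */
	uint32_t late_event;  /* event that fires while the handler runs */
	uint32_t handled;     /* bits the handler has serviced so far */
};

/* Service one snapshot of pending bits, as a single handler pass would. */
static void service(struct fake_hc *hc, uint32_t ints)
{
	hc->handled |= ints;
	/* A new event arrives mid-handler. Assumption: with edge-triggered
	 * delivery no additional interrupt is signalled for it. */
	hc->intrstatus |= hc->late_event;
	hc->late_event = 0;
	hc->intrstatus &= ~ints;  /* acknowledge only what was serviced */
}

/* Handler model. With recheck set, loop until no enabled bit is pending.
 * Returns the enabled bits that are still pending on exit. */
static uint32_t fake_irq(struct fake_hc *hc, bool recheck)
{
	uint32_t ints = hc->intrstatus & hc->intrenable;

	do {
		service(hc, ints);
		ints = hc->intrstatus & hc->intrenable;
	} while (recheck && ints);

	return hc->intrstatus & hc->intrenable;
}
```

Calling `fake_irq()` with `recheck` false on a controller whose `late_event` is nonzero leaves that bit pending with nothing to trigger another pass, which corresponds to the stall. With `recheck` true the loop picks the bit up before returning, which is the behavior the patch adds with its `goto again`.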