diff mbox series

[v2] usb: ohci: Prevent missed ohci interrupts

Message ID 20240424195951.3749388-1-linux@roeck-us.net
State Superseded
Series [v2] usb: ohci: Prevent missed ohci interrupts

Commit Message

Guenter Roeck April 24, 2024, 7:59 p.m. UTC
Testing ohci functionality with qemu's pci-ohci emulation often results
in ohci interface stalls, resulting in hung task timeouts.

The problem is caused by lost interrupts between the emulation and the
Linux kernel code. Additional interrupts raised while the ohci interrupt
handler in Linux is running and before the handler clears the interrupt
status are not handled. The fix for a similar problem in ehci suggests
that the problem is likely caused by edge-triggered MSI interrupts. See
commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
edge-triggered MSI") for details.

Ensure that the ohci interrupt code handles all pending interrupts before
returning to solve the problem.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: Only repeat if the interface is still active

 drivers/usb/host/ohci-hcd.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Greg Kroah-Hartman April 24, 2024, 10:49 p.m. UTC | #1
On Wed, Apr 24, 2024 at 06:30:06PM -0400, Alan Stern wrote:
> On Wed, Apr 24, 2024 at 12:59:51PM -0700, Guenter Roeck wrote:
> > Testing ohci functionality with qemu's pci-ohci emulation often results
> > in ohci interface stalls, resulting in hung task timeouts.
> > 
> > The problem is caused by lost interrupts between the emulation and the
> > Linux kernel code. Additional interrupts raised while the ohci interrupt
> > handler in Linux is running and before the handler clears the interrupt
> > status are not handled. The fix for a similar problem in ehci suggests
> > that the problem is likely caused by edge-triggered MSI interrupts. See
> > commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
> > edge-triggered MSI") for details.
> > 
> > Ensure that the ohci interrupt code handles all pending interrupts before
> > returning to solve the problem.
> > 
> > Cc: Gerd Hoffmann <kraxel@redhat.com>
> > Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> > v2: Only repeat if the interface is still active
> 
> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
> 
> Greg might insist that the patch be CC'ed to stable@vger.kernel.org since 
> it contains a Fixes: tag.

I'll add that by hand, no worries.
Guenter Roeck April 24, 2024, 10:59 p.m. UTC | #2
On 4/24/24 15:49, Greg Kroah-Hartman wrote:
> On Wed, Apr 24, 2024 at 06:30:06PM -0400, Alan Stern wrote:
>> On Wed, Apr 24, 2024 at 12:59:51PM -0700, Guenter Roeck wrote:
>>> Testing ohci functionality with qemu's pci-ohci emulation often results
>>> in ohci interface stalls, resulting in hung task timeouts.
>>>
>>> The problem is caused by lost interrupts between the emulation and the
>>> Linux kernel code. Additional interrupts raised while the ohci interrupt
>>> handler in Linux is running and before the handler clears the interrupt
>>> status are not handled. The fix for a similar problem in ehci suggests
>>> that the problem is likely caused by edge-triggered MSI interrupts. See
>>> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
>>> edge-triggered MSI") for details.
>>>
>>> Ensure that the ohci interrupt code handles all pending interrupts before
>>> returning to solve the problem.
>>>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>> v2: Only repeat if the interface is still active
>>
>> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
>>
>> Greg might insist that the patch be CC'ed to stable@vger.kernel.org since
>> it contains a Fixes: tag.
> 
> I'll add that by hand, no worries.

Thanks!

Guenter
David Laight April 27, 2024, 9 p.m. UTC | #3
From: Guenter Roeck
> Sent: 24 April 2024 21:00
> 
> Testing ohci functionality with qemu's pci-ohci emulation often results
> in ohci interface stalls, resulting in hung task timeouts.
> 
> The problem is caused by lost interrupts between the emulation and the
> Linux kernel code. Additional interrupts raised while the ohci interrupt
> handler in Linux is running and before the handler clears the interrupt
> status are not handled. The fix for a similar problem in ehci suggests
> that the problem is likely caused by edge-triggered MSI interrupts. See
> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
> edge-triggered MSI") for details.
> 
> Ensure that the ohci interrupt code handles all pending interrupts before
> returning to solve the problem.
> 
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> v2: Only repeat if the interface is still active
> 
>  drivers/usb/host/ohci-hcd.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
> index 4f9982ecfb58..bb6b50b4a356 100644
> --- a/drivers/usb/host/ohci-hcd.c
> +++ b/drivers/usb/host/ohci-hcd.c
> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>  	/* Check for an all 1's result which is a typical consequence
>  	 * of dead, unclocked, or unplugged (CardBus...) devices
>  	 */
> +again:
>  	if (ints == ~(u32)0) {
>  		ohci->rh_state = OHCI_RH_HALTED;
>  		ohci_dbg (ohci, "device removed!\n");
> @@ -982,6 +983,13 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>  	}
>  	spin_unlock(&ohci->lock);
> 
> +	/* repeat until all enabled interrupts are handled */
> +	if (ohci->rh_state != OHCI_RH_HALTED) {
> +		ints = ohci_readl(ohci, &regs->intrstatus);
> +		if (ints & ohci_readl(ohci, &regs->intrenable))

Doesn't the driver know which interrupts are enabled?
So it should be able to avoid doing two (likely) slow io reads?
(PCIe reads are pretty much guaranteed to be high latency.)

	David

> +			goto again;
> +	}
> +
>  	return IRQ_HANDLED;
>  }
> 
> --
> 2.39.2
> 

Guenter Roeck April 27, 2024, 10:18 p.m. UTC | #4
On 4/27/24 14:00, David Laight wrote:
> From: Guenter Roeck
>> Sent: 24 April 2024 21:00
>>
>> Testing ohci functionality with qemu's pci-ohci emulation often results
>> in ohci interface stalls, resulting in hung task timeouts.
>>
>> The problem is caused by lost interrupts between the emulation and the
>> Linux kernel code. Additional interrupts raised while the ohci interrupt
>> handler in Linux is running and before the handler clears the interrupt
>> status are not handled. The fix for a similar problem in ehci suggests
>> that the problem is likely caused by edge-triggered MSI interrupts. See
>> commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with
>> edge-triggered MSI") for details.
>>
>> Ensure that the ohci interrupt code handles all pending interrupts before
>> returning to solve the problem.
>>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices")
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>> ---
>> v2: Only repeat if the interface is still active
>>
>>   drivers/usb/host/ohci-hcd.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
>> index 4f9982ecfb58..bb6b50b4a356 100644
>> --- a/drivers/usb/host/ohci-hcd.c
>> +++ b/drivers/usb/host/ohci-hcd.c
>> @@ -888,6 +888,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>>   	/* Check for an all 1's result which is a typical consequence
>>   	 * of dead, unclocked, or unplugged (CardBus...) devices
>>   	 */
>> +again:
>>   	if (ints == ~(u32)0) {
>>   		ohci->rh_state = OHCI_RH_HALTED;
>>   		ohci_dbg (ohci, "device removed!\n");
>> @@ -982,6 +983,13 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd)
>>   	}
>>   	spin_unlock(&ohci->lock);
>>
>> +	/* repeat until all enabled interrupts are handled */
>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
> 
> Doesn't the driver know which interrupts are enabled?
> So it should be able to avoid doing two (likely) slow io reads?
> (PCIe reads are pretty much guaranteed to be high latency.)
> 

No, the driver does not cache intrenable.

Guenter
Gerd Hoffmann April 29, 2024, 6:49 a.m. UTC | #5
Hi,

> > > +	/* repeat until all enabled interrupts are handled */
> > > +	if (ohci->rh_state != OHCI_RH_HALTED) {
> > > +		ints = ohci_readl(ohci, &regs->intrstatus);
> > > +		if (ints & ohci_readl(ohci, &regs->intrenable))
> > 
> > Doesn't the driver know which interrupts are enabled?
> > So it should be able to avoid doing two (likely) slow io reads?
> > (PCIe reads are pretty much guaranteed to be high latency.)
> 
> No, the driver does not cache intrenable.

Does the driver ever change intrenable after initialization?

PCIe reads are expensive, especially in virtual machines where this
goes vmexit to qemu, so doing that for a piece of information the
driver should have (or is able to calculate) should indeed better
be avoided.

take care,
  Gerd
Guenter Roeck April 29, 2024, 1:34 p.m. UTC | #6
On 4/28/24 23:49, Gerd Hoffmann wrote:
>    Hi,
> 
>>>> +	/* repeat until all enabled interrupts are handled */
>>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
>>>
>>> Doesn't the driver know which interrupts are enabled?
>>> So it should be able to avoid doing two (likely) slow io reads?
>>> (PCIe reads are pretty much guaranteed to be high latency.)
>>
>> No, the driver does not cache intrenable.
> 
> Does the driver ever change intrenable after initialization?
> 

$ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, mask, &ohci->regs->intrenable);
drivers/usb/host/ohci-hcd.c:			ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
drivers/usb/host/ohci-hub.c:	ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c:		ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c:			ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
drivers/usb/host/ohci-q.c:	ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);

> PCIe reads are expensive, especially in virtual machines where this
> goes vmexit to qemu, so doing that for a piece of information the
> driver should have (or is able to calculate) should indeed better
> be avoided.
> 

I would agree, but I really think that should be a separate patch
if implemented.

Guenter
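
Editorial sketch of the caching idea discussed above, under stated assumptions. The grep output shows that every change to the enable mask goes through a write to intrenable (set bits) or intrdisable (clear bits), so a driver could in principle mirror those writes in a software copy. The struct and the shadow_* helpers below are hypothetical names for a user-space model, not kernel code; the locking a real driver would need around the shadow copy is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: hw_enable stands in for the controller's enable mask,
 * cached_enable is the software copy the driver would keep. */
struct shadow_hc {
	uint32_t hw_enable;
	uint32_t cached_enable;
};

/* Models a write to intrenable: bits written as 1 become enabled. */
void shadow_enable(struct shadow_hc *hc, uint32_t bits)
{
	hc->hw_enable |= bits;
	hc->cached_enable |= bits;
	assert(hc->cached_enable == hc->hw_enable);
}

/* Models a write to intrdisable: bits written as 1 become disabled. */
void shadow_disable(struct shadow_hc *hc, uint32_t bits)
{
	hc->hw_enable &= ~bits;
	hc->cached_enable &= ~bits;
	assert(hc->cached_enable == hc->hw_enable);
}

/* Decide whether another handler pass is needed without reading the
 * enable register back from the device. */
int shadow_pending(const struct shadow_hc *hc, uint32_t ints)
{
	return (ints & hc->cached_enable) != 0;
}
```

With such a copy, the repeat check would need only the intrstatus read. Whether that is worth the extra bookkeeping at every write site is the question left for a separate patch.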
David Laight April 29, 2024, 2:01 p.m. UTC | #7
From: Guenter Roeck
> Sent: 29 April 2024 14:34
> 
> On 4/28/24 23:49, Gerd Hoffmann wrote:
> >    Hi,
> >
> >>>> +	/* repeat until all enabled interrupts are handled */
> >>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
> >>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
> >>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
> >>>
> >>> Doesn't the driver know which interrupts are enabled?
> >>> So it should be able to avoid doing two (likely) slow io reads?
> >>> (PCIe reads are pretty much guaranteed to be high latency.)
> >>
> >> No, the driver does not cache intrenable.
> >
> > Does the driver ever change intrenable after initialization?
> >
> 
> $ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
> drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, mask, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hcd.c:			ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
> drivers/usb/host/ohci-hub.c:	ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c:		ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c:			ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
> drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
> drivers/usb/host/ohci-q.c:	ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);

At least the hardware has separate enable/disable registers
so the driver isn't doing RMW cycles.

I'd guess that the normal condition is that no interrupts are pending.
So it can be held to one (slow) read by checking:
	if (ints && (ints & ohci_readl(ohci, &regs->intrenable)))
Although maybe there are some 'never enabled' interrupts that
might need masking?

	David
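
Editorial sketch of the read-count argument above. It models the two registers as plain memory and counts accesses, so the v2 check and the short-circuit variant can be compared. fake_regs, fake_readl and the need_repeat_* helpers are hypothetical names for this illustration; only the two conditions themselves come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller registers plus a read counter. */
struct fake_regs {
	uint32_t intrstatus;
	uint32_t intrenable;
	unsigned int reads;
};

/* Counts every register read, standing in for a slow MMIO access. */
uint32_t fake_readl(struct fake_regs *r, const uint32_t *reg)
{
	assert(r && reg);
	r->reads++;
	return *reg;
}

/* v2 condition: always reads both registers. */
int need_repeat_v2(struct fake_regs *r)
{
	uint32_t ints = fake_readl(r, &r->intrstatus);

	return (ints & fake_readl(r, &r->intrenable)) != 0;
}

/* Suggested condition: skips the second read when nothing is pending. */
int need_repeat_short(struct fake_regs *r)
{
	uint32_t ints = fake_readl(r, &r->intrstatus);

	return ints && (ints & fake_readl(r, &r->intrenable));
}
```

In this model an idle controller (intrstatus == 0) costs two reads with the v2 condition and one with the short-circuit form. When a status bit is set, both forms read both registers.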

Guenter Roeck April 29, 2024, 3:23 p.m. UTC | #8
On 4/29/24 07:01, David Laight wrote:
> From: Guenter Roeck
>> Sent: 29 April 2024 14:34
>>
>> On 4/28/24 23:49, Gerd Hoffmann wrote:
>>>     Hi,
>>>
>>>>>> +	/* repeat until all enabled interrupts are handled */
>>>>>> +	if (ohci->rh_state != OHCI_RH_HALTED) {
>>>>>> +		ints = ohci_readl(ohci, &regs->intrstatus);
>>>>>> +		if (ints & ohci_readl(ohci, &regs->intrenable))
>>>>>
>>>>> Doesn't the driver know which interrupts are enabled?
>>>>> So it should be able to avoid doing two (likely) slow io reads?
>>>>> (PCIe reads are pretty much guaranteed to be high latency.)
>>>>
>>>> No, the driver does not cache intrenable.
>>>
>>> Does the driver ever change intrenable after initialization?
>>>
>>
>> $ git grep -e intrenable -e intrdisable drivers/usb/host/*ohci*c | grep ohci_writel
>> drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, (u32) ~0, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_OC, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, mask, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hcd.c:			ohci_writel (ohci, OHCI_INTR_UE, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_RHSC, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_SF, &regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:		ohci_writel (ohci, OHCI_INTR_MIE, &regs->intrenable);
>> drivers/usb/host/ohci-hcd.c:	ohci_writel (ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:	ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hcd.c:		ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_SF, &ohci->regs->intrdisable);
>> drivers/usb/host/ohci-hub.c:	ohci_writel (ohci, OHCI_INTR_INIT, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c:		ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c:			ohci_writel(ohci, rhsc_enable, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-hub.c:	ohci_writel(ohci, OHCI_INTR_RHSC, &ohci->regs->intrenable);
>> drivers/usb/host/ohci-q.c:	ohci_writel (ohci, OHCI_INTR_SF, &ohci->regs->intrenable);
> 
> At least the hardware has separate enable/disable registers
> so the driver isn't doing RMW cycles.
> 
> I'd guess that the normal condition is that no interrupts are pending.
> So it can be held to one (slow) read by checking:
> 	if (ints && (ints & ohci_readl(ohci, &regs->intrenable)))

Guess that can't hurt. I'll send v3.

Thanks,
Guenter
Patch

diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c
index 4f9982ecfb58..bb6b50b4a356 100644
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -888,6 +888,7 @@  static irqreturn_t ohci_irq (struct usb_hcd *hcd)
 	/* Check for an all 1's result which is a typical consequence
 	 * of dead, unclocked, or unplugged (CardBus...) devices
 	 */
+again:
 	if (ints == ~(u32)0) {
 		ohci->rh_state = OHCI_RH_HALTED;
 		ohci_dbg (ohci, "device removed!\n");
@@ -982,6 +983,13 @@  static irqreturn_t ohci_irq (struct usb_hcd *hcd)
 	}
 	spin_unlock(&ohci->lock);
 
+	/* repeat until all enabled interrupts are handled */
+	if (ohci->rh_state != OHCI_RH_HALTED) {
+		ints = ohci_readl(ohci, &regs->intrstatus);
+		if (ints & ohci_readl(ohci, &regs->intrenable))
+			goto again;
+	}
+
 	return IRQ_HANDLED;
 }
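
Editorial sketch of the failure mode the patch addresses, as a small user-space model. It assumes the edge-triggered behavior described in the commit message: a notification is sent only when the masked status goes from zero to non-zero. All names (fake_hc, fake_raise, fake_ack, handle_once, handle_all) are hypothetical and nothing here is taken from the kernel or from qemu.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical controller model. */
struct fake_hc {
	uint32_t status;	/* pending events, like intrstatus */
	uint32_t enable;	/* enabled events, like intrenable */
	int edges;		/* edge notifications sent so far */
};

/* Raise events; an edge is sent only on a 0 -> non-zero transition. */
void fake_raise(struct fake_hc *hc, uint32_t bits)
{
	int was_asserted = (hc->status & hc->enable) != 0;

	hc->status |= bits;
	if (!was_asserted && (hc->status & hc->enable))
		hc->edges++;
}

/* Acknowledge (clear) the given status bits. */
void fake_ack(struct fake_hc *hc, uint32_t bits)
{
	hc->status &= ~bits;
}

/* One handler pass. late_bits models events that arrive after the status
 * was sampled but before it is cleared. Returns the bits handled. */
uint32_t handle_once(struct fake_hc *hc, uint32_t late_bits)
{
	uint32_t ints = hc->status & hc->enable;

	fake_raise(hc, late_bits);
	fake_ack(hc, ints);
	return ints;
}

/* Handler with the re-check: repeat until no enabled event is pending. */
uint32_t handle_all(struct fake_hc *hc, uint32_t late_bits)
{
	uint32_t handled = handle_once(hc, late_bits);

	while (hc->status & hc->enable)
		handled |= handle_once(hc, 0);
	assert((hc->status & hc->enable) == 0);
	return handled;
}
```

In this model, raising bit 0x1 and then calling handle_once() with a late 0x2 leaves 0x2 pending with no second edge, which corresponds to the stall described in the commit message. handle_all() picks up the late bit in the same invocation.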