diff mbox series

USB:ehci:fix an interrupt calltrace error

Message ID 1606361673-573-1-git-send-email-liulongfang@huawei.com
State New
Headers show
Series USB:ehci:fix an interrupt calltrace error | expand

Commit Message

liulongfang Nov. 26, 2020, 3:34 a.m. UTC
The system goes to suspend when using USB audio player. This causes
the USB device continuous send interrupt signal to system, When the
number of interrupts exceeds 100000, the system will forcibly close
the interrupts and output a calltrace error.

When the system goes to suspend, the last interrupt is reported to
the driver. At this time, the system has set the state to suspend.
This causes the last interrupt to not be processed by the system and
not clear the interrupt state flag. This uncleared interrupt flag
constantly triggers new interrupt event. This causing the driver to
receive more than 100,000 interrupts, which causes the system to
forcibly close the interrupt report and report the calltrace error.

so, when the driver goes to sleep and changes the system state to
suspend, the interrupt flag needs to be cleared.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
---
 drivers/usb/host/ehci-hub.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Alan Stern Nov. 27, 2020, 3:47 p.m. UTC | #1
On Fri, Nov 27, 2020 at 10:29:03AM +0800, liulongfang wrote:
> On 2020/11/27 0:08, Alan Stern Wrote:
> > On Thu, Nov 26, 2020 at 11:34:33AM +0800, Longfang Liu wrote:
> >> The system goes to suspend when using USB audio player. This causes
> >> the USB device continuous send interrupt signal to system, When the
> >> number of interrupts exceeds 100000, the system will forcibly close
> >> the interrupts and output a calltrace error.
> > 
> > This description is very confusing.  USB devices do not send interrupt 
> > signals to the host.  Do you mean that the device sends a wakeup 
> > request?  Or do you mean something else?
> The irq type is IRQ_NONE,It's counted in the note_interrupt function.
> From the analysis of the driver code, that are indeed  interrupt signals.

Above you wrote: "the USB device continuous send interrupt signal to 
system".  But that's not correct.  The interrupt signals are sent by the 
USB host controller, not by the USB audio device.

The patch description should mention that this happens only with some 
Synopsys host controllers.

> >> When the system goes to suspend, the last interrupt is reported to
> >> the driver. At this time, the system has set the state to suspend.
> >> This causes the last interrupt to not be processed by the system and
> >> not clear the interrupt state flag. This uncleared interrupt flag
> >> constantly triggers new interrupt event. This causing the driver to
> >> receive more than 100,000 interrupts, which causes the system to
> >> forcibly close the interrupt report and report the calltrace error.
> > 
> > If the driver receives an interrupt, it is supposed to process the event 
> > even if the host controller is suspended.  And when ehci_irq() runs, it 
> > clears the bits that are set in the USBSYS register.
> When the host controller is suspended, the ehci_suspend() will clear
> the HCD_FLAG_HW_ACCESSIBLE, and then usb_hcd_irq() will return IRQ_NONE
> directly without calling ehci_irq().

Yes.  But ehci_bus_suspend() runs _before_ the host controller is 
suspended.  While ehci_bus_suspend() is running, usb_hcd_irq() _will_ 
call ehci_irq(), and ehci_irq() _will_ clear the status bits.

After the host controller is suspended it is not supposed to generate 
any interrupt signals at all, because ehci_suspend() writes 0 to the 
USBINTR register, and it does this _before_ clearing 
HCD_FLAG_HW_ACCESSIBLE.

> > Why is your system getting interrupts?  That is, which bits are set in 
> > the USBSTS register?
> BIT(5) and BIT(3) are setted, STS_IAA and STS_FLR.

STS_FLR is not set in the USBINTR register, but STS_IAA is.  So that's 
the one which matters.

> >> so, when the driver goes to sleep and changes the system state to
> >> suspend, the interrupt flag needs to be cleared.
> >>
> >> Signed-off-by: Longfang Liu <liulongfang@huawei.com>
> >> ---
> >>  drivers/usb/host/ehci-hub.c | 5 +++++
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
> >> index ce0eaf7..5b13825 100644
> >> --- a/drivers/usb/host/ehci-hub.c
> >> +++ b/drivers/usb/host/ehci-hub.c
> >> @@ -348,6 +348,11 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)
> >>  
> >>  	/* Any IAA cycle that started before the suspend is now invalid */
> >>  	end_iaa_cycle(ehci);
> >> +
> >> +	/* clear interrupt status */
> >> +	if (ehci->has_synopsys_hc_bug)
> >> +		ehci_writel(ehci, INTR_MASK | STS_FLR, &ehci->regs->status);
> > 
> > This is a very strange place to add your new code -- right in the middle 
> > of the IAA and unlink handling.  Why not put it in a more reasonable 
> > place?After the IAA is processed, clear the STS_IAA interrupt state flag.
> > 
> > Also, the patch description does not mention has_synopsys_hc_bug.  The 
> > meaning of this flag has no connection with the interrupt status 
> > register, so why do you use it here?
> Because of our USB IP comes from Synopsys, and the uncleared flage is also caused by
> special hardware design, in addition, we have not tested other manufacturers' USB
> controllers.We don’t know if other manufacturers’ designs have this problem,
> so this modification is only limited to this kind of design.

Clearing the STS_IAA flag won't hurt, no matter who manufactured the 
controller.  So your patch should look more like this:

+	/* Some Synopsys controllers mistakenly leave IAA turned on */
+	ehci_writel(ehci, STS_IAA, &ehci->regs->status);

And these lines should come before the "Any IAA cycle..." comment line.
Does that fix the problem?

Alan Stern
liulongfang Nov. 28, 2020, 3:22 a.m. UTC | #2
On 2020/11/27 23:47, Alan Stern Wrote:
> On Fri, Nov 27, 2020 at 10:29:03AM +0800, liulongfang wrote:

>> On 2020/11/27 0:08, Alan Stern Wrote:

>>> On Thu, Nov 26, 2020 at 11:34:33AM +0800, Longfang Liu wrote:

>>>> The system goes to suspend when using USB audio player. This causes

>>>> the USB device continuous send interrupt signal to system, When the

>>>> number of interrupts exceeds 100000, the system will forcibly close

>>>> the interrupts and output a calltrace error.

>>>

>>> This description is very confusing.  USB devices do not send interrupt 

>>> signals to the host.  Do you mean that the device sends a wakeup 

>>> request?  Or do you mean something else?

>> The irq type is IRQ_NONE,It's counted in the note_interrupt function.

>> From the analysis of the driver code, that are indeed  interrupt signals.

> 

> Above you wrote: "the USB device continuous send interrupt signal to 

> system".  But that's not correct.  The interrupt signals are sent by the 

> USB host controller, not by the USB audio device.

> 

OK, I will modify the description in the next patch.
> The patch description should mention that this happens only with some 

> Synopsys host controllers.

> 

>>>> When the system goes to suspend, the last interrupt is reported to

>>>> the driver. At this time, the system has set the state to suspend.

>>>> This causes the last interrupt to not be processed by the system and

>>>> not clear the interrupt state flag. This uncleared interrupt flag

>>>> constantly triggers new interrupt event. This causing the driver to

>>>> receive more than 100,000 interrupts, which causes the system to

>>>> forcibly close the interrupt report and report the calltrace error.

>>>

>>> If the driver receives an interrupt, it is supposed to process the event 

>>> even if the host controller is suspended.  And when ehci_irq() runs, it 

>>> clears the bits that are set in the USBSYS register.

>> When the host controller is suspended, the ehci_suspend() will clear

>> the HCD_FLAG_HW_ACCESSIBLE, and then usb_hcd_irq() will return IRQ_NONE

>> directly without calling ehci_irq().

> 

> Yes.  But ehci_bus_suspend() runs _before_ the host controller is 

> suspended.  While ehci_bus_suspend() is running, usb_hcd_irq() _will_ 

> call ehci_irq(), and ehci_irq() _will_ clear the status bits.

> 

> After the host controller is suspended it is not supposed to generate 

> any interrupt signals at all, because ehci_suspend() writes 0 to the 

> USBINTR register, and it does this _before_ clearing 

> HCD_FLAG_HW_ACCESSIBLE.

> 

According to this process, there should be no interruption storm problem,
but the current fact is that the problem has occurred, so the actual
execution process did not follow the correct process above.

>>> Why is your system getting interrupts?  That is, which bits are set in 

>>> the USBSTS register?

>> BIT(5) and BIT(3) are setted, STS_IAA and STS_FLR.

> 

> STS_FLR is not set in the USBINTR register, but STS_IAA is.  So that's 

> the one which matters.

> 

>>>> so, when the driver goes to sleep and changes the system state to

>>>> suspend, the interrupt flag needs to be cleared.

>>>>

>>>> Signed-off-by: Longfang Liu <liulongfang@huawei.com>

>>>> ---

>>>>  drivers/usb/host/ehci-hub.c | 5 +++++

>>>>  1 file changed, 5 insertions(+)

>>>>

>>>> diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c

>>>> index ce0eaf7..5b13825 100644

>>>> --- a/drivers/usb/host/ehci-hub.c

>>>> +++ b/drivers/usb/host/ehci-hub.c

>>>> @@ -348,6 +348,11 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)

>>>>  

>>>>  	/* Any IAA cycle that started before the suspend is now invalid */

>>>>  	end_iaa_cycle(ehci);

>>>> +

>>>> +	/* clear interrupt status */

>>>> +	if (ehci->has_synopsys_hc_bug)

>>>> +		ehci_writel(ehci, INTR_MASK | STS_FLR, &ehci->regs->status);

>>>

>>> This is a very strange place to add your new code -- right in the middle 

>>> of the IAA and unlink handling.  Why not put it in a more reasonable 

>>> place?After the IAA is processed, clear the STS_IAA interrupt state flag.

>>>

>>> Also, the patch description does not mention has_synopsys_hc_bug.  The 

>>> meaning of this flag has no connection with the interrupt status 

>>> register, so why do you use it here?

>> Because of our USB IP comes from Synopsys, and the uncleared flage is also caused by

>> special hardware design, in addition, we have not tested other manufacturers' USB

>> controllers.We don’t know if other manufacturers’ designs have this problem,

>> so this modification is only limited to this kind of design.

> 

> Clearing the STS_IAA flag won't hurt, no matter who manufactured the 

> controller.  So your patch should look more like this:

> 

> +	/* Some Synopsys controllers mistakenly leave IAA turned on */

> +	ehci_writel(ehci, STS_IAA, &ehci->regs->status);

> 

> And these lines should come before the "Any IAA cycle..." comment line.

> Does that fix the problem?

I will conduct a round of testing based on this modification
and provide the test results.
> 

> Alan Stern

> .

> 

Thanks.
Longfang Liu
liulongfang Jan. 11, 2021, 9:38 a.m. UTC | #3
On 2020/11/28 11:22, liulongfang Wrote:
> On 2020/11/27 23:47, Alan Stern Wrote:

>> On Fri, Nov 27, 2020 at 10:29:03AM +0800, liulongfang wrote:

>>> On 2020/11/27 0:08, Alan Stern Wrote:

>>>> On Thu, Nov 26, 2020 at 11:34:33AM +0800, Longfang Liu wrote:

>>>>> The system goes to suspend when using USB audio player. This causes

>>>>> the USB device continuous send interrupt signal to system, When the

>>>>> number of interrupts exceeds 100000, the system will forcibly close

>>>>> the interrupts and output a calltrace error.

>>>>

>>>> This description is very confusing.  USB devices do not send interrupt 

>>>> signals to the host.  Do you mean that the device sends a wakeup 

>>>> request?  Or do you mean something else?

>>> The irq type is IRQ_NONE,It's counted in the note_interrupt function.

>>> From the analysis of the driver code, that are indeed  interrupt signals.

>>

>> Above you wrote: "the USB device continuous send interrupt signal to 

>> system".  But that's not correct.  The interrupt signals are sent by the 

>> USB host controller, not by the USB audio device.

>>

> OK, I will modify the description in the next patch.

>> The patch description should mention that this happens only with some 

>> Synopsys host controllers.

>>

>>>>> When the system goes to suspend, the last interrupt is reported to

>>>>> the driver. At this time, the system has set the state to suspend.

>>>>> This causes the last interrupt to not be processed by the system and

>>>>> not clear the interrupt state flag. This uncleared interrupt flag

>>>>> constantly triggers new interrupt event. This causing the driver to

>>>>> receive more than 100,000 interrupts, which causes the system to

>>>>> forcibly close the interrupt report and report the calltrace error.

>>>>

>>>> If the driver receives an interrupt, it is supposed to process the event 

>>>> even if the host controller is suspended.  And when ehci_irq() runs, it 

>>>> clears the bits that are set in the USBSYS register.

>>> When the host controller is suspended, the ehci_suspend() will clear

>>> the HCD_FLAG_HW_ACCESSIBLE, and then usb_hcd_irq() will return IRQ_NONE

>>> directly without calling ehci_irq().

>>

>> Yes.  But ehci_bus_suspend() runs _before_ the host controller is 

>> suspended.  While ehci_bus_suspend() is running, usb_hcd_irq() _will_ 

>> call ehci_irq(), and ehci_irq() _will_ clear the status bits.

>>

>> After the host controller is suspended it is not supposed to generate 

>> any interrupt signals at all, because ehci_suspend() writes 0 to the 

>> USBINTR register, and it does this _before_ clearing 

>> HCD_FLAG_HW_ACCESSIBLE.

>>

> According to this process, there should be no interruption storm problem,

> but the current fact is that the problem has occurred, so the actual

> execution process did not follow the correct process above.

> 

>>>> Why is your system getting interrupts?  That is, which bits are set in 

>>>> the USBSTS register?

>>> BIT(5) and BIT(3) are setted, STS_IAA and STS_FLR.

>>

>> STS_FLR is not set in the USBINTR register, but STS_IAA is.  So that's 

>> the one which matters.

>>

>>>>> so, when the driver goes to sleep and changes the system state to

>>>>> suspend, the interrupt flag needs to be cleared.

>>>>>

>>>>> Signed-off-by: Longfang Liu <liulongfang@huawei.com>

>>>>> ---

>>>>>  drivers/usb/host/ehci-hub.c | 5 +++++

>>>>>  1 file changed, 5 insertions(+)

>>>>>

>>>>> diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c

>>>>> index ce0eaf7..5b13825 100644

>>>>> --- a/drivers/usb/host/ehci-hub.c

>>>>> +++ b/drivers/usb/host/ehci-hub.c

>>>>> @@ -348,6 +348,11 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)

>>>>>  

>>>>>  	/* Any IAA cycle that started before the suspend is now invalid */

>>>>>  	end_iaa_cycle(ehci);

>>>>> +

>>>>> +	/* clear interrupt status */

>>>>> +	if (ehci->has_synopsys_hc_bug)

>>>>> +		ehci_writel(ehci, INTR_MASK | STS_FLR, &ehci->regs->status);

>>>>

>>>> This is a very strange place to add your new code -- right in the middle 

>>>> of the IAA and unlink handling.  Why not put it in a more reasonable 

>>>> place?After the IAA is processed, clear the STS_IAA interrupt state flag.

>>>>

>>>> Also, the patch description does not mention has_synopsys_hc_bug.  The 

>>>> meaning of this flag has no connection with the interrupt status 

>>>> register, so why do you use it here?

>>> Because of our USB IP comes from Synopsys, and the uncleared flage is also caused by

>>> special hardware design, in addition, we have not tested other manufacturers' USB

>>> controllers.We don’t know if other manufacturers’ designs have this problem,

>>> so this modification is only limited to this kind of design.

>>

>> Clearing the STS_IAA flag won't hurt, no matter who manufactured the 

>> controller.  So your patch should look more like this:

>>

>> +	/* Some Synopsys controllers mistakenly leave IAA turned on */

>> +	ehci_writel(ehci, STS_IAA, &ehci->regs->status);

>>

>> And these lines should come before the "Any IAA cycle..." comment line.

>> Does that fix the problem?

> I will conduct a round of testing based on this modification

> and provide the test results.

>>

>> Alan Stern

>> .

>>

> Thanks.

> Longfang Liu

> .

> 

After continuous sleep and wake-up operation stress tests,
this problem can be solved by using a solution that
only cleans up STS_IAA
Thanks.
Longfang Liu
diff mbox series

Patch

diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
index ce0eaf7..5b13825 100644
--- a/drivers/usb/host/ehci-hub.c
+++ b/drivers/usb/host/ehci-hub.c
@@ -348,6 +348,11 @@  static int ehci_bus_suspend (struct usb_hcd *hcd)
 
 	/* Any IAA cycle that started before the suspend is now invalid */
 	end_iaa_cycle(ehci);
+
+	/* clear interrupt status */
+	if (ehci->has_synopsys_hc_bug)
+		ehci_writel(ehci, INTR_MASK | STS_FLR, &ehci->regs->status);
+
 	ehci_handle_start_intr_unlinks(ehci);
 	ehci_handle_intr_unlinks(ehci);
 	end_free_itds(ehci);