Message ID | 20220225071506.22012-1-henryl@nvidia.com |
---|---|
State | Superseded |
Headers | show |
Series | [v2] xhci: fix runtime PM imbalance in USB2 resume | expand |
On 25.2.2022 9.15, Henry Lin wrote: > USB2 resume starts with usb_hcd_start_port_resume() in port status > change handling for RESUME link state. usb_hcd_end_port_resume() call is > needed to keep runtime PM balance. For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume has been signaled for long enough in xhci_handle_usb2_port_link_resume(). This is also where driver directs the port to go from Resume state to U0. Port can't do this without driver directing it. If there's a failure during resume signaling (disconnect, reset, error) then stale resume variables are detected in xhci_get_port_status() and usb_hcd_end_port_resume() is called. I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") does change order of checking and clearing stale resume variables, but this should only happen if the first port state we read is a fully enabled functional U0 state after a failed resume. Could you expand a bit how this was detected? > > Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") > Signed-off-by: Henry Lin <henryl@nvidia.com> > --- > drivers/usb/host/xhci-hub.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c > index df3522dab31b..4a8b07b8ee01 100644 > --- a/drivers/usb/host/xhci-hub.c > +++ b/drivers/usb/host/xhci-hub.c > @@ -1090,6 +1090,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status, > if (link_state == XDEV_U0) { > bus_state->resume_done[portnum] = 0; > clear_bit(portnum, &bus_state->resuming_ports); > + usb_hcd_end_port_resume(&port->rhub->hcd->self, > + portnum); This will call usb_hcd_end_port_resume() every time port is in normal enabled U0 state even if resume was never signaled, or port suspended. -Mathias
>> USB2 resume starts with usb_hcd_start_port_resume() in port status >> change handling for RESUME link state. usb_hcd_end_port_resume() call is >> needed to keep runtime PM balance. > For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume > has been signaled for long enough in xhci_handle_usb2_port_link_resume(). > > This is also where driver directs the port to go from Resume state to U0. > Port can't do this without driver directing it. > > If there's a failure during resume signaling (disconnect, reset, error) then > stale resume variables are detected in xhci_get_port_status() and > usb_hcd_end_port_resume() is called. > I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") > does change order of checking and clearing stale resume variables, but this should > only happen if the first port state we read is a fully enabled functional U0 state after > a failed resume. > Could you expand a bit how this was detected? We observed the racing issue when usb2 device-initiated resume occurs in system resume. If usb2 host-initiated resume for system resume directs U0 before xhci_get_usb2_port_status() see RESUME state, xhci_get_usb2_port_status() will not finish resume process in xhci_handle_usb2_port_link_resume(). Its scenario is as follows: 1. System resume starts. All driver system resume callbacks get called in order. XHCI controller is resumed by xhci_resume(). 2. USB2 root hub is resuming. hcd_bus_resume() is being executed. 3. Before xhci_bus_resume() is finished, XHCI driver receives a port status change event for an USB2 port with RESUME link state in xhci_irq(). XHCI driver starts the process to resume HS port for device-initiated resume. 4. In xhci_bus_resume(), host-initiated resume (direct U0) is performed on the same port that is resuming in step 3 in below loop: if (bus_state->bus_suspended) { spin_unlock_irqrestore(&xhci->lock, flags); msleep(USB_RESUME_TIMEOUT); spin_lock_irqsave(&xhci->lock, flags); } for_each_set_bit(port_index, &bus_state->bus_suspended, BITS_PER_LONG) { /* Clear PLC to poll it later for U0 transition */ xhci_test_and_clear_bit(xhci, ports[port_index], PORT_PLC); xhci_set_link_state(xhci, ports[port_index], XDEV_U0); } 5. Then, link state of the resuming port is observed as U0 in following xhci_get_usb2_port_status(). xhci_handle_usb2_port_link_resume() has no chance to get called on the resuming port. Thanks, Henry
On 1.3.2022 12.28, Henry Lin wrote: >>> USB2 resume starts with usb_hcd_start_port_resume() in port status >>> change handling for RESUME link state. usb_hcd_end_port_resume() call is >>> needed to keep runtime PM balance. > >> For normal usb2 port resume the usb_hcd_end_port_resume() is called when resume >> has been signaled for long enough in xhci_handle_usb2_port_link_resume(). >> >> This is also where driver directs the port to go from Resume state to U0. >> Port can't do this without driver directing it. >> >> If there's a failure during resume signaling (disconnect, reset, error) then >> stale resume variables are detected in xhci_get_port_status() and >> usb_hcd_end_port_resume() is called. > >> I do now see a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") >> does change order of checking and clearing stale resume variables, but this should >> only happen if the first port state we read is a fully enabled functional U0 state after >> a failed resume. > >> Could you expand a bit how this was detected? > We observed the racing issue when usb2 device-initiated resume occurs in system resume. > If usb2 host-initiated resume for system resume directs U0 before xhci_get_usb2_port_status() > see RESUME state, xhci_get_usb2_port_status() will not finish resume process in > xhci_handle_usb2_port_link_resume(). Its scenario is as follows: > > 1. System resume starts. All driver system resume callbacks get called in order. XHCI controller > is resumed by xhci_resume(). > 2. USB2 root hub is resuming. hcd_bus_resume() is being executed. > 3. Before xhci_bus_resume() is finished, XHCI driver receives a port status change event for > an USB2 port with RESUME link state in xhci_irq(). XHCI driver starts the process to resume > HS port for device-initiated resume. > 4. In xhci_bus_resume(), host-initiated resume (direct U0) is performed on the same port that is > resuming in step 3 in below loop: > > if (bus_state->bus_suspended) { > spin_unlock_irqrestore(&xhci->lock, flags); > msleep(USB_RESUME_TIMEOUT); > spin_lock_irqsave(&xhci->lock, flags); > } > for_each_set_bit(port_index, &bus_state->bus_suspended, > BITS_PER_LONG) { > /* Clear PLC to poll it later for U0 transition */ > xhci_test_and_clear_bit(xhci, ports[port_index], > PORT_PLC); > xhci_set_link_state(xhci, ports[port_index], XDEV_U0); > } > 5. Then, link state of the resuming port is observed as U0 in following > xhci_get_usb2_port_status(). xhci_handle_usb2_port_link_resume() has > no chance to get called on the resuming port. > True, thanks for the explanation. If there's a race between system resume and device-initiated resume, and port is resumed in xhci_bus_resume() then yes I see how this could happen. Maybe only call usb_hcd_end_port_resume() if xhci_irq() detected the device-initiated resume. i.e. set a value to resume_done[portnum] and called usb_hcd_start_port_resume() something like: @@ -1088,6 +1088,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status, if (link_state == XDEV_U2) *status |= USB_PORT_STAT_L1; if (link_state == XDEV_U0) { + if (bus_state->resume_done[portnum]) + usb_hcd_end_port_resume(&port->rhub->hcd->self, portnum); bus_state->resume_done[portnum] = 0; clear_bit(portnum, &bus_state->resuming_ports); Also xhci_bus_resume() only resumes ports that were forcefully suspended to U3 in xhci_bus_suspend(). Could be worth checking why that device wasn't already in U3 when suspend reached xhci_bus_suspend(). Thanks Mathias
> Maybe only call usb_hcd_end_port_resume() if xhci_irq() detected the device-initiated > resume. i.e. set a value to resume_done[portnum] and called usb_hcd_start_port_resume() > something like: > > @@ -1088,6 +1088,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status, > if (link_state == XDEV_U2) > *status |= USB_PORT_STAT_L1; > if (link_state == XDEV_U0) { > + if (bus_state->resume_done[portnum]) > + usb_hcd_end_port_resume(&port->rhub->hcd->self, portnum); > bus_state->resume_done[portnum] = 0; > clear_bit(portnum, &bus_state->resuming_ports); This looks good. I can submit a new version based on this. > Also xhci_bus_resume() only resumes ports that were forcefully suspended to U3 in xhci_bus_suspend(). > Could be worth checking why that device wasn't already in U3 when suspend reached xhci_bus_suspend(). It happens when we trigger USB device-initiated resume to bring system out of system suspend state. For example, we can connect an USB2 external hub on root port and put system into suspend state. The USB2 external hub (the port it connects to) is in U3 after xhci_bus_suspend(). Once we connect a USB device to downstream port of the USB2 external hub, the hub will trigger device initiated resume to wake up the system. During system resume, XHCI controller will report the USB2 external hub is in RESUME state. Thanks, Henry
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index df3522dab31b..4a8b07b8ee01 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1090,6 +1090,8 @@ static void xhci_get_usb2_port_status(struct xhci_port *port, u32 *status, if (link_state == XDEV_U0) { bus_state->resume_done[portnum] = 0; clear_bit(portnum, &bus_state->resuming_ports); + usb_hcd_end_port_resume(&port->rhub->hcd->self, + portnum); if (bus_state->suspended_ports & (1 << portnum)) { bus_state->suspended_ports &= ~(1 << portnum); bus_state->port_c_suspend |= 1 << portnum;
USB2 resume starts with usb_hcd_start_port_resume() in port status change handling for RESUME link state. usb_hcd_end_port_resume() call is needed to keep runtime PM balance. Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") Signed-off-by: Henry Lin <henryl@nvidia.com> --- drivers/usb/host/xhci-hub.c | 2 ++ 1 file changed, 2 insertions(+)