Message ID | 20210122104342.12451-1-stf_xl@wp.pl |
---|---|
State | New |
Headers | show |
Series | usb, xhci, rt2800usb: do not perform Soft Retry | expand |
On 22.01.21 at 12:56 Greg KH wrote: > On Fri, Jan 22, 2021 at 11:43:42AM +0100, stf_xl@wp.pl wrote: >> From: Stanislaw Gruszka <stf_xl@wp.pl> >> >> Since f8f80be501aa ("xhci: Use soft retry to recover faster from transaction >> errors") on some systems rt2800usb devices are unable to operate. Looks >> that due to firmware or hardware limitations of those devices, they >> require full recovery from USB Transaction Errors. >> >> To avoid the problem add URB transfer flag, that restore pre f8f80be501aa >> xhci behaviour when the flag is set. For now only add it only to rt2800usb >> driver. > > This feels like a really heavy hammer, to add a xhci flag for a single > broken device. > > Are you sure this is really needed? What does this device do on other > operating systems, do they have such a quirk for their host controller > driver? > > Or is this due to the specific host controller device hardware? Should > this be a xhci quirk for a specific pci device instead? Well, rt2800usb USB implementation does have a lot of potential for optimization since the very beginning (current throughput comparison 2 MiB/s vs 13 MiB/s with the original driver e.g.). That's why I'm using until today a self patched version (it's bound to cfg80211 meanwhile) of the original driver (rt5572sta), which doesn't have those problems at all. From my point of view, the goal should be to solve the real reason for the problem. The original driver works much better (leastwise here) and doesn't show this problem at all! But anyway, there is from my point of view a basic problem with xhci_hcd, which just seems not to be completely backward compatible to existing USB 2 drivers (see https://marc.info/?l=linux-usb&m=161130327411612&w=2) if the device is plugged to an USB 3.x interface. Thanks Andreas
On 22.1.2021 15.17, Andreas Hartmann wrote: > > On 22.01.21 at 12:56 Greg KH wrote: >> On Fri, Jan 22, 2021 at 11:43:42AM +0100, stf_xl@wp.pl wrote: >>> From: Stanislaw Gruszka <stf_xl@wp.pl> >>> >>> Since f8f80be501aa ("xhci: Use soft retry to recover faster from transaction >>> errors") on some systems rt2800usb devices are unable to operate. Looks >>> that due to firmware or hardware limitations of those devices, they >>> require full recovery from USB Transaction Errors. >>> >>> To avoid the problem add URB transfer flag, that restore pre f8f80be501aa >>> xhci behaviour when the flag is set. For now only add it only to rt2800usb >>> driver. >> >> This feels like a really heavy hammer, to add a xhci flag for a single >> broken device. >> >> Are you sure this is really needed? What does this device do on other >> operating systems, do they have such a quirk for their host controller >> driver? >> >> Or is this due to the specific host controller device hardware? Should >> this be a xhci quirk for a specific pci device instead? > > Well, rt2800usb USB implementation does have a lot of potential for > optimization since the very beginning (current throughput comparison > 2 MiB/s vs 13 MiB/s with the original driver e.g.). That's why I'm > using until today a self patched version (it's bound to cfg80211 > meanwhile) of the original driver (rt5572sta), which doesn't have those > problems at all. From my point of view, the goal should be to solve the > real reason for the problem. The original driver works much better > (leastwise here) and doesn't show this problem at all! Ok, so it could be a rt2800 driver issue, or it just hitting some unlucky sequence that triggers this. > > But anyway, there is from my point of view a basic problem with xhci_hcd, > which just seems not to be completely backward compatible to existing USB 2 > drivers (see https://marc.info/?l=linux-usb&m=161130327411612&w=2) if the > device is plugged to an USB 3.x interface. This looks like a different issue, lets keep it in its own thread. The xHCI usb host controller handles both USB 2 and USB 3 speeds. If the USB port is connected to a xHC controller then the xhci driver will be used. If the port is connected to a EHCI then the ehci driever is used. EHCI does not support USB3 speeds. It's very possible that something that worked behind a EHCI host has issues when connected to a xHCI host. -Mathias
On Fri, Jan 22, 2021 at 05:00:21PM +0200, Mathias Nyman wrote: > >> Or is this due to the specific host controller device hardware? Should > >> this be a xhci quirk for a specific pci device instead? > > Exactly, this should be checked. > Stanislaw, weren't there a few users already that saw this issue? There are 30+ users cc-ed in the bugzilla report. However I think some of them are not affected by issue originally reported by Bernhard. They just saw "WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state" message caused by different problem and added comment to this particular bug report. > Do we know what xHCI controllers they were using? What I can tell issue was reported mostly on ASMedia and AMD controllers. We can ask for exact vendor and device IDs and just add xhci->quirks flag. However I'm not entirely sure that xHCI hardware misbehave is actual root cause. I think equally probable is that connected device do not handle soft retry correctly. In that case disabling Soft Retry per device would be actually "lightest hammer" since other devices connected to the xHCI host could benefit from faster recovery. Is there way to debug/identify which side: host or device hardware misbehave when Soft Retry is performed ? Stanislaw
On Sat, Jan 23, 2021 at 11:14:19AM +0100, Stanislaw Gruszka wrote: > On Fri, Jan 22, 2021 at 05:00:21PM +0200, Mathias Nyman wrote: > > >> Or is this due to the specific host controller device hardware? Should > > >> this be a xhci quirk for a specific pci device instead? > > > > Exactly, this should be checked. > > Stanislaw, weren't there a few users already that saw this issue? > > There are 30+ users cc-ed in the bugzilla report. However I think some > of them are not affected by issue originally reported by Bernhard. > They just saw "WARN Set TR Deq Ptr cmd failed due to incorrect slot > or ep state" message caused by different problem and added comment > to this particular bug report. > > > Do we know what xHCI controllers they were using? > > What I can tell issue was reported mostly on ASMedia and AMD > controllers. We can ask for exact vendor and device IDs and > just add xhci->quirks flag. > > However I'm not entirely sure that xHCI hardware misbehave > is actual root cause. I think equally probable is that > connected device do not handle soft retry correctly. In that > case disabling Soft Retry per device would be actually > "lightest hammer" since other devices connected to the > xHCI host could benefit from faster recovery. > > Is there way to debug/identify which side: host or device > hardware misbehave when Soft Retry is performed ? So I think we do not have such possibility. If xhci quirk is something that can be accepted I'll precede with such patch. I'm going to ask bug reporters about xHCI conntroler PCI-id's ... Stanislaw
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c index e4473a551241..f1d82b3e6bba 100644 --- a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c @@ -214,6 +214,7 @@ void rt2x00usb_register_read_async(struct rt2x00_dev *rt2x00dev, usb_fill_control_urb(urb, usb_dev, usb_rcvctrlpipe(usb_dev, 0), (unsigned char *)(&rd->cr), &rd->reg, sizeof(rd->reg), rt2x00usb_register_read_async_cb, rd); + urb->transfer_flags |= URB_SOFT_RETRY_NOT_OK; usb_anchor_urb(urb, rt2x00dev->anchor); if (usb_submit_urb(urb, GFP_ATOMIC) < 0) { usb_unanchor_urb(urb); @@ -323,6 +324,7 @@ static bool rt2x00usb_kick_tx_entry(struct queue_entry *entry, void *data) usb_sndbulkpipe(usb_dev, entry->queue->usb_endpoint), entry->skb->data, length, rt2x00usb_interrupt_txdone, entry); + entry_priv->urb->transfer_flags |= URB_SOFT_RETRY_NOT_OK; status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC); if (status) { @@ -409,6 +411,7 @@ static bool rt2x00usb_kick_rx_entry(struct queue_entry *entry, void *data) usb_rcvbulkpipe(usb_dev, entry->queue->usb_endpoint), entry->skb->data, entry->skb->len, rt2x00usb_interrupt_rxdone, entry); + entry_priv->urb->transfer_flags |= URB_SOFT_RETRY_NOT_OK; status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC); if (status) { diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c index 357b149b20d3..140bac59dc32 100644 --- a/drivers/usb/core/urb.c +++ b/drivers/usb/core/urb.c @@ -495,7 +495,7 @@ int usb_submit_urb(struct urb *urb, gfp_t mem_flags) /* Check against a simple/standard policy */ allowed = (URB_NO_TRANSFER_DMA_MAP | URB_NO_INTERRUPT | URB_DIR_MASK | - URB_FREE_BUFFER); + URB_SOFT_RETRY_NOT_OK | URB_FREE_BUFFER); switch (xfertype) { case USB_ENDPOINT_XFER_BULK: case USB_ENDPOINT_XFER_INT: diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 5677b81c0915..6712e1a7735c 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -2302,7 +2302,8 @@ static int process_bulk_intr_td(struct xhci_hcd *xhci, struct xhci_td *td, remaining = 0; break; case COMP_USB_TRANSACTION_ERROR: - if ((ep_ring->err_count++ > MAX_SOFT_RETRY) || + if (td->urb->transfer_flags & URB_SOFT_RETRY_NOT_OK || + (ep_ring->err_count++ > MAX_SOFT_RETRY) || le32_to_cpu(slot_ctx->tt_info) & TT_SLOT) break; *status = 0; diff --git a/include/linux/usb.h b/include/linux/usb.h index 7d72c4e0713c..dcdac2f03263 100644 --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -1329,6 +1329,7 @@ extern int usb_disabled(void); #define URB_ISO_ASAP 0x0002 /* iso-only; use the first unexpired * slot in the schedule */ #define URB_NO_TRANSFER_DMA_MAP 0x0004 /* urb->transfer_dma valid on submit */ +#define URB_SOFT_RETRY_NOT_OK 0x0008 /* Avoid XHCI Soft Retry */ #define URB_ZERO_PACKET 0x0040 /* Finish bulk OUT with short packet */ #define URB_NO_INTERRUPT 0x0080 /* HINT: no non-error interrupt * needed */