diff mbox series

[v27,01/32] xhci: add helper to stop endpoint and wait for completion

Message ID 20240912193935.1916426-2-quic_wcheng@quicinc.com
State Superseded
Headers show
Series Introduce QC USB SND audio offloading support | expand

Commit Message

Wesley Cheng Sept. 12, 2024, 7:39 p.m. UTC
From: Mathias Nyman <mathias.nyman@linux.intel.com>

Expose xhci_stop_endpoint_sync() which is a synchronous variant of
xhci_queue_stop_endpoint().  This is useful for client drivers that are
using the secondary interrupters, and need to stop/clean up the current
session.  The stop endpoint command handler will also take care of cleaning
up the ring.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
---
 drivers/usb/host/xhci.c | 39 +++++++++++++++++++++++++++++++++++++++
 drivers/usb/host/xhci.h |  2 ++
 2 files changed, 41 insertions(+)

Comments

Michał Pecio Sept. 13, 2024, 8:32 a.m. UTC | #1
Hi,

> Expose xhci_stop_endpoint_sync() which is a synchronous variant of
> xhci_queue_stop_endpoint().  This is useful for client drivers that are
> using the secondary interrupters, and need to stop/clean up the current
> session.  The stop endpoint command handler will also take care of
> cleaning up the ring.

I'm not entirely sure what you meant by "cleaning up the ring" (maybe a
comment would be in order?), but I see nothing being done here after the
command completes and FYI xhci-ring.c will not run the default handler if
the command is queued with a completion, like here.

At least that's the case for certain command types and there is probably
a story behind each of them. I know that xhci_stop_device() queues a
Stop EP with completion (and also a few without(?)). Maybe it's a bug...

Regards,
Michal
Wesley Cheng Sept. 13, 2024, 10:14 p.m. UTC | #2
Hi Michal,

On 9/13/2024 1:32 AM, Michał Pecio wrote:
> Hi,
>
>> Expose xhci_stop_endpoint_sync() which is a synchronous variant of
>> xhci_queue_stop_endpoint().  This is useful for client drivers that are
>> using the secondary interrupters, and need to stop/clean up the current
>> session.  The stop endpoint command handler will also take care of
>> cleaning up the ring.
> I'm not entirely sure what you meant by "cleaning up the ring" (maybe a
> comment would be in order?), but I see nothing being done here after the
> command completes and FYI xhci-ring.c will not run the default handler if
> the command is queued with a completion, like here.
>
> At least that's the case for certain command types and there is probably
> a story behind each of them. I know that xhci_stop_device() queues a
> Stop EP with completion (and also a few without(?)). Maybe it's a bug...

Maybe the last sentence is not needed.  When we are using the secondary interrupters, at least in the offload use case that I've verified with, the XHCI is completely unaware of what TDs have been queued, etc...  So technically, even if we did call the default handler (ie xhci_handle_cmd_stop_ep), most of the routines to invalidate TDs are going to be no-ops.

Thanks

Wesley Cheng
Michał Pecio Sept. 15, 2024, 7:55 a.m. UTC | #3
Hi,

> Maybe the last sentence is not needed.  When we are using the
> secondary interrupters, at least in the offload use case that I've
> verified with, the XHCI is completely unaware of what TDs have been
> queued, etc...  So technically, even if we did call the default
> handler (ie xhci_handle_cmd_stop_ep), most of the routines to
> invalidate TDs are going to be no-ops.

Yes, the cancellation machinery will return immediately if there are
no TDs queued by xhci_hcd itself.

But xhci_handle_cmd_stop_ep() does a few more things for you - it
checks if the command has actually succeeded, clears any halt condition
which may be preventing stopping the endpoint, and it sometimes retries
the command (only on "bad" chips, AFAIK).

This new code does none of the above, so in the general case it can't
even guarantee that the endpoint is stopped when it returns zero. This
should ideally be documented in some way, or fixed, before somebody is
tempted to call it with unrealistically high expectations ;)

As far as I see, it only works for you because isochronous never halts
and Qualcomm HW is (hopefully) free of those stop-after-restart bugs.
There will be problems if the SB tries to use any other endpoint type.

Regards,
Michal
Wesley Cheng Sept. 20, 2024, 11:49 p.m. UTC | #4
Hi Michal,

On 9/15/2024 12:55 AM, Michał Pecio wrote:
> Hi,
>
>> Maybe the last sentence is not needed.  When we are using the
>> secondary interrupters, at least in the offload use case that I've
>> verified with, the XHCI is completely unaware of what TDs have been
>> queued, etc...  So technically, even if we did call the default
>> handler (ie xhci_handle_cmd_stop_ep), most of the routines to
>> invalidate TDs are going to be no-ops.
> Yes, the cancellation machinery will return immediately if there are
> no TDs queued by xhci_hcd itself.
>
> But xhci_handle_cmd_stop_ep() does a few more things for you - it
> checks if the command has actually succeeded, clears any halt condition
> which may be preventing stopping the endpoint, and it sometimes retries
> the command (only on "bad" chips, AFAIK).
>
> This new code does none of the above, so in the general case it can't
> even guarantee that the endpoint is stopped when it returns zero. This
> should ideally be documented in some way, or fixed, before somebody is
> tempted to call it with unrealistically high expectations ;)
>
> As far as I see, it only works for you because isochronous never halts
> and Qualcomm HW is (hopefully) free of those stop-after-restart bugs.
> There will be problems if the SB tries to use any other endpoint type.

So what I ended up doing was to split off the context error handling into a separate helper API, which can be also called for the sync ep stop API.  From there, based on say....the helper re queuing the stop EP command, it would return a specific value to signify that it has done so.  The sync based API will then re-wait for the completion of the subsequent stop endpoint command that was queued.  In all other context error cases, it'd return the error to the caller, and its up to them to handle it accordingly.

Thanks

Wesley Cheng
Michał Pecio Sept. 22, 2024, 11:23 p.m. UTC | #5
Hi,

> So what I ended up doing was to split off the context error handling
> into a separate helper API, which can be also called for the sync ep
> stop API.  From there, based on say....the helper re queuing the stop
> EP command, it would return a specific value to signify that it has
> done so.  The sync based API will then re-wait for the completion of
> the subsequent stop endpoint command that was queued.

AFAIK retries are only necessary on buggy hardware. I don't see them on
my controllers except for two old ones, both with the same buggy chip.

> In all other context error cases, it'd return the error to the caller,
> and its up to them to handle it accordingly.

For the record, all existing callers end up ignoring this return value.

Honestly, I don't know if improving this function is worth your effort
if it's working for you as-is. There are no users except xhci-sideband
and probably shouldn't be - besides failing to fix stalled endpoints,
this function also does nothing to prevent automatic restart of the EP
when new URBs are submitted through xhci_hcd, so it is mainly relevant
for sideband users who never submit URBs the usual way.

My issue with this function is that it is simply poorly documented what
it is or isn't expected to achieve (both here and in the calling code
in xhci-sideband.c), and the changelog message is wrong to suggest that
the default completion handler will run (unless somewhere there are
patches to make it happen), making it look like this code can do things
that it really cannot do. And this is apparently a public, exported API.

Regards,
Michal
Wesley Cheng Sept. 24, 2024, 9:47 p.m. UTC | #6
Hi Michal

On 9/22/2024 4:23 PM, Michał Pecio wrote:
> Hi,
>
>> So what I ended up doing was to split off the context error handling
>> into a separate helper API, which can be also called for the sync ep
>> stop API.  From there, based on say....the helper re queuing the stop
>> EP command, it would return a specific value to signify that it has
>> done so.  The sync based API will then re-wait for the completion of
>> the subsequent stop endpoint command that was queued.
> AFAIK retries are only necessary on buggy hardware. I don't see them on
> my controllers except for two old ones, both with the same buggy chip.
>
>>  In all other context error cases, it'd return the error to the caller,
>> and its up to them to handle it accordingly.
> For the record, all existing callers end up ignoring this return value.
>
> Honestly, I don't know if improving this function is worth your effort
> if it's working for you as-is. There are no users except xhci-sideband
> and probably shouldn't be - besides failing to fix stalled endpoints,
> this function also does nothing to prevent automatic restart of the EP
> when new URBs are submitted through xhci_hcd, so it is mainly relevant
> for sideband users who never submit URBs the usual way.
>
> My issue with this function is that it is simply poorly documented what
> it is or isn't expected to achieve (both here and in the calling code
> in xhci-sideband.c), and the changelog message is wrong to suggest that
> the default completion handler will run (unless somewhere there are
> patches to make it happen), making it look like this code can do things
> that it really cannot do. And this is apparently a public, exported API.

Thanks for the clarifications.  Yes, unfortunately, I can't really test any scenarios where this would be exercised in the current path, so I will leave the code out for now, and just add some comments and updates to the commit message.  Can revisit when there is some other users for utilizing secondary interrupters.

Thanks

Wesley Cheng
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ed1bb7ed44b0..1fafba95d407 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2784,6 +2784,45 @@  static int xhci_reserve_bandwidth(struct xhci_hcd *xhci,
 	return -ENOMEM;
 }
 
+/*
+ * Synchronous XHCI stop endpoint helper.  Issues the stop endpoint command and
+ * waits for the command completion before returning.
+ */
+int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, int suspend,
+			    gfp_t gfp_flags)
+{
+	struct xhci_command *command;
+	unsigned long flags;
+	int ret;
+
+	command = xhci_alloc_command(xhci, true, gfp_flags);
+	if (!command)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&xhci->lock, flags);
+	ret = xhci_queue_stop_endpoint(xhci, command, ep->vdev->slot_id,
+				       ep->ep_index, suspend);
+	if (ret < 0) {
+		spin_unlock_irqrestore(&xhci->lock, flags);
+		goto out;
+	}
+
+	xhci_ring_cmd_db(xhci);
+	spin_unlock_irqrestore(&xhci->lock, flags);
+
+	wait_for_completion(command->completion);
+
+	if (command->status == COMP_COMMAND_ABORTED ||
+	    command->status == COMP_COMMAND_RING_STOPPED) {
+		xhci_warn(xhci, "Timeout while waiting for stop endpoint command\n");
+		ret = -ETIME;
+	}
+out:
+	xhci_free_command(xhci, command);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(xhci_stop_endpoint_sync);
 
 /* Issue a configure endpoint command or evaluate context command
  * and wait for it to finish.
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 324644165d93..51a992d8ffcf 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1917,6 +1917,8 @@  void xhci_ring_doorbell_for_active_rings(struct xhci_hcd *xhci,
 void xhci_cleanup_command_queue(struct xhci_hcd *xhci);
 void inc_deq(struct xhci_hcd *xhci, struct xhci_ring *ring);
 unsigned int count_trbs(u64 addr, u64 len);
+int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
+			    int suspend, gfp_t gfp_flags);
 
 /* xHCI roothub code */
 void xhci_set_link_state(struct xhci_hcd *xhci, struct xhci_port *port,