Message ID | 20210607131442.20121-1-heikki.krogerus@linux.intel.com |
---|---|
Headers | show |
Series | usb: typec: ucsi: Polling the alt modes and PDOs | expand |
Hi Heikki, On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > This is the RFC series I promised [1]. Cool. > I'm sorry it took this long to prepare these. I had to concentrate on > other task for a while. > > Let me know if you still see any problems. Hmm, I am not sure this got better. I applied the patchset on top of of the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), I see the online state getting stuck at 1 occasionally. This can happen for example when quickly plugging and unplugging a USB-C charger. Benjamin > [1] > https://lore.kernel.org/linux-usb/YKT3oEt%2F9fX8k8mw@kuha.fi.intel.com/ > > thanks, > > Heikki Krogerus (7): > usb: typec: ucsi: Always cancel the command if PPM reports BUSY > condition > usb: typec: ucsi: Don't stop alt mode registration on busy > condition > usb: typec: ucsi: Add poll worker for alternate modes > usb: typec: ucsi: acpi: Reduce the command completion timeout > usb: typec: ucsi: Process every connector change as unique > connector > state > usb: typec: ucsi: Filter out spurious events > usb: typec: ucsi: Read the PDOs in separate work > > drivers/usb/typec/ucsi/ucsi.c | 317 ++++++++++++++++----------- > -- > drivers/usb/typec/ucsi/ucsi.h | 3 +- > drivers/usb/typec/ucsi/ucsi_acpi.c | 2 +- > 3 files changed, 183 insertions(+), 139 deletions(-) >
On Mon, Jun 07, 2021 at 10:09:58PM +0200, Benjamin Berg wrote: > Hi Heikki, > > On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > > This is the RFC series I promised [1]. > > Cool. > > > I'm sorry it took this long to prepare these. I had to concentrate on > > other task for a while. > > > > Let me know if you still see any problems. > > Hmm, I am not sure this got better. I applied the patchset on top of of > the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), > I see the online state getting stuck at 1 occasionally. This can happen > for example when quickly plugging and unplugging a USB-C charger. Please check does the partner device get removed. What do you have under /sys/class/typec after that happens? thanks, -- heikki
On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > On Mon, Jun 07, 2021 at 10:09:58PM +0200, Benjamin Berg wrote: > > Hi Heikki, > > > > On Mon, 2021-06-07 at 16:14 +0300, Heikki Krogerus wrote: > > > This is the RFC series I promised [1]. > > > > Cool. > > > > > I'm sorry it took this long to prepare these. I had to concentrate on > > > other task for a while. > > > > > > Let me know if you still see any problems. > > > > Hmm, I am not sure this got better. I applied the patchset on top of of > > the 5.12.9 Fedora 34 kernel. On the machine in question (X1 Carbon 8), > > I see the online state getting stuck at 1 occasionally. This can happen > > for example when quickly plugging and unplugging a USB-C charger. > > Please check does the partner device get removed. What do you have > under /sys/class/typec after that happens? Oh yes. Could you also share the trace output when that happens? cd /sys/kernel/debug/tracing echo 1 > events/ucsi/enable # now reproduce the issue cat trace > ucsi.trace thanks, -- heikki
On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > Please check does the partner device get removed. What do you have > > under /sys/class/typec after that happens? > > Oh yes. Could you also share the trace output when that happens? > > cd /sys/kernel/debug/tracing > echo 1 > events/ucsi/enable > # now reproduce the issue > cat trace > ucsi.trace So, the partner device is still there when this happens (see below). I also only see a single event in the trace for the fast plug/unplug case: kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 The typec port/partner states is: port1-partner/accessory_mode:none port1-partner/supports_usb_power_delivery:no port1-partner/uevent:DEVTYPE=typec_partner port1-partner/usb_power_delivery_revision:0.0 and port0/data_role:host [device] port0/power_operation_mode:default port0/power_role:source [sink] port0/supported_accessory_modes:none port0/uevent:DEVTYPE=typec_port port0/uevent:TYPEC_PORT=port0 port0/usb_power_delivery_revision:2.0 port0/usb_typec_revision:1.0 port0/vconn_source:no port1/data_role:host [device] port1/power_operation_mode:3.0A port1/power_role:source [sink] port1/supported_accessory_modes:none port1/uevent:DEVTYPE=typec_port port1/uevent:TYPEC_PORT=port1 port1/usb_power_delivery_revision:2.0 port1/usb_typec_revision:1.0 port1/vconn_source:no Note that for a normal plug I am usually getting a second event. This second event is occasionally missing though: kworker/u16:38-1800 [001] .... 19522.325885: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:38-1800 [004] .... 19522.552613: ucsi_connector_change: port1 status: change=0044, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 and a working unplug consistently looks like: kworker/u16:8-1771 [003] .... 19670.020085: ucsi_connector_change: port1 status: change=4804, opmode=0, connected=0, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=0 Benjamin
On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > Please check does the partner device get removed. What do you have > > > under /sys/class/typec after that happens? > > > > Oh yes. Could you also share the trace output when that happens? > > > > cd /sys/kernel/debug/tracing > > echo 1 > events/ucsi/enable > > # now reproduce the issue > > cat trace > ucsi.trace > > So, the partner device is still there when this happens (see below). I > also only see a single event in the trace for the fast plug/unplug > case: > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 OK. Sorry I had to double check because you were only talking about the psy online state. Can you now try this HACK on top of these patches: diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index bd39fe2cb1d0b..99f072700ce7f 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) if (!status.change) { dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); - goto out_ack; + /* XXX Force connection check. */ + status.change = UCSI_CONSTAT_CONNECT_CHANGE; } event = kzalloc(sizeof(*event), GFP_KERNEL); thanks, -- heikki
On Wed, Jun 09, 2021 at 02:26:00PM +0300, Heikki Krogerus wrote: > On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > > Please check does the partner device get removed. What do you have > > > > under /sys/class/typec after that happens? > > > > > > Oh yes. Could you also share the trace output when that happens? > > > > > > cd /sys/kernel/debug/tracing > > > echo 1 > events/ucsi/enable > > > # now reproduce the issue > > > cat trace > ucsi.trace > > > > So, the partner device is still there when this happens (see below). I > > also only see a single event in the trace for the fast plug/unplug > > case: > > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 > > OK. Sorry I had to double check because you were only talking about > the psy online state. > > Can you now try this HACK on top of these patches: > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > index bd39fe2cb1d0b..99f072700ce7f 100644 > --- a/drivers/usb/typec/ucsi/ucsi.c > +++ b/drivers/usb/typec/ucsi/ucsi.c > @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > if (!status.change) { > dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); > - goto out_ack; > + /* XXX Force connection check. */ > + status.change = UCSI_CONSTAT_CONNECT_CHANGE; > } > > event = kzalloc(sizeof(*event), GFP_KERNEL); No, that's not enough. Sorry. I'm trying to get a confirmation on my suspecion that we do always actually get an event from the EC firmware, but we just end up filtering it out in this case because we are too slow in the driver. I have an idea what could be done about that, but I need to test if that really is the case. I'll prepare a new version out of this entire series. thanks, -- heikki
On Wed, Jun 09, 2021 at 03:18:04PM +0300, Heikki Krogerus wrote: > On Wed, Jun 09, 2021 at 02:26:00PM +0300, Heikki Krogerus wrote: > > On Tue, Jun 08, 2021 at 09:32:01PM +0200, Benjamin Berg wrote: > > > On Tue, 2021-06-08 at 09:54 +0300, Heikki Krogerus wrote: > > > > On Tue, Jun 08, 2021 at 09:42:09AM +0300, Heikki Krogerus wrote: > > > > > Please check does the partner device get removed. What do you have > > > > > under /sys/class/typec after that happens? > > > > > > > > Oh yes. Could you also share the trace output when that happens? > > > > > > > > cd /sys/kernel/debug/tracing > > > > echo 1 > events/ucsi/enable > > > > # now reproduce the issue > > > > cat trace > ucsi.trace > > > > > > So, the partner device is still there when this happens (see below). I > > > also only see a single event in the trace for the fast plug/unplug > > > case: > > > kworker/u16:8-1771 [003] .... 18848.872145: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 > > > > OK. Sorry I had to double check because you were only talking about > > the psy online state. > > > > Can you now try this HACK on top of these patches: > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > > index bd39fe2cb1d0b..99f072700ce7f 100644 > > --- a/drivers/usb/typec/ucsi/ucsi.c > > +++ b/drivers/usb/typec/ucsi/ucsi.c > > @@ -843,7 +843,8 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > > > if (!status.change) { > > dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); > > - goto out_ack; > > + /* XXX Force connection check. */ > > + status.change = UCSI_CONSTAT_CONNECT_CHANGE; > > } > > > > event = kzalloc(sizeof(*event), GFP_KERNEL); > > No, that's not enough. Sorry. > > I'm trying to get a confirmation on my suspecion that we do always > actually get an event from the EC firmware, but we just end up > filtering it out in this case because we are too slow in the driver. I > have an idea what could be done about that, but I need to test if that > really is the case. > > I'll prepare a new version out of this entire series. Actually, it's easier if you could just test this attached patch on top of this series. It makes sure the every single event is considered. I'm sorry about the hassle. thanks, -- heikki From 9a5fc7d8e22ca5d0077a97345b8f079957a465b5 Mon Sep 17 00:00:00 2001 From: Heikki Krogerus <heikki.krogerus@linux.intel.com> Date: Wed, 9 Jun 2021 15:48:23 +0300 Subject: [PATCH] usb: typec: ucsi: Test patch for confirming events NOT-Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> --- drivers/usb/typec/ucsi/ucsi.c | 80 ++++++++++++----------------------- 1 file changed, 27 insertions(+), 53 deletions(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index bd39fe2cb1d0b..53718e655edbf 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -748,16 +748,34 @@ static void ucsi_connector_work(struct work_struct *work) { struct ucsi_con_event *event = container_of(work, struct ucsi_con_event, work); struct ucsi_connector *con = event->con; + struct ucsi_connector_status status; struct ucsi *ucsi = con->ucsi; enum typec_role role; enum usb_role u_role = USB_ROLE_NONE; + u64 command; int ret; + kfree(event); + + command = UCSI_GET_CONNECTOR_STATUS | UCSI_CONNECTOR_NUMBER(con->num); + ret = ucsi_send_command(ucsi, command, &status, sizeof(status)); + if (ret < 0) + dev_err(ucsi->dev, "GET_CONNECTOR_STATUS failed (%d)\n", ret); + + if (!status.change) { + dev_dbg(ucsi->dev, "con%d: spurious event\n", con->num); + /* XXX: Force connection check. */ + status.change = UCSI_CONSTAT_CONNECT_CHANGE; + } + + ret = ucsi_acknowledge_connector_change(ucsi); + if (ret) + dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret); + mutex_lock(&con->lock); - trace_ucsi_connector_change(con->num, &event->status); - con->status = event->status; - kfree(event); + trace_ucsi_connector_change(con->num, &status); + con->status = status; role = !!(con->status.flags & UCSI_CONSTAT_PWR_DIR); @@ -816,53 +834,6 @@ static void ucsi_connector_work(struct work_struct *work) mutex_unlock(&con->lock); } -/* - * We can not read the connector status in ucsi_connector_change() function - * below because there may be already a command pending. This work is scheduled - * separately only because of that. - * - * This function must finish fast so we do not loose the next events. Every - * event will have a separate job queued for it in the connector specific - * workqueue. That way the next event can be generated safely before the - * previous ones are fully processed. - */ -static void ucsi_handle_connector_change(struct work_struct *work) -{ - struct ucsi_connector *con = container_of(work, struct ucsi_connector, work); - struct ucsi_connector_status status; - struct ucsi_con_event *event; - u64 command; - int ret; - - command = UCSI_GET_CONNECTOR_STATUS | UCSI_CONNECTOR_NUMBER(con->num); - ret = ucsi_send_command(con->ucsi, command, &status, sizeof(status)); - if (ret < 0) { - dev_err(con->ucsi->dev, "GET_CONNECTOR_STATUS failed (%d)\n", ret); - goto out_ack; - } - - if (!status.change) { - dev_dbg(con->ucsi->dev, "con%d: spurious event\n", con->num); - goto out_ack; - } - - event = kzalloc(sizeof(*event), GFP_KERNEL); - if (!event) - goto out_ack; - - INIT_WORK(&event->work, ucsi_connector_work); - event->status = status; - event->con = con; - queue_work(con->wq, &event->work); - -out_ack: - clear_bit(EVENT_PENDING, &con->ucsi->flags); - - ret = ucsi_acknowledge_connector_change(con->ucsi); - if (ret) - dev_err(con->ucsi->dev, "%s: ACK failed (%d)", __func__, ret); -} - /** * ucsi_connector_change - Process Connector Change Event * @ucsi: UCSI Interface @@ -871,16 +842,20 @@ static void ucsi_handle_connector_change(struct work_struct *work) void ucsi_connector_change(struct ucsi *ucsi, u8 num) { struct ucsi_connector *con = &ucsi->connector[num - 1]; + struct ucsi_con_event *event; if (!(ucsi->ntfy & UCSI_ENABLE_NTFY_CONNECTOR_CHANGE)) { dev_dbg(ucsi->dev, "Bogus connector change event\n"); return; } - if (test_and_set_bit(EVENT_PENDING, &ucsi->flags)) + event = kzalloc(sizeof(*event), GFP_KERNEL); + if (!event) return; - schedule_work(&con->work); + INIT_WORK(&event->work, ucsi_connector_work); + event->con = con; + queue_work(con->wq, &event->work); } EXPORT_SYMBOL_GPL(ucsi_connector_change); @@ -1078,7 +1053,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index) if (!con->wq) return -ENOMEM; - INIT_WORK(&con->work, ucsi_handle_connector_change); init_completion(&con->complete); mutex_init(&con->lock); con->num = index + 1; -- 2.30.2
On Wed, 2021-06-09 at 15:56 +0300, Heikki Krogerus wrote: > On Wed, Jun 09, 2021 at 03:18:04PM +0300, Heikki Krogerus wrote: > > > > I'm trying to get a confirmation on my suspecion that we do always > > actually get an event from the EC firmware, but we just end up > > filtering it out in this case because we are too slow in the driver. > > I > > have an idea what could be done about that, but I need to test if > > that > > really is the case. > > > > I'll prepare a new version out of this entire series. > > Actually, it's easier if you could just test this attached patch on > top of this series. It makes sure the every single event is > considered. I'm sorry about the hassle. No worries! I probably should have included some more information earlier (i.e. enabling the debug print for spurious events). With the patch, I am seeing the following on plug kworker/u16:1-6847 [002] .... 1375.485010: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [006] .... 1375.561811: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [007] .... 1375.634275: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:2-6848 [003] .... 1375.743161: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 and unplug kworker/u16:1-6847 [005] .... 1394.062501: ucsi_connector_change: port1 status: change=4804, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 kworker/u16:1-6847 [005] .... 1394.161612: ucsi_connector_change: port1 status: change=4000, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 kworker/u16:1-6847 [005] .... 1394.251503: ucsi_connector_change: port1 status: change=4000, opmode=0, connected=0, sourcing=0, partner_flags=0, partner_type=0, request_data_obj=00000000, BC status=0 where all but the first event are spurious events. I believe that in the above spurious event with the change to opmode=3, the PPM should be reporting change=0004 (i.e. UCSI_CONSTAT_POWER_OPMODE_CHANGE). Occasionally I also see the following on plug. Note the non-spurious event with change=0040 (UCSI_CONSTAT_POWER_LEVEL_CHANGE) right before the event where opmode changes. kworker/u16:11-2201 [001] .... 3240.124431: ucsi_connector_change: port1 status: change=4a04, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.222799: ucsi_connector_change: port1 status: change=4000, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.325946: ucsi_connector_change: port1 status: change=0040, opmode=5, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.423503: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [003] .... 3240.861986: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 kworker/u16:3-7469 [007] .... 3240.999048: ucsi_connector_change: port1 status: change=4000, opmode=3, connected=1, sourcing=0, partner_flags=1, partner_type=1, request_data_obj=1304b12c, BC status=1 My thought when I first ran into the issue was that the PPM simply resets the change bitfield on ACK, effectively discarding any changes that happened after the last GET_CONNECTOR_STATUS call. I believed to have confirmed this by inserting an msleep in between. However, I have to admit that I have never really looked for alternative explanations. Benjamin
On Wed, Jun 09, 2021 at 07:39:41PM +0200, Benjamin Berg wrote: > My thought when I first ran into the issue was that the PPM simply > resets the change bitfield on ACK, effectively discarding any changes > that happened after the last GET_CONNECTOR_STATUS call. I believed to > have confirmed this by inserting an msleep in between. > However, I have to admit that I have never really looked for > alternative explanations. Thanks a lot for testing these. I now have pretty good idea about the problem. The problem is not what you though it is, but the driver is also not too slow like I suspected. I'm quite certain now that the PPM simply does not create the event at all in this case, regardless of what the driver does. We do need to workaround that problem, but I think we can do it in a better way. I have an idea for that. thanks, -- heikki