Message ID | 20240426225739.2166-2-kl@kl.wtf |
---|---|
State | New |
Headers | show |
Series | HID: i2c-hid: Probe and wake device with HID descriptor fetch | expand |
Hi Kenny, Lukasz, On Sat, Apr 27, 2024 at 12:47:07AM +0200, Kenny Levinsen wrote: > To avoid error messages when a device is not present, b3a81b6c4fc6 added > an initial bus probe using a dummy i2c_smbus_read_byte() call. > > Without this probe, i2c_hid_fetch_hid_descriptor() will fail internally > on a bus error and log. Treat the bus error as a missing device and > remove the error log so we can do away with the probe. > > Tested-by: Lukasz Majczak <lma@chromium.org> > Reviewed-by: Lukasz Majczak <lma@chromium.org> > Signed-off-by: Kenny Levinsen <kl@kl.wtf> > --- > drivers/hid/i2c-hid/i2c-hid-core.c | 21 ++++++--------------- > 1 file changed, 6 insertions(+), 15 deletions(-) > > diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c > index d965382196c6..6ffa43d245b4 100644 > --- a/drivers/hid/i2c-hid/i2c-hid-core.c > +++ b/drivers/hid/i2c-hid/i2c-hid-core.c > @@ -872,12 +872,11 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid) > ihid->wHIDDescRegister, > &ihid->hdesc, > sizeof(ihid->hdesc)); > - if (error) { > - dev_err(&ihid->client->dev, > - "failed to fetch HID descriptor: %d\n", > - error); > - return -ENODEV; > - } > + > + /* The i2c drivers are a bit inconsistent with their error > + * codes, so treat everything as -ENXIO for now. */ > + if (error) > + return -ENXIO; I really think we should differentiate the cases "we do not know if there is a device" vs "we do known that there is a device and we have strong expectation of what that device is, and we do not expect communication to fail". Because of that I think we should have separate retry for the original i2c_smbus_read_byte() (if you want you can make a wrapper around it, something like i2c_hid_check_device_present()", and if there is a concern that we will run into similar issue on resume (either from suspend or from hibernate) then we can have similar retry in i2c_hid_fetch_hid_descriptor(). But I do think that i2c_hid_fetch_hid_descriptor() should not try to clobber the error but rather log the true one. > } > > /* Validate the length of HID descriptor, the 4 first bytes: > @@ -992,17 +991,9 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid) > struct hid_device *hid = ihid->hid; > int ret; > > - /* Make sure there is something at this address */ > - ret = i2c_smbus_read_byte(client); > - if (ret < 0) { > - i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret); > - return -ENXIO; > - } > - > ret = i2c_hid_fetch_hid_descriptor(ihid); > if (ret < 0) { > - dev_err(&client->dev, > - "Failed to fetch the HID Descriptor\n"); > + i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret); > return ret; > } > > -- > 2.44.0 > > Thanks.
On 4/27/24 5:21 AM, Dmitry Torokhov wrote: > I really think we should differentiate the cases "we do not know if > there is a device" vs "we do known that there is a device and we have > strong expectation of what that device is, and we do not expect > communication to fail". My reasoning was that there is no difference between looking for address acknowledge on a probe read vs. a real command. Unfortunately, I ran into some issues with error code consistency that Doug highlighted... Considering that the smbus probe bails on *any* error, it's really only ACK'd address + NACK'd register that remains, and I thought it maybe wouldn't be too harmful to just always have a debug log as suggested. However, I would still like *more* good errors by being specific about the error condition, and I plan to send some patches to get the number of drivers sending ENXIO up so we can comfortably rely on it in a future i2c-hid patch. If you don't think it's acceptable to leave this as a pure debug print for now, I'll send a patch with just a minor clean-up and Łukasz' delays - then I'll try this again later when the driver situation has improved. I've been rapid-firing revisions, so I'll await an ACK this time... :) --- For some context, I started looking at the i2c-hid driver after a recent regression around assumed Windows behavior, and found that the actual behavior differed a lot from our assumptions. Windows gets the job done notably quicker, with fewer messages and with shorter albeit differently placed delays. My plan is to send patches that clean up and aligns our traffic more, speeding things up and hopefully deprecating some quirks. To that end, I would also like to get rid of the dummy read once we're comfortable with it.
On Sat, Apr 27, 2024 at 03:20:07PM +0200, Kenny Levinsen wrote: > On 4/27/24 5:21 AM, Dmitry Torokhov wrote: > > I really think we should differentiate the cases "we do not know if > > there is a device" vs "we do known that there is a device and we have > > strong expectation of what that device is, and we do not expect > > communication to fail". > > My reasoning was that there is no difference between looking for address > acknowledge on a probe read vs. a real command. Unfortunately, I ran into > some issues with error code consistency that Doug highlighted... I actually believe there is. On Chromebooks we may source components from several vendors and use them in our devices. The components are electrically compatible with each other, have exactly the same connector, and therefore interchangeable. Because of that at probe time we do not quite know if the device is there at given address, or not (i.e. the touchpad could be from a different vendor and listening on another address) and we need to make a quick determination whether we should continue with probe or not. Once we decided that the probe should continue we commit to it and all subsequent errors from the device should be treated as unexpected errors worthy of logging. On resume we do not expect hardware configuration to change, so again when resuming we also treat errors as errors and log them and fail resume. > > Considering that the smbus probe bails on *any* error, it's really only > ACK'd address + NACK'd register that remains, and I thought it maybe > wouldn't be too harmful to just always have a debug log as suggested. > However, I would still like *more* good errors by being specific about the > error condition, and I plan to send some patches to get the number of > drivers sending ENXIO up so we can comfortably rely on it in a future > i2c-hid patch. > > If you don't think it's acceptable to leave this as a pure debug print for > now, I'll send a patch with just a minor clean-up and Łukasz' delays - then > I'll try this again later when the driver situation has improved. I've been > rapid-firing revisions, so I'll await an ACK this time... :) > > --- > > For some context, I started looking at the i2c-hid driver after a recent > regression around assumed Windows behavior, and found that the actual > behavior differed a lot from our assumptions. Windows gets the job done > notably quicker, with fewer messages and with shorter albeit differently > placed delays. I am not sure we can fully unify what Windows does and what Linux does, mainly because our firmwares are different (I think Windows devices do a lot more device discovery in firmware, Chrome OS historically tried to limit amount of code in its firmware). We also need to make sure it works on non-ACPI systems/ARM. Thanks.
On 4/30/24 11:41 PM, Dmitry Torokhov wrote: > I actually believe there is. On Chromebooks we may source components > from several vendors and use them in our devices. The components > are electrically compatible with each other, have exactly the same > connector, and therefore interchangeable. Because of that at probe time > we do not quite know if the device is there at given address, or not > (i.e. the touchpad could be from a different vendor and listening on > another address) and we need to make a quick determination whether we > should continue with probe or not. Maybe I should clarify what I meant: All I2C operations start with the master writing the slave address to the bus. When a slave reads its own address off the bus, it pulls the data line low to ACK. If no device is present on the bus with the specified address, the line stays high which is a NACK. This means that on the bus level, we have a clear error condition specifically for no device with the specified address being present on the bus. Whether the operation used is a dummy read or our first actual write should not matter - if the address is not acknowledged, the device is not present (or able to talk I2C). The problem lies in whether this "no device present on bus" error is reported clearly to us: Some drivers use -ENXIO specifically for this, some use it also for NACKs on written data, some report it but use other return codes for it, etc. Even if we stick to the smbus probe in the long run, if we get to the point where we can rely on the error codes from I2C drivers we would be able to correctly log and propagate other error classes like bus errors or I2C driver issues which are all currently silenced as "nothing at address" by the smbus probe. > I am not sure we can fully unify what Windows does and what Linux does, > mainly because our firmwares are different (I think Windows devices do a > lot more device discovery in firmware, Chrome OS historically tried to > limit amount of code in its firmware). We also need to make sure it > works on non-ACPI systems/ARM. Good point. My main focus is also quirky behaviors we have added to replicate Windows behavior, the smbus probe just stood out in my bus traces. I already sent https://lore.kernel.org/all/20240429233924.6453-1-kl@kl.wtf/ which goes back to improving the bus probe.
On Wed, May 01, 2024 at 07:24:08AM +0200, Kenny Levinsen wrote: > On 4/30/24 11:41 PM, Dmitry Torokhov wrote: > > I actually believe there is. On Chromebooks we may source components > > from several vendors and use them in our devices. The components > > are electrically compatible with each other, have exactly the same > > connector, and therefore interchangeable. Because of that at probe time > > we do not quite know if the device is there at given address, or not > > (i.e. the touchpad could be from a different vendor and listening on > > another address) and we need to make a quick determination whether we > > should continue with probe or not. > > Maybe I should clarify what I meant: All I2C operations start with the > master writing the slave address to the bus. When a slave reads its own > address off the bus, it pulls the data line low to ACK. If no device is > present on the bus with the specified address, the line stays high which is > a NACK. This means that on the bus level, we have a clear error condition > specifically for no device with the specified address being present on the > bus. > > Whether the operation used is a dummy read or our first actual write should > not matter - if the address is not acknowledged, the device is not present > (or able to talk I2C). Is it possible for a device to be wedged so hard that it refuses to acknowledge the address? > The problem lies in whether this "no device present > on bus" error is reported clearly to us: Some drivers use -ENXIO > specifically for this, some use it also for NACKs on written data, some > report it but use other return codes for it, etc. > > Even if we stick to the smbus probe in the long run, if we get to the point > where we can rely on the error codes from I2C drivers we would be able to > correctly log and propagate other error classes like bus errors or I2C > driver issues which are all currently silenced as "nothing at address" by > the smbus probe. I think this depends on the answer to the question above. If there is potential that the chip may stop responding, I still see benefit in differentiating initial "soft touch" poke vs. hard errors once we established that there is/was a device and it started misbehaving. Thanks.
On 5/1/24 9:09 PM, Dmitry Torokhov wrote: > Is it possible for a device to be wedged so hard that it refuses to > acknowledge the address? A slave is allowed to not acknowledge if not able (e.g., "because it's performing some real time function"), but a slave that does not acknowledge its address is electrically indistinguishable from a disconnected device. In such case the device is impossible to detect through I2C operations, and neither smbus probe nor a "real" command will see it. Any logic we have to silence missing devices will also silence entirely unresponsive or extremely non-cooperate devices. That is the price to pay for avoiding the log message unfortunately. No other errors from the smbus probe or a real command would be related to device presence, and some of them even suggest a device is present but broken (arbitration loss, assuming no shorts).
diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index d965382196c6..6ffa43d245b4 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -872,12 +872,11 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid) ihid->wHIDDescRegister, &ihid->hdesc, sizeof(ihid->hdesc)); - if (error) { - dev_err(&ihid->client->dev, - "failed to fetch HID descriptor: %d\n", - error); - return -ENODEV; - } + + /* The i2c drivers are a bit inconsistent with their error + * codes, so treat everything as -ENXIO for now. */ + if (error) + return -ENXIO; } /* Validate the length of HID descriptor, the 4 first bytes: @@ -992,17 +991,9 @@ static int __i2c_hid_core_probe(struct i2c_hid *ihid) struct hid_device *hid = ihid->hid; int ret; - /* Make sure there is something at this address */ - ret = i2c_smbus_read_byte(client); - if (ret < 0) { - i2c_hid_dbg(ihid, "nothing at this address: %d\n", ret); - return -ENXIO; - } - ret = i2c_hid_fetch_hid_descriptor(ihid); if (ret < 0) { - dev_err(&client->dev, - "Failed to fetch the HID Descriptor\n"); + i2c_hid_dbg(ihid, "failed to fetch HID descriptor: %d\n", ret); return ret; }