diff mbox series

thunderbolt: Retry DROM read once if parsing fails

Message ID 20200907110340.71031-1-mika.westerberg@linux.intel.com
State New
Headers show
Series thunderbolt: Retry DROM read once if parsing fails | expand

Commit Message

Mika Westerberg Sept. 7, 2020, 11:03 a.m. UTC
Kai-Heng reported that sometimes DROM parsing of ASUS PA27AC Thunderbolt 3
monitor fails. This makes the driver to fail to add the device so only
DisplayPort tunneling is functional.

It is not clear what exactly happens but waiting for 100 ms and retrying
the read seems to work this around so we do that here.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206493
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/thunderbolt/eeprom.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

Comments

Lukas Wunner Sept. 7, 2020, 11:21 a.m. UTC | #1
On Mon, Sep 07, 2020 at 02:03:40PM +0300, Mika Westerberg wrote:
> Kai-Heng reported that sometimes DROM parsing of ASUS PA27AC Thunderbolt 3
> monitor fails. This makes the driver to fail to add the device so only
> DisplayPort tunneling is functional.

Maybe the DROM is contained in an external EEPROM attached via slow
SPI or i2c and cannot be read in time before a response on the control
channel is due?  Does it help to always delay 100 ms before performing
the DROM read with this display?  Perhaps a quirk specific for this
display is more appropriate than a change which affects all devices?
Because this way of working around problems is difficult to maintain
long term.

Thanks,

Lukas
Mika Westerberg Sept. 7, 2020, 12:02 p.m. UTC | #2
On Mon, Sep 07, 2020 at 01:21:51PM +0200, Lukas Wunner wrote:
> On Mon, Sep 07, 2020 at 02:03:40PM +0300, Mika Westerberg wrote:
> > Kai-Heng reported that sometimes DROM parsing of ASUS PA27AC Thunderbolt 3
> > monitor fails. This makes the driver to fail to add the device so only
> > DisplayPort tunneling is functional.
> 
> Maybe the DROM is contained in an external EEPROM attached via slow
> SPI or i2c and cannot be read in time before a response on the control
> channel is due?  Does it help to always delay 100 ms before performing
> the DROM read with this display?  Perhaps a quirk specific for this
> display is more appropriate than a change which affects all devices?
> Because this way of working around problems is difficult to maintain
> long term.

It helps to always wait the 100ms IIRC but I'm not sure we want to do
that for every device. Also I'm not a fan of maintaining a list of
quirks if we have the possibility to have more "general" solution to the
issue (like what we do here, retry once).
Mika Westerberg Sept. 9, 2020, 11:06 a.m. UTC | #3
On Mon, Sep 07, 2020 at 02:03:40PM +0300, Mika Westerberg wrote:
> Kai-Heng reported that sometimes DROM parsing of ASUS PA27AC Thunderbolt 3
> monitor fails. This makes the driver to fail to add the device so only
> DisplayPort tunneling is functional.
> 
> It is not clear what exactly happens but waiting for 100 ms and retrying
> the read seems to work this around so we do that here.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206493
> Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Applied to thunderbolt.git/fixes.
diff mbox series

Patch

diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c
index 3ebca44ab3fa..0c8471be3e32 100644
--- a/drivers/thunderbolt/eeprom.c
+++ b/drivers/thunderbolt/eeprom.c
@@ -7,6 +7,7 @@ 
  */
 
 #include <linux/crc32.h>
+#include <linux/delay.h>
 #include <linux/property.h>
 #include <linux/slab.h>
 #include "tb.h"
@@ -389,8 +390,8 @@  static int tb_drom_parse_entries(struct tb_switch *sw)
 		struct tb_drom_entry_header *entry = (void *) (sw->drom + pos);
 		if (pos + 1 == drom_size || pos + entry->len > drom_size
 				|| !entry->len) {
-			tb_sw_warn(sw, "drom buffer overrun, aborting\n");
-			return -EIO;
+			tb_sw_warn(sw, "DROM buffer overrun\n");
+			return -EILSEQ;
 		}
 
 		switch (entry->type) {
@@ -526,7 +527,8 @@  int tb_drom_read(struct tb_switch *sw)
 	u16 size;
 	u32 crc;
 	struct tb_drom_header *header;
-	int res;
+	int res, retries = 1;
+
 	if (sw->drom)
 		return 0;
 
@@ -612,7 +614,17 @@  int tb_drom_read(struct tb_switch *sw)
 		tb_sw_warn(sw, "drom device_rom_revision %#x unknown\n",
 			header->device_rom_revision);
 
-	return tb_drom_parse_entries(sw);
+	res = tb_drom_parse_entries(sw);
+	/* If the DROM parsing fails, wait a moment and retry once */
+	if (res == -EILSEQ && retries--) {
+		tb_sw_warn(sw, "parsing DROM failed, retrying\n");
+		msleep(100);
+		res = tb_drom_read_n(sw, 0, sw->drom, size);
+		if (!res)
+			goto parse;
+	}
+
+	return res;
 err:
 	kfree(sw->drom);
 	sw->drom = NULL;