diff mbox series

[v9,11/19] scsi: sd: Translate data lifetime information

Message ID 20240130214911.1863909-12-bvanassche@acm.org
State Superseded
Headers show
Series Pass data lifetime information to SCSI disk devices | expand

Commit Message

Bart Van Assche Jan. 30, 2024, 9:48 p.m. UTC
Recently T10 standardized SBC constrained streams. This mechanism allows
to pass data lifetime information to SCSI devices in the group number
field. Add support for translating write hint information into a
permanent stream number in the sd driver. Use WRITE(10) instead of
WRITE(6) if data lifetime information is present because the WRITE(6)
command does not have a GROUP NUMBER field.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/sd.c | 98 +++++++++++++++++++++++++++++++++++++++++++++--
 drivers/scsi/sd.h |  4 +-
 2 files changed, 98 insertions(+), 4 deletions(-)

Comments

Martin K. Petersen Feb. 15, 2024, 9:33 p.m. UTC | #1
Hi Bart!

> @@ -1256,7 +1283,7 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>  		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
>  					 protect | fua, dld);
>  	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
> -		   sdp->use_10_for_rw || protect) {
> +		   sdp->use_10_for_rw || protect || rq->write_hint) {

Wouldn't it be appropriate to check sdkp->permanent_stream_count here?
rq->write_hint being set doesn't help if the device uses 6-byte
commands.
Bart Van Assche Feb. 15, 2024, 9:51 p.m. UTC | #2
On 2/15/24 13:33, Martin K. Petersen wrote:
>> @@ -1256,7 +1283,7 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>>   		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
>>   					 protect | fua, dld);
>>   	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
>> -		   sdp->use_10_for_rw || protect) {
>> +		   sdp->use_10_for_rw || protect || rq->write_hint) {
> 
> Wouldn't it be appropriate to check sdkp->permanent_stream_count here?
> rq->write_hint being set doesn't help if the device uses 6-byte
> commands.

Something like this untested change?

@@ -1256,7 +1283,8 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
  		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
  					 protect | fua, dld);
  	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
-		   sdp->use_10_for_rw || protect) {
+		   sdp->use_10_for_rw || protect ||
+		   (rq->write_hint && sdkp->permanent_stream_count)) {
  		ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks,
  					 protect | fua);
  	} else {

Thanks,

Bart.
Martin K. Petersen Feb. 15, 2024, 10 p.m. UTC | #3
Bart,

>> Wouldn't it be appropriate to check sdkp->permanent_stream_count
>> here?
>> rq->write_hint being set doesn't help if the device uses 6-byte
>> commands.
>
> Something like this untested change?
>
> @@ -1256,7 +1283,8 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
>  		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
>  					 protect | fua, dld);
>  	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
> -		   sdp->use_10_for_rw || protect) {
> +		   sdp->use_10_for_rw || protect ||
> +		   (rq->write_hint && sdkp->permanent_stream_count)) {
>  		ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks,
>  					 protect | fua);
>  	} else {

Do we even care about rq->write_hint being set? sd_group_number() will
check that later.

In my book, the device supporting permanent streams is a good heuristic
not to bother with 6-byte commands.

Another option would be to simply set use_16_for_rw when streams are
supported like Damien does for ZBC.
Bart Van Assche Feb. 16, 2024, 12:22 a.m. UTC | #4
On 2/15/24 14:00, Martin K. Petersen wrote:
> Another option would be to simply set use_16_for_rw when streams are
> supported like Damien does for ZBC.

Setting use_10_for_rw may work better since devices that support data
lifetime information will support WRITE(10) while it is not guaranteed
that these devices support WRITE(16). As an example, WRITE(10) support
is required by the UFS standard while WRITE(16) support is optional.

Thanks,

Bart.
Martin K. Petersen Feb. 16, 2024, 12:32 a.m. UTC | #5
Bart,

> On 2/15/24 14:00, Martin K. Petersen wrote:
>> Another option would be to simply set use_16_for_rw when streams are
>> supported like Damien does for ZBC.
>
> Setting use_10_for_rw may work better since devices that support data
> lifetime information will support WRITE(10) while it is not guaranteed
> that these devices support WRITE(16). As an example, WRITE(10) support
> is required by the UFS standard while WRITE(16) support is optional.

Sure, that's fine.
Andy Shevchenko June 11, 2024, 8:57 p.m. UTC | #6
Tue, Jan 30, 2024 at 01:48:37PM -0800, Bart Van Assche kirjoitti:
> Recently T10 standardized SBC constrained streams. This mechanism allows
> to pass data lifetime information to SCSI devices in the group number
> field. Add support for translating write hint information into a
> permanent stream number in the sd driver. Use WRITE(10) instead of
> WRITE(6) if data lifetime information is present because the WRITE(6)
> command does not have a GROUP NUMBER field.

This patch broke very badly my connected Garmin FR35 sport watch. The boot time
increased by 1 minute along with broken access to USB mass storage.

On the reboot it takes ages as well.

Revert of this and one little dependency (unrelated by functional means) helps.

Details are here: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/60

P.S. Big thanks to Arch Linux team to help with bisection!

Note, from tomorrow I will be off from the HW in question for 2+ months, I can
test anything back in mid-August, but I hope you may do something about it
without my participation.
Christian Heusel June 11, 2024, 9:21 p.m. UTC | #7
On 24/06/11 11:57PM, Andy Shevchenko wrote:
> Tue, Jan 30, 2024 at 01:48:37PM -0800, Bart Van Assche kirjoitti:
> > Recently T10 standardized SBC constrained streams. This mechanism allows
> > to pass data lifetime information to SCSI devices in the group number
> > field. Add support for translating write hint information into a
> > permanent stream number in the sd driver. Use WRITE(10) instead of
> > WRITE(6) if data lifetime information is present because the WRITE(6)
> > command does not have a GROUP NUMBER field.
> 
> This patch broke very badly my connected Garmin FR35 sport watch. The boot time
> increased by 1 minute along with broken access to USB mass storage.
> 
> On the reboot it takes ages as well.
> 
> Revert of this and one little dependency (unrelated by functional means) helps.

We have tested that the revert fixes the issue on top of v6.10-rc3.

Also adding the regressions list in CC and making regzbot aware of this
issue.

> Details are here: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/60
> 
> P.S. Big thanks to Arch Linux team to help with bisection!

If this is fixed adding in a "Reported-by" or "Bisected-by" (depending
on what this subsystem uses) for me would be appreciated :)

Cheers,
Christian

---

#regzbot title: scsi/sd: Timeout/broken USB storage with Garmin FR35
#regzbot introduced: 4f53138fffc2 ^
#regzbot link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/60
Bart Van Assche June 11, 2024, 11:08 p.m. UTC | #8
On 6/11/24 2:21 PM, Christian Heusel wrote:
> On 24/06/11 11:57PM, Andy Shevchenko wrote:
>> Tue, Jan 30, 2024 at 01:48:37PM -0800, Bart Van Assche kirjoitti:
>>> Recently T10 standardized SBC constrained streams. This mechanism allows
>>> to pass data lifetime information to SCSI devices in the group number
>>> field. Add support for translating write hint information into a
>>> permanent stream number in the sd driver. Use WRITE(10) instead of
>>> WRITE(6) if data lifetime information is present because the WRITE(6)
>>> command does not have a GROUP NUMBER field.
>>
>> This patch broke very badly my connected Garmin FR35 sport watch. The boot time
>> increased by 1 minute along with broken access to USB mass storage.
>>
>> On the reboot it takes ages as well.
>>
>> Revert of this and one little dependency (unrelated by functional means) helps.
> 
> We have tested that the revert fixes the issue on top of v6.10-rc3.
> 
> Also adding the regressions list in CC and making regzbot aware of this
> issue.
> 
>> Details are here: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/60
>>
>> P.S. Big thanks to Arch Linux team to help with bisection!
> 
> If this is fixed adding in a "Reported-by" or "Bisected-by" (depending
> on what this subsystem uses) for me would be appreciated :)

Thank you Christian for having gone through the painful process of
bisecting this issue.

Is the Garmin FR35 Flash device perhaps connected to a USB bus? If so,
this is the second report of a USB storage device that resets if it
receives a query for the IO Advice Hints Grouping mode page. Does the
patch below help?

Thanks,

Bart.


diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3a43e2209751..fcf3d7730466 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -63,6 +63,7 @@
  #include <scsi/scsi_cmnd.h>
  #include <scsi/scsi_dbg.h>
  #include <scsi/scsi_device.h>
+#include <scsi/scsi_devinfo.h>
  #include <scsi/scsi_driver.h>
  #include <scsi/scsi_eh.h>
  #include <scsi/scsi_host.h>
@@ -3117,6 +3118,9 @@ static void sd_read_io_hints(struct scsi_disk *sdkp, unsigned char *buffer)
  	struct scsi_mode_data data;
  	int res;

+	if (sdp->sdev_bflags & BLIST_SKIP_IO_HINTS)
+		return;
+
  	res = scsi_mode_sense(sdp, /*dbd=*/0x8, /*modepage=*/0x0a,
  			      /*subpage=*/0x05, buffer, SD_BUF_SIZE, SD_TIMEOUT,
  			      sdkp->max_retries, &data, &sshdr);
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index b31464740f6c..9a7185c68872 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -79,6 +79,8 @@ static int slave_alloc (struct scsi_device *sdev)
  	if (us->protocol == USB_PR_BULK && us->max_lun > 0)
  		sdev->sdev_bflags |= BLIST_FORCELUN;

+	sdev->sdev_bflags |= BLIST_SKIP_IO_HINTS;
+
  	return 0;
  }

diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
index 6b548dc2c496..fa8721e49dec 100644
--- a/include/scsi/scsi_devinfo.h
+++ b/include/scsi/scsi_devinfo.h
@@ -69,8 +69,10 @@
  #define BLIST_RETRY_ITF		((__force blist_flags_t)(1ULL << 32))
  /* Always retry ABORTED_COMMAND with ASC 0xc1 */
  #define BLIST_RETRY_ASC_C1	((__force blist_flags_t)(1ULL << 33))
+/* Do not read the I/O hints mode page */
+#define BLIST_SKIP_IO_HINTS	((__force blist_flags_t)(1ULL << 34))

-#define __BLIST_LAST_USED BLIST_RETRY_ASC_C1
+#define __BLIST_LAST_USED BLIST_SKIP_IO_HINTS

  #define __BLIST_HIGH_UNUSED (~(__BLIST_LAST_USED | \
  			       (__force blist_flags_t) \
Martin K. Petersen June 12, 2024, 1:48 a.m. UTC | #9
Hi Bart!

> diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
> index b31464740f6c..9a7185c68872 100644
> --- a/drivers/usb/storage/scsiglue.c
> +++ b/drivers/usb/storage/scsiglue.c
> @@ -79,6 +79,8 @@ static int slave_alloc (struct scsi_device *sdev)
>  	if (us->protocol == USB_PR_BULK && us->max_lun > 0)
>  		sdev->sdev_bflags |= BLIST_FORCELUN;
>
> +	sdev->sdev_bflags |= BLIST_SKIP_IO_HINTS;
> +

I suspect this is the safest approach and prefer it over the individual
device quirk.
Andy Shevchenko June 12, 2024, 5:42 a.m. UTC | #10
On Wed, Jun 12, 2024 at 8:35 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Wed, Jun 12, 2024 at 2:08 AM Bart Van Assche <bvanassche@acm.org> wrote:
> > On 6/11/24 2:21 PM, Christian Heusel wrote:
> > > On 24/06/11 11:57PM, Andy Shevchenko wrote:
...

> > >> This patch broke very badly my connected Garmin FR35 sport watch. The boot time
> > >> increased by 1 minute along with broken access to USB mass storage.

...

> > Is the Garmin FR35 Flash device perhaps connected to a USB bus? If so,
>
> Yes. It is written above in my report here.

Hmm... Rereading myself and it seems it was not obvious that it's a
USB mass storage device. Sorry for the confusion.
diff mbox series

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 463b201a3109..997de4daa8c4 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -47,6 +47,7 @@ 
 #include <linux/blkpg.h>
 #include <linux/blk-pm.h>
 #include <linux/delay.h>
+#include <linux/rw_hint.h>
 #include <linux/major.h>
 #include <linux/mutex.h>
 #include <linux/string_helpers.h>
@@ -1080,12 +1081,38 @@  static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
 	return BLK_STS_OK;
 }
 
+/**
+ * sd_group_number() - Compute the GROUP NUMBER field
+ * @cmd: SCSI command for which to compute the value of the six-bit GROUP NUMBER
+ *	field.
+ *
+ * From SBC-5 r05 (https://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc5r05.pdf):
+ * 0: no relative lifetime.
+ * 1: shortest relative lifetime.
+ * 2: second shortest relative lifetime.
+ * 3 - 0x3d: intermediate relative lifetimes.
+ * 0x3e: second longest relative lifetime.
+ * 0x3f: longest relative lifetime.
+ */
+static u8 sd_group_number(struct scsi_cmnd *cmd)
+{
+	const struct request *rq = scsi_cmd_to_rq(cmd);
+	struct scsi_disk *sdkp = scsi_disk(rq->q->disk);
+
+	if (!sdkp->rscs)
+		return 0;
+
+	return min3((u32)rq->write_hint, (u32)sdkp->permanent_stream_count,
+		    0x3fu);
+}
+
 static blk_status_t sd_setup_rw32_cmnd(struct scsi_cmnd *cmd, bool write,
 				       sector_t lba, unsigned int nr_blocks,
 				       unsigned char flags, unsigned int dld)
 {
 	cmd->cmd_len = SD_EXT_CDB_SIZE;
 	cmd->cmnd[0]  = VARIABLE_LENGTH_CMD;
+	cmd->cmnd[6]  = sd_group_number(cmd);
 	cmd->cmnd[7]  = 0x18; /* Additional CDB len */
 	cmd->cmnd[9]  = write ? WRITE_32 : READ_32;
 	cmd->cmnd[10] = flags;
@@ -1104,7 +1131,7 @@  static blk_status_t sd_setup_rw16_cmnd(struct scsi_cmnd *cmd, bool write,
 	cmd->cmd_len  = 16;
 	cmd->cmnd[0]  = write ? WRITE_16 : READ_16;
 	cmd->cmnd[1]  = flags | ((dld >> 2) & 0x01);
-	cmd->cmnd[14] = (dld & 0x03) << 6;
+	cmd->cmnd[14] = ((dld & 0x03) << 6) | sd_group_number(cmd);
 	cmd->cmnd[15] = 0;
 	put_unaligned_be64(lba, &cmd->cmnd[2]);
 	put_unaligned_be32(nr_blocks, &cmd->cmnd[10]);
@@ -1119,7 +1146,7 @@  static blk_status_t sd_setup_rw10_cmnd(struct scsi_cmnd *cmd, bool write,
 	cmd->cmd_len = 10;
 	cmd->cmnd[0] = write ? WRITE_10 : READ_10;
 	cmd->cmnd[1] = flags;
-	cmd->cmnd[6] = 0;
+	cmd->cmnd[6] = sd_group_number(cmd);
 	cmd->cmnd[9] = 0;
 	put_unaligned_be32(lba, &cmd->cmnd[2]);
 	put_unaligned_be16(nr_blocks, &cmd->cmnd[7]);
@@ -1256,7 +1283,7 @@  static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 		ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
 					 protect | fua, dld);
 	} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
-		   sdp->use_10_for_rw || protect) {
+		   sdp->use_10_for_rw || protect || rq->write_hint) {
 		ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks,
 					 protect | fua);
 	} else {
@@ -3059,6 +3086,70 @@  sd_read_cache_type(struct scsi_disk *sdkp, unsigned char *buffer)
 	sdkp->DPOFUA = 0;
 }
 
+static bool sd_is_perm_stream(struct scsi_disk *sdkp, unsigned int stream_id)
+{
+	u8 cdb[16] = { SERVICE_ACTION_IN_16, SAI_GET_STREAM_STATUS };
+	struct {
+		struct scsi_stream_status_header h;
+		struct scsi_stream_status s;
+	} buf;
+	struct scsi_device *sdev = sdkp->device;
+	struct scsi_sense_hdr sshdr;
+	const struct scsi_exec_args exec_args = {
+		.sshdr = &sshdr,
+	};
+	int res;
+
+	put_unaligned_be16(stream_id, &cdb[4]);
+	put_unaligned_be32(sizeof(buf), &cdb[10]);
+
+	res = scsi_execute_cmd(sdev, cdb, REQ_OP_DRV_IN, &buf, sizeof(buf),
+			       SD_TIMEOUT, sdkp->max_retries, &exec_args);
+	if (res < 0)
+		return false;
+	if (scsi_status_is_check_condition(res) && scsi_sense_valid(&sshdr))
+		sd_print_sense_hdr(sdkp, &sshdr);
+	if (res)
+		return false;
+	if (get_unaligned_be32(&buf.h.len) < sizeof(struct scsi_stream_status))
+		return false;
+	return buf.h.stream_status[0].perm;
+}
+
+static void sd_read_io_hints(struct scsi_disk *sdkp, unsigned char *buffer)
+{
+	struct scsi_device *sdp = sdkp->device;
+	const struct scsi_io_group_descriptor *desc, *start, *end;
+	struct scsi_sense_hdr sshdr;
+	struct scsi_mode_data data;
+	int res;
+
+	res = scsi_mode_sense(sdp, /*dbd=*/0x8, /*modepage=*/0x0a,
+			      /*subpage=*/0x05, buffer, SD_BUF_SIZE, SD_TIMEOUT,
+			      sdkp->max_retries, &data, &sshdr);
+	if (res < 0)
+		return;
+	start = (void *)buffer + data.header_length + 16;
+	end = (void *)buffer + ALIGN_DOWN(data.header_length + data.length,
+					  sizeof(*end));
+	/*
+	 * From "SBC-5 Constrained Streams with Data Lifetimes": Device severs
+	 * should assign the lowest numbered stream identifiers to permanent
+	 * streams.
+	 */
+	for (desc = start; desc < end; desc++)
+		if (!desc->st_enble || !sd_is_perm_stream(sdkp, desc - start))
+			break;
+	sdkp->permanent_stream_count = desc - start;
+	if (sdkp->rscs && sdkp->permanent_stream_count < 2)
+		sd_printk(KERN_INFO, sdkp,
+			  "Unexpected: RSCS has been set and the permanent stream count is %u\n",
+			  sdkp->permanent_stream_count);
+	else if (sdkp->permanent_stream_count)
+		sd_printk(KERN_INFO, sdkp, "permanent stream count = %d\n",
+			  sdkp->permanent_stream_count);
+}
+
 /*
  * The ATO bit indicates whether the DIF application tag is available
  * for use by the operating system.
@@ -3539,6 +3630,7 @@  static int sd_revalidate_disk(struct gendisk *disk)
 
 		sd_read_write_protect_flag(sdkp, buffer);
 		sd_read_cache_type(sdkp, buffer);
+		sd_read_io_hints(sdkp, buffer);
 		sd_read_app_tag_own(sdkp, buffer);
 		sd_read_write_same(sdkp, buffer);
 		sd_read_security(sdkp, buffer);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index e4539122f2a2..5c4285a582b2 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -125,6 +125,8 @@  struct scsi_disk {
 	unsigned int	physical_block_size;
 	unsigned int	max_medium_access_timeouts;
 	unsigned int	medium_access_timed_out;
+			/* number of permanent streams */
+	u16		permanent_stream_count;
 	u8		media_present;
 	u8		write_prot;
 	u8		protection_type;/* Data Integrity Field */
@@ -151,7 +153,7 @@  struct scsi_disk {
 	unsigned	urswrz : 1;
 	unsigned	security : 1;
 	unsigned	ignore_medium_access_errors : 1;
-	bool		rscs : 1; /* reduced stream control support */
+	unsigned	rscs : 1; /* reduced stream control support */
 };
 #define to_scsi_disk(obj) container_of(obj, struct scsi_disk, disk_dev)