mbox series

[v2,0/4] SCSI core and UFS patches for kernel v6.4

Message ID 20230417230656.523826-1-bvanassche@acm.org
Headers show
Series SCSI core and UFS patches for kernel v6.4 | expand

Message

Bart Van Assche April 17, 2023, 11:06 p.m. UTC
Hi Martin,

This patch series includes a patch for making sd_shutdown() fail future I/O
and also two UFS patches.

Patch 3/3 of this series has been posted earlier. Compared to the previous
version of that patch, a Fixes: tag has been added.

Please consider this patch series for the next merge window. 

Thanks,

Bart.

Changes compared to v1:
- Slightly changed the description of patch "scsi: sd: Let sd_shutdown() fail
  future I/O".
- Included patch "scsi: ufs: Fix handling of lrbp->cmd"

Bart Van Assche (4):
  scsi: sd: Let sd_shutdown() fail future I/O
  scsi: ufs: Simplify ufshcd_wl_shutdown()
  scsi: ufs: Increase the START STOP UNIT timeout from one to ten
    seconds
  scsi: ufs: Fix handling of lrbp->cmd

 drivers/scsi/sd.c         | 11 ++++++++++-
 drivers/ufs/core/ufshcd.c | 20 +++-----------------
 2 files changed, 13 insertions(+), 18 deletions(-)

Comments

Christoph Hellwig April 18, 2023, 5:06 a.m. UTC | #1
On Mon, Apr 17, 2023 at 04:06:53PM -0700, Bart Van Assche wrote:
> +fail_future_io:
> +	q = sdkp->disk->queue;
> +	blk_queue_flag_set(QUEUE_FLAG_DYING, q);

SD does not own the queue and has abslutely not business marking it
dying.
Adrian Hunter April 18, 2023, 7:30 a.m. UTC | #2
On 18/04/23 02:06, Bart Van Assche wrote:
> One UFS vendor asked to increase the UFS timeout from 1 s to 3 s.
> Another UFS vendor asked to increase the UFS timeout from 1 s to 10 s.
> Hence this patch that increases the UFS timeout to 10 s. This patch can
> cause the total timeout to exceed 20 s, the Android shutdown timeout.
> This is fine since the loop around ufshcd_execute_start_stop() exists to
> deal with unit attentions and because unit attentions are reported
> quickly.
> 
> Fixes: dcd5b7637c6d ("scsi: ufs: Reduce the START STOP UNIT timeout")
> Fixes: 8f2c96420c6e ("scsi: ufs: core: Reduce the power mode change timeout")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/ufs/core/ufshcd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 784787cf08c3..6831eb1afc30 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -9182,7 +9182,8 @@ static int ufshcd_execute_start_stop(struct scsi_device *sdev,
>  	};
>  
>  	return scsi_execute_cmd(sdev, cdb, REQ_OP_DRV_IN, /*buffer=*/NULL,
> -			/*bufflen=*/0, /*timeout=*/HZ, /*retries=*/0, &args);
> +			/*bufflen=*/0, /*timeout=*/10 * HZ, /*retries=*/0,
> +			&args);
>  }
>  
>  /**
Stanley Jhu April 18, 2023, 7:57 a.m. UTC | #3
Bart Van Assche <bvanassche@acm.org> 於 2023年4月18日 週二 上午7:17寫道:
>
> One UFS vendor asked to increase the UFS timeout from 1 s to 3 s.
> Another UFS vendor asked to increase the UFS timeout from 1 s to 10 s.
> Hence this patch that increases the UFS timeout to 10 s. This patch can
> cause the total timeout to exceed 20 s, the Android shutdown timeout.
> This is fine since the loop around ufshcd_execute_start_stop() exists to
> deal with unit attentions and because unit attentions are reported
> quickly.
>
> Fixes: dcd5b7637c6d ("scsi: ufs: Reduce the START STOP UNIT timeout")
> Fixes: 8f2c96420c6e ("scsi: ufs: core: Reduce the power mode change timeout")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Bart Van Assche April 18, 2023, 6:37 p.m. UTC | #4
On 4/18/23 07:36, James Bottomley wrote:
> On Mon, 2023-04-17 at 16:06 -0700, Bart Van Assche wrote:
>> System shutdown happens as follows (see e.g. the systemd source file
>> src/shutdown/shutdown.c):
>> * sync() is called.
>> * reboot(RB_AUTOBOOT/RB_HALT_SYSTEM/RB_POWER_OFF) is called.
>> * If the reboot() system call returns, log an error message.
>>
>> The reboot() system call causes the kernel to call kernel_restart(),
>> kernel_halt() or kernel_power_off(). Each of these functions calls
>> device_shutdown(). device_shutdown() calls sd_shutdown(). After
>> sd_shutdown() has been called the .shutdown() callback of the LLD
>> will be called. Hence, I/O submitted after sd_shutdown() will hang or
>> may even cause a kernel crash.
>>
>> Let sd_shutdown() fail future I/O such that LLD .shutdown() callbacks
>> can be simplified.
> 
> What is the actual reason for this?  What is it you think might be
> submitting I/O after the system gets into this state?  Current
> sd_shutdown is constructed on the premise that it's the last thing that
> ever happens to the device before reboot/power off which is why it
> flushes the cache if necessary and stops the device if required, but
> for most standard devices neither is required because we don't expect
> Linux to go down with pending items in the block queue and for a write
> through disk cache anything that's completed on the block queue is
> safely durable on the device.

Hi James,

.shutdown() callbacks should quiesce I/O but the sd_shutdown() function 
doesn't do this. I see this as a bug.

Regarding your question, I think that sd_check_events() can be called 
while sd_shutdown() is in progress or after sd_shutdown() has finished. 
sd_check_events() may submit a TEST UNIT READY command.

In pci_device_shutdown() one can see that the PCI core clears the bus 
master bit for PCI devices during shutdown. In other words, it is not 
safe to submit I/O or to process completions during invocation of 
shutdown callbacks. I think that also shows that this patch fixes a bug.

Bart.
Bart Van Assche April 19, 2023, 5:58 p.m. UTC | #5
On 4/19/23 07:02, James Bottomley wrote:
> device_shutdown() goes in reverse devices_kset->list order, so it looks
> like it would do the PCI device then the SCSI device then the ULD then
> block, so we can use the queues in SCSI for emergency actions (like
> flush or stop) before block goes down.  Although this isn't guaranteed;
> there are things, like device_link_add, which reorder this kset, so
> we'd need to make sure the above assumption is correct.

Hi James,

My understanding is that both the block device associated with 
/dev/sd<x> and the struct scsi_disk associated with the same SCSI device 
have the sdev_gendev member of struct scsi_device as parent. In other 
words, without creating device links, there are no guarantees about the 
order in which the .shutdown() methods of struct block_device.bd_device 
and struct scsi_disk.disk_dev are called. Adding device links seems like 
an unnecessary complexity to me. Hence my preference to make 
sd_shutdown() responsible for quiescing future SCSI command activity.

See also the attached diagram (not sure I got that diagram right).

Bart.
Bart Van Assche April 19, 2023, 7:24 p.m. UTC | #6
On 4/19/23 11:33, James Bottomley wrote:
> So if we can't reliably get it right, let's not do it at all.  The
> previous argument you made was about problems with I/O to shutdown PCI
> devices.  That's not SCSI specific, so either we fix it for everything
> or decide it's not really a problem for anything.

The reference to PCI devices was an example only. As mentioned elsewhere 
in this email thread, sd_shutdown() does not do what it should do, 
namely to quiesce I/O. From include/linux/bus.h:

  * @shutdown:	Called at shut-down time to quiesce the device.

Bart.