
[v3,3/8] scsi: core: Support failing requests while recovering

Message ID 20220929220021.247097-4-bvanassche@acm.org
State New
Series Fix a deadlock in the UFS driver

Commit Message

Bart Van Assche Sept. 29, 2022, 10 p.m. UTC
The current behavior for SCSI commands submitted while error recovery
is ongoing is to retry command submission after error recovery has
finished. See also the scsi_host_in_recovery() check in
scsi_host_queue_ready(). Add support for failing SCSI commands while
host recovery is in progress. This functionality will be used to fix a
deadlock in the UFS driver.
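
As an illustration (not part of this patch), a submitter that wants its
commands to fail instead of being requeued during host recovery would set
one of the REQ_FAILFAST_* flags when allocating the request. The caller
below is hypothetical:

	struct request *rq;

	rq = scsi_alloc_request(sdev->request_queue,
				REQ_OP_DRV_IN | REQ_FAILFAST_DEV, 0);
	if (IS_ERR(rq))
		return PTR_ERR(rq);
	/* ... set up and execute the passthrough command as usual ... */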

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Mike Christie Oct. 3, 2022, 5:27 p.m. UTC | #1
On 9/29/22 5:00 PM, Bart Van Assche wrote:
> The current behavior for SCSI commands submitted while error recovery
> is ongoing is to retry command submission after error recovery has
> finished. See also the scsi_host_in_recovery() check in
> scsi_host_queue_ready(). Add support for failing SCSI commands while
> host recovery is in progress. This functionality will be used to fix a
> deadlock in the UFS driver.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_lib.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 473d9403f0c1..ecd2ce3815df 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>  				   struct scsi_device *sdev,
>  				   struct scsi_cmnd *cmd)
>  {
> -	if (scsi_host_in_recovery(shost))
> -		return 0;
> -
>  	if (atomic_read(&shost->host_blocked) > 0) {
>  		if (scsi_host_busy(shost) > 0)
>  			goto starved;
> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	ret = BLK_STS_RESOURCE;
>  	if (!scsi_target_queue_ready(shost, sdev))
>  		goto out_put_budget;
> +	if (unlikely(scsi_host_in_recovery(shost))) {
> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
> +			ret = BLK_STS_OFFLINE;
> +		goto out_dec_target_busy;
> +	}
>  	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>  		goto out_dec_target_busy;
>  

This might introduce a regression in dm-multipath, or at least make
the behavior difficult to understand for users and support people.

If there is a transport issue and the cmd times out and the abort
also times out or fails, then we would start SCSI EH recovery. When the
driver/transport class figures out it's a transport issue, we block
the scsi_device. So before this patch we would requeue the cmd and
then wait until the fast_io_fail (FC) or replacement_timeout (iSCSI)
timer fires before failing commands upwards and failing paths.

With this patch we can end up doing one of two or three things, depending on timing:
1. If the cmd hits the code above, we will fail the command and cause a
path failure.

2. If the driver/transport blocks the scsi_device first, then we would hit the
scsi_device state check in scsi_queue_rq() and not fail the path, as before.

3. With or without your patch, if dm_mq_queue_rq() calls the busy callback
first (this goes blk_lld_busy() -> scsi_mq_lld_busy() -> scsi_host_in_recovery()),
then it might see all the paths in recovery. dm-multipath considers this a
temporary condition and will requeue cmds, so in this case we will not fail the path.
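
Rough sketch of the busy-callback chain from case 3 (condensed from
drivers/scsi/scsi_lib.c; details omitted):

	/* dm_mq_queue_rq() -> blk_lld_busy() -> q->mq_ops->busy() */
	static bool scsi_mq_lld_busy(struct request_queue *q)
	{
		struct scsi_device *sdev = q->queuedata;

		if (blk_queue_dying(q))
			return false;

		/*
		 * Host in recovery: dm-rq sees a temporarily busy path and
		 * requeues instead of failing it.
		 */
		return scsi_host_in_recovery(sdev->host) ||
		       scsi_device_is_busy(sdev);
	}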

I'm not sure what the correct fix is. Maybe do not fast-fail REQ_FAILFAST_TRANSPORT
above, or maybe add a new FAILFAST type that you could use?
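
For the second idea, something roughly like the below (REQ_FAILFAST_EH is a
made-up name; no such flag exists today):

	if (unlikely(scsi_host_in_recovery(shost))) {
		/*
		 * Hypothetical opt-in flag: only callers that set it are
		 * failed during host recovery, so REQ_FAILFAST_TRANSPORT
		 * (and thus dm-multipath) keeps the current requeue behavior.
		 */
		if (req->cmd_flags & REQ_FAILFAST_EH)
			ret = BLK_STS_OFFLINE;
		goto out_dec_target_busy;
	}
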
Bart Van Assche Oct. 3, 2022, 7:20 p.m. UTC | #2
On 10/3/22 10:27, Mike Christie wrote:
> On 9/29/22 5:00 PM, Bart Van Assche wrote:
>> The current behavior for SCSI commands submitted while error recovery
>> is ongoing is to retry command submission after error recovery has
>> finished. See also the scsi_host_in_recovery() check in
>> scsi_host_queue_ready(). Add support for failing SCSI commands while
>> host recovery is in progress. This functionality will be used to fix a
>> deadlock in the UFS driver.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Mike Christie <michael.christie@oracle.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   drivers/scsi/scsi_lib.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 473d9403f0c1..ecd2ce3815df 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>>   				   struct scsi_device *sdev,
>>   				   struct scsi_cmnd *cmd)
>>   {
>> -	if (scsi_host_in_recovery(shost))
>> -		return 0;
>> -
>>   	if (atomic_read(&shost->host_blocked) > 0) {
>>   		if (scsi_host_busy(shost) > 0)
>>   			goto starved;
>> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>   	ret = BLK_STS_RESOURCE;
>>   	if (!scsi_target_queue_ready(shost, sdev))
>>   		goto out_put_budget;
>> +	if (unlikely(scsi_host_in_recovery(shost))) {
>> +		if (req->cmd_flags & REQ_FAILFAST_MASK)
>> +			ret = BLK_STS_OFFLINE;
>> +		goto out_dec_target_busy;
>> +	}
>>   	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
>>   		goto out_dec_target_busy;
>>   
> 
> This might introduce a regression in dm-multipath, or at least make
> the behavior difficult to understand for users and support people.
> 
> If there is a transport issue and the cmd times out and the abort
> also times out or fails, then we would start SCSI EH recovery. When the
> driver/transport class figures out it's a transport issue, we block
> the scsi_device. So before this patch we would requeue the cmd and
> then wait until the fast_io_fail (FC) or replacement_timeout (iSCSI)
> timer fires before failing commands upwards and failing paths.
> 
> With this patch we can end up doing one of two or three things, depending on timing:
> 1. If the cmd hits the code above, we will fail the command and cause a
> path failure.
> 
> 2. If the driver/transport blocks the scsi_device first, then we would hit the
> scsi_device state check in scsi_queue_rq() and not fail the path, as before.
> 
> 3. With or without your patch, if dm_mq_queue_rq() calls the busy callback
> first (this goes blk_lld_busy() -> scsi_mq_lld_busy() -> scsi_host_in_recovery()),
> then it might see all the paths in recovery. dm-multipath considers this a
> temporary condition and will requeue cmds, so in this case we will not fail the path.
> 
> I'm not sure what the correct fix is. Maybe do not fast-fail REQ_FAILFAST_TRANSPORT
> above, or maybe add a new FAILFAST type that you could use?

Hi Mike,

I will look into introducing a new request flag for this purpose.

Thanks,

Bart.

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 473d9403f0c1..ecd2ce3815df 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 				   struct scsi_device *sdev,
 				   struct scsi_cmnd *cmd)
 {
-	if (scsi_host_in_recovery(shost))
-		return 0;
-
 	if (atomic_read(&shost->host_blocked) > 0) {
 		if (scsi_host_busy(shost) > 0)
 			goto starved;
@@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	ret = BLK_STS_RESOURCE;
 	if (!scsi_target_queue_ready(shost, sdev))
 		goto out_put_budget;
+	if (unlikely(scsi_host_in_recovery(shost))) {
+		if (req->cmd_flags & REQ_FAILFAST_MASK)
+			ret = BLK_STS_OFFLINE;
+		goto out_dec_target_busy;
+	}
 	if (!scsi_host_queue_ready(q, shost, sdev, cmd))
 		goto out_dec_target_busy;