Message ID | 20220929220021.247097-4-bvanassche@acm.org |
---|---|
State | New |
Headers | show |
Series | Fix a deadlock in the UFS driver | expand |
On 9/29/22 5:00 PM, Bart Van Assche wrote: > The current behavior for SCSI commands submitted while error recovery > is ongoing is to retry command submission after error recovery has > finished. See also the scsi_host_in_recovery() check in > scsi_host_queue_ready(). Add support for failing SCSI commands while > host recovery is in progress. This functionality will be used to fix a > deadlock in the UFS driver. > > Cc: Christoph Hellwig <hch@lst.de> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: John Garry <john.garry@huawei.com> > Cc: Mike Christie <michael.christie@oracle.com> > Cc: Hannes Reinecke <hare@suse.de> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/scsi_lib.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 473d9403f0c1..ecd2ce3815df 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q, > struct scsi_device *sdev, > struct scsi_cmnd *cmd) > { > - if (scsi_host_in_recovery(shost)) > - return 0; > - > if (atomic_read(&shost->host_blocked) > 0) { > if (scsi_host_busy(shost) > 0) > goto starved; > @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, > ret = BLK_STS_RESOURCE; > if (!scsi_target_queue_ready(shost, sdev)) > goto out_put_budget; > + if (unlikely(scsi_host_in_recovery(shost))) { > + if (req->cmd_flags & REQ_FAILFAST_MASK) > + ret = BLK_STS_OFFLINE; > + goto out_dec_target_busy; > + } > if (!scsi_host_queue_ready(q, shost, sdev, cmd)) > goto out_dec_target_busy; > This might add a regression to dm-multipath or it at least makes the behavior difficult to understand for users and support people. If there is a transport issue and the cmd times out and the abort does as well or fails, then we would start the scsi eh recovery. When the driver/transport class figures out it's a transport issue we will block the scsi_device. So before this patch we would requeue the cmd then we would wait until the fast_io_fail (FC) or replacement_timer (iscsi) to fire before failing commands upwards and failing paths. With this patch we can end up doing 2 or 3 things depending on timing: 1. If the cmd hits the code above we will fail the command and cause a path failure. 2. If driver/transport blocks the scsi_device first then we would hit the scsi_device state check in scsi_queue_rq and not fail the path like before. 3. With or without your patch, if dm_mq_queue_rq calls the busy callback first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery) then it might see all the paths in recovery. It considers this a temp condition and will requeue cmds. So in this case we will not fail the path. I'm not sure what the correct fix is. Maybe not fast fail REQ_FAILFAST_TRANSPORT above, or maybe add a new FAILFAST type that you could use?
On 10/3/22 10:27, Mike Christie wrote: > On 9/29/22 5:00 PM, Bart Van Assche wrote: >> The current behavior for SCSI commands submitted while error recovery >> is ongoing is to retry command submission after error recovery has >> finished. See also the scsi_host_in_recovery() check in >> scsi_host_queue_ready(). Add support for failing SCSI commands while >> host recovery is in progress. This functionality will be used to fix a >> deadlock in the UFS driver. >> >> Cc: Christoph Hellwig <hch@lst.de> >> Cc: Ming Lei <ming.lei@redhat.com> >> Cc: John Garry <john.garry@huawei.com> >> Cc: Mike Christie <michael.christie@oracle.com> >> Cc: Hannes Reinecke <hare@suse.de> >> Signed-off-by: Bart Van Assche <bvanassche@acm.org> >> --- >> drivers/scsi/scsi_lib.c | 8 +++++--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> index 473d9403f0c1..ecd2ce3815df 100644 >> --- a/drivers/scsi/scsi_lib.c >> +++ b/drivers/scsi/scsi_lib.c >> @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q, >> struct scsi_device *sdev, >> struct scsi_cmnd *cmd) >> { >> - if (scsi_host_in_recovery(shost)) >> - return 0; >> - >> if (atomic_read(&shost->host_blocked) > 0) { >> if (scsi_host_busy(shost) > 0) >> goto starved; >> @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, >> ret = BLK_STS_RESOURCE; >> if (!scsi_target_queue_ready(shost, sdev)) >> goto out_put_budget; >> + if (unlikely(scsi_host_in_recovery(shost))) { >> + if (req->cmd_flags & REQ_FAILFAST_MASK) >> + ret = BLK_STS_OFFLINE; >> + goto out_dec_target_busy; >> + } >> if (!scsi_host_queue_ready(q, shost, sdev, cmd)) >> goto out_dec_target_busy; >> > > This might add a regression to dm-multipath or it at least makes > the behavior difficult to understand for users and support people. > > If there is a transport issue and the cmd times out and the abort > does as well or fails, then we would start the scsi eh recovery. When the > driver/transport class figures out it's a transport issue we will block > the scsi_device. So before this patch we would requeue the cmd then we > would wait until the fast_io_fail (FC) or replacement_timer (iscsi) to > fire before failing commands upwards and failing paths. > > With this patch we can end up doing 2 or 3 things depending on timing: > 1. If the cmd hits the code above we will fail the command and cause a > path failure. > > 2. If driver/transport blocks the scsi_device first then we would hit the > scsi_device state check in scsi_queue_rq and not fail the path like before. > > 3. With or without your patch, if dm_mq_queue_rq calls the busy callback > first (this does blk_lld_busy -> scsi_mq_lld_busy -> scsi_host_in_recovery) > then it might see all the paths in recovery. It considers this a temp condition > and will requeue cmds. So in this case we will not fail the path. > > I'm not sure what the correct fix is. Maybe not fast fail REQ_FAILFAST_TRANSPORT > above, or maybe add a new FAILFAST type that you could use? Hi Mike, I will look into introducing a new request flag for this purpose. Thanks, Bart.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 473d9403f0c1..ecd2ce3815df 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1337,9 +1337,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q, struct scsi_device *sdev, struct scsi_cmnd *cmd) { - if (scsi_host_in_recovery(shost)) - return 0; - if (atomic_read(&shost->host_blocked) > 0) { if (scsi_host_busy(shost) > 0) goto starved; @@ -1727,6 +1724,11 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, ret = BLK_STS_RESOURCE; if (!scsi_target_queue_ready(shost, sdev)) goto out_put_budget; + if (unlikely(scsi_host_in_recovery(shost))) { + if (req->cmd_flags & REQ_FAILFAST_MASK) + ret = BLK_STS_OFFLINE; + goto out_dec_target_busy; + } if (!scsi_host_queue_ready(q, shost, sdev, cmd)) goto out_dec_target_busy;
The current behavior for SCSI commands submitted while error recovery is ongoing is to retry command submission after error recovery has finished. See also the scsi_host_in_recovery() check in scsi_host_queue_ready(). Add support for failing SCSI commands while host recovery is in progress. This functionality will be used to fix a deadlock in the UFS driver. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/scsi_lib.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)