Message ID | 20230517230927.1091124-5-bvanassche@acm.org |
---|---|
State | New |
Series | SCSI core patches |
On Wed, May 17, 2023 at 04:09:27PM -0700, Bart Van Assche wrote:
> Tell the block layer to rerun the queue after a delay instead of
> immediately if the SCSI LLD decided to block the host.
>
> Note: neither scsi_run_host_queues() nor scsi_run_queue() are called
> from the fast path.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: John Garry <john.g.garry@oracle.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_lib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index e4f34217b879..b37718ebbbfc 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1767,7 +1767,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		break;
>  	case BLK_STS_RESOURCE:
>  	case BLK_STS_ZONE_RESOURCE:
> -		if (scsi_device_blocked(sdev))
> +		if (scsi_device_blocked(sdev) || shost->host_self_blocked)
>  			ret = BLK_STS_DEV_RESOURCE;

What if scsi_unblock_requests() is just called after the above check and
before returning to block layer core? Then this request is invisible to
scsi_run_host_queues()<-scsi_unblock_requests(), and io hang happens.

Thanks,
Ming
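
The window Ming describes can be replayed deterministically in user space. What follows is only a toy sketch, not kernel code: every `toy_*` name is hypothetical, and the model bakes in the assumption behind Ming's objection, namely that the core arms no rerun of its own when the driver returns BLK_STS_DEV_RESOURCE. Whether that assumption holds is exactly what the rest of the thread disputes.

```c
#include <stdbool.h>

/* Toy user-space model; none of these names exist in the kernel. */
enum toy_status { TOY_STS_RESOURCE, TOY_STS_DEV_RESOURCE };

struct toy_host {
	bool self_blocked;	/* stands in for shost->host_self_blocked */
	int parked;		/* requests handed back to the core, awaiting a rerun */
	int issued;		/* requests picked up by a queue rerun */
	bool core_rerun_armed;	/* ASSUMPTION: armed only for TOY_STS_RESOURCE */
};

/* Step 1: the check the patch adds to scsi_queue_rq(). */
static enum toy_status toy_check(const struct toy_host *h)
{
	return h->self_blocked ? TOY_STS_DEV_RESOURCE : TOY_STS_RESOURCE;
}

/* Step 2: the core takes the request back and parks it. */
static void toy_park(struct toy_host *h, enum toy_status st)
{
	h->parked++;
	if (st == TOY_STS_RESOURCE)
		h->core_rerun_armed = true;
}

/* Models scsi_unblock_requests(): clear the flag, rerun what is parked now. */
static void toy_unblock(struct toy_host *h)
{
	h->self_blocked = false;
	h->issued += h->parked;
	h->parked = 0;
}

/* Returns how many requests end up parked with no rerun scheduled. */
static int toy_stranded(bool unblock_inside_window)
{
	struct toy_host h = { .self_blocked = true };
	enum toy_status st = toy_check(&h);

	if (unblock_inside_window)
		toy_unblock(&h);	/* runs before the request is parked */
	toy_park(&h, st);
	if (!unblock_inside_window)
		toy_unblock(&h);	/* runs after the request is parked */

	return h.core_rerun_armed ? 0 : h.parked;
}
```

In this model `toy_stranded(true)` leaves one request parked with nothing scheduled to pick it up, while `toy_stranded(false)` leaves none, because the unblock path sees the request once it has been parked.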
On 5/17/23 18:16, Ming Lei wrote:
> On Wed, May 17, 2023 at 04:09:27PM -0700, Bart Van Assche wrote:
>> @@ -1767,7 +1767,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
>>  		break;
>>  	case BLK_STS_RESOURCE:
>>  	case BLK_STS_ZONE_RESOURCE:
>> -		if (scsi_device_blocked(sdev))
>> +		if (scsi_device_blocked(sdev) || shost->host_self_blocked)
>>  			ret = BLK_STS_DEV_RESOURCE;
>
> What if scsi_unblock_requests() is just called after the above check and
> before returning to block layer core? Then this request is invisible to
> scsi_run_host_queues()<-scsi_unblock_requests(), and io hang happens.

If returning BLK_STS_DEV_RESOURCE could cause an I/O hang, wouldn't that be
a bug in the block layer core? Isn't the block layer core expected to rerun
the queue after a delay if a block driver returns BLK_STS_DEV_RESOURCE? See
also blk_mq_dispatch_rq_list().

Thanks,

Bart.
On 5/17/23 19:39, Ming Lei wrote:
> In reality, it can be full of race, maybe we can just remove
> BLK_STS_DEV_RESOURCE.

We need a better solution than changing BLK_STS_DEV_RESOURCE into
BLK_STS_RESOURCE. Otherwise power usage will go up in battery-powered
devices that use block drivers that return BLK_STS_DEV_RESOURCE today.

Thanks,

Bart.
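
Bart's power concern can be put in rough numbers. This is a minimal sketch under a stated assumption: that BLK_STS_RESOURCE would make the core retry on a fixed timer for as long as the host stays blocked, whereas BLK_STS_DEV_RESOURCE leaves the rerun to the single unblock event. The function names and the delay value used in the note below are illustrative, not taken from this thread.

```c
/*
 * Hypothetical model: number of queue reruns while a host stays blocked.
 * Neither function exists in the kernel.
 */

/* Timer-driven retries: one rerun per elapsed delay interval. */
static unsigned long toy_polled_reruns(unsigned long blocked_ms,
				       unsigned long delay_ms)
{
	return delay_ms ? blocked_ms / delay_ms : 0;
}

/* Event-driven retry: a single rerun when the host is unblocked. */
static unsigned long toy_event_reruns(void)
{
	return 1;
}
```

For a host that stays blocked for 3000 ms with an assumed 3 ms retry delay, the polled model reruns the queue 1000 times against a single rerun in the event-driven model, and each rerun is a wakeup that a battery-powered device would otherwise avoid.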
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e4f34217b879..b37718ebbbfc 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1767,7 +1767,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 		break;
 	case BLK_STS_RESOURCE:
 	case BLK_STS_ZONE_RESOURCE:
-		if (scsi_device_blocked(sdev))
+		if (scsi_device_blocked(sdev) || shost->host_self_blocked)
 			ret = BLK_STS_DEV_RESOURCE;
 		break;
 	case BLK_STS_AGAIN: