[V2,04/13] block/wbt: fix negative inflight counter when remove scsi device

Message ID 20220122111054.1126146-5-ming.lei@redhat.com
State New
Series block: don't drain file system I/O on del_gendisk

Commit Message

Ming Lei Jan. 22, 2022, 11:10 a.m. UTC
From: Laibin Qiu <qiulaibin@huawei.com>

wbt is currently disabled by setting WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when the elevator is switched to bfq. When a
SCSI device is removed, wbt is re-enabled by wbt_enable_default().
If the enable state flips between wbt_wait() and wbt_track() while a
write request is being submitted, wbt_track() sees wbt as enabled
(a false positive) although wbt_wait() saw it as disabled.

The following is the scenario that triggered the problem.

T1                          T2                           T3
                            elevator_switch_mq
                            bfq_init_queue
                            wbt_disable_default <= Set
                            rwb->enable_state (OFF)
submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
                                                         scsi_remove_device
                                                         sd_remove
                                                         del_gendisk
                                                         blk_unregister_queue
                                                         elv_unregister_queue
                                                         wbt_enable_default
                                                         <= Set rwb->enable_state (ON)
rq_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request is marked WBT_TRACKED without the inflight counter
having been incremented, so wbt_done() later drops rqw->inflight to -1,
which causes an I/O hang.

Fix this by moving wbt_enable_default() from elv_unregister_queue() to
bfq_exit_queue(), so that wbt is re-enabled only when bfq exits.

Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")

Remove a stale one-line comment and drop a single-use local variable.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
---
 block/bfq-iosched.c | 2 ++
 block/elevator.c    | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Christoph Hellwig Feb. 17, 2022, 7:45 a.m. UTC | #1
Jens, can you pick this up for the block-5.17 tree?

Jens Axboe Feb. 17, 2022, 2:53 p.m. UTC | #2
On 2/17/22 12:45 AM, Christoph Hellwig wrote:
> Jens, can you pick this up for the block-5.17 tree?

Done

Patch

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0c612a911696..36a66e97e3c2 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7018,6 +7018,8 @@  static void bfq_exit_queue(struct elevator_queue *e)
 	spin_unlock_irq(&bfqd->lock);
 #endif
 
+	wbt_enable_default(bfqd->queue);
+
 	kfree(bfqd);
 }
 
diff --git a/block/elevator.c b/block/elevator.c
index ec98aed39c4f..482df2a350fc 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -525,8 +525,6 @@  void elv_unregister_queue(struct request_queue *q)
 		kobject_del(&e->kobj);
 
 		e->registered = 0;
-		/* Re-enable throttling in case elevator disabled it */
-		wbt_enable_default(q);
 	}
 }