mbox series

[0/5] blk-mq: quiesce improvement

Message ID 20211119021849.2259254-1-ming.lei@redhat.com
Headers show
Series blk-mq: quiesce improvement | expand

Message

Ming Lei Nov. 19, 2021, 2:18 a.m. UTC
Hi,

The 1st two patches moves srcu from blk_mq_hw_ctx to request_queue.

The other patches add one new helper for supporting quiesce in parallel.


Ming Lei (5):
  blk-mq: move srcu from blk_mq_hw_ctx to request_queue
  blk-mq: rename hctx_lock & hctx_unlock
  blk-mq: add helper of blk_mq_global_quiesce_wait()
  nvme: quiesce namespace queue in parallel
  scsi: use blk-mq quiesce APIs to implement scsi_host_block

 block/blk-core.c         | 23 +++++++++---
 block/blk-mq-sysfs.c     |  2 --
 block/blk-mq.c           | 78 +++++++++++++++++-----------------------
 block/blk-sysfs.c        |  3 +-
 block/blk.h              | 10 +++++-
 block/genhd.c            |  2 +-
 drivers/nvme/host/core.c |  9 +++--
 drivers/scsi/scsi_lib.c  | 16 ++++-----
 include/linux/blk-mq.h   | 21 ++++++-----
 include/linux/blkdev.h   |  8 +++++
 10 files changed, 98 insertions(+), 74 deletions(-)

Comments

Sagi Grimberg Dec. 8, 2021, 12:49 p.m. UTC | #1
On 11/30/21 4:33 AM, Ming Lei wrote:
> On Tue, Nov 23, 2021 at 11:00:45AM +0200, Sagi Grimberg wrote:
>>
>>>>>>> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
>>>>>>> queues in parallel, then we can just wait once if global quiesce wait
>>>>>>> is allowed.
>>>>>>
>>>>>> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
>>>>>> obviously it has a scope.
>>>>>
>>>>> How about blk_mq_shared_quiesce_wait()? or any suggestion?
>>>>
>>>> Shared between what?
>>>
>>> All request queues in one host-wide, both scsi and nvme has such
>>> requirement.
>>>
>>>>
>>>> Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
>>>> flag that is cleared in unquiesce? then the callers can just continue
>>>> to iterate but will only wait the rcu grace period once.
>>>
>>> Yeah, that is what these patches try to implement.
>>
>> I was suggesting to "hide" it in the interface.
>> Maybe something like:
>> --
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 8799fa73ef34..627b631db1f9 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
>>          unsigned int i;
>>          bool rcu = false;
>>
>> +       if (!q->has_srcu && q->quiesced)
>> +               return;
>>          queue_for_each_hw_ctx(q, hctx, i) {
>>                  if (hctx->flags & BLK_MQ_F_BLOCKING)
>>                          synchronize_srcu(hctx->srcu);
>>                  else
>>                          rcu = true;
>>          }
>> -       if (rcu)
>> +       if (rcu) {
>>                  synchronize_rcu();
>> +               q->quiesced = true;
>> +       }
>>   }
>>   EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
>>
>> @@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
>>          } else if (!--q->quiesce_depth) {
>>                  blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
>>                  run_queue = true;
>> +               q->quiesced = false;
> 
> Different request queues are passed to blk_mq_wait_quiesce_done() during
> the iteration, so marking 'quiesced' doesn't make any difference here.

I actually meant q->tag_set->quiesced, such that the flag will be
used in the tag_set reference. This way this sharing will be kept
hidden from the callers.
Ming Lei Dec. 10, 2021, 2:02 a.m. UTC | #2
On Wed, Dec 08, 2021 at 02:49:25PM +0200, Sagi Grimberg wrote:
> 
> 
> On 11/30/21 4:33 AM, Ming Lei wrote:
> > On Tue, Nov 23, 2021 at 11:00:45AM +0200, Sagi Grimberg wrote:
> > > 
> > > > > > > > Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> > > > > > > > queues in parallel, then we can just wait once if global quiesce wait
> > > > > > > > is allowed.
> > > > > > > 
> > > > > > > blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> > > > > > > obviously it has a scope.
> > > > > > 
> > > > > > How about blk_mq_shared_quiesce_wait()? or any suggestion?
> > > > > 
> > > > > Shared between what?
> > > > 
> > > > All request queues in one host-wide, both scsi and nvme has such
> > > > requirement.
> > > > 
> > > > > 
> > > > > Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
> > > > > flag that is cleared in unquiesce? then the callers can just continue
> > > > > to iterate but will only wait the rcu grace period once.
> > > > 
> > > > Yeah, that is what these patches try to implement.
> > > 
> > > I was suggesting to "hide" it in the interface.
> > > Maybe something like:
> > > --
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index 8799fa73ef34..627b631db1f9 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
> > >          unsigned int i;
> > >          bool rcu = false;
> > > 
> > > +       if (!q->has_srcu && q->quiesced)
> > > +               return;
> > >          queue_for_each_hw_ctx(q, hctx, i) {
> > >                  if (hctx->flags & BLK_MQ_F_BLOCKING)
> > >                          synchronize_srcu(hctx->srcu);
> > >                  else
> > >                          rcu = true;
> > >          }
> > > -       if (rcu)
> > > +       if (rcu) {
> > >                  synchronize_rcu();
> > > +               q->quiesced = true;
> > > +       }
> > >   }
> > >   EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
> > > 
> > > @@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
> > >          } else if (!--q->quiesce_depth) {
> > >                  blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
> > >                  run_queue = true;
> > > +               q->quiesced = false;
> > 
> > Different request queues are passed to blk_mq_wait_quiesce_done() during
> > the iteration, so marking 'quiesced' doesn't make any difference here.
> 
> I actually meant q->tag_set->quiesced, such that the flag will be
> used in the tag_set reference. This way this sharing will be kept
> hidden from the callers.

That looks easy, but not true really from API viewpoint, q->tag_set has different
lifetime which is covered by driver, not mention other request queues may touch
q->tag_set->quiesced from other code path, so things will become much complicated,
also what is the meaning of quiesced state for tagset wide?

Here I prefer to philosophy of 'Worse is better'[1], and simplicity over perfection
by adding one simple helper of blk_mq_shared_quiesce_wait() for improving the case
very easily.

[1] ihttps://ipfs.fleek.co/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Unix_philosophy.html


Thanks,
Ming