[v2,00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap

Message ID 1628519378-211232-1-git-send-email-john.garry@huawei.com

John Garry Aug. 9, 2021, 2:29 p.m. UTC
Currently a full set of static requests is allocated per hw queue per
tagset when shared sbitmap is used.

However, only tagset->queue_depth number of requests may be active at
any given time. As such, only tagset->queue_depth number of static
requests are required.

The same goes for using an IO scheduler, which allocates a full set of
static requests per hw queue per request queue.

This series changes shared sbitmap support to use a shared set of tags per
tagset and request queue. Ming suggested something along those lines in
v1. We'll keep the name "shared sbitmap", as it is familiar. With shared
tags, the static rqs also become shared, which reduces the number of sets
of static rqs and so reduces memory usage.
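
As a rough illustration of the accounting described above (a userspace
sketch, not code from this series), the two allocation schemes can be
modelled as below. The struct, the function names and the rq_bytes
field are hypothetical; the real per-request size depends on the driver
and is not stated here.

```c
#include <stddef.h>

/* Hypothetical model of a tagset; these are not kernel structures. */
struct tagset_model {
	unsigned int nr_hw_queues;	/* hw queues in the tagset */
	unsigned int queue_depth;	/* stands in for tagset->queue_depth */
	size_t rq_bytes;		/* assumed bytes per static request */
};

/* Before: one full set of static requests per hw queue. */
static size_t static_rq_bytes_per_hctx(const struct tagset_model *m)
{
	return (size_t)m->nr_hw_queues * m->queue_depth * m->rq_bytes;
}

/* After: a single set of static requests shared by all hw queues. */
static size_t static_rq_bytes_shared(const struct tagset_model *m)
{
	return (size_t)m->queue_depth * m->rq_bytes;
}
```

In this model the footprint shrinks by a factor of nr_hw_queues. The
same reasoning applies per request queue when an IO scheduler is
attached.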

Patch "blk-mq: Use shared tags for shared sbitmap support" is a bit big
and could be broken down, but then maintaining the ability to bisect
becomes harder and each sub-patch would get more convoluted.

For the megaraid sas driver on my 128-CPU arm64 system with 1x SATA disk,
we save approx. 300MB(!) [370MB -> 60MB].

Baseline is 2112f5c1330a (for-5.15/block) loop: Select I/O scheduler ...

Changes since v1:
- Switch to use blk_mq_tags for shared sbitmap
- Add new blk_mq_ops init request callback
- Add some RB tags (thanks!)

John Garry (11):
  blk-mq: Change rqs check in blk_mq_free_rqs()
  block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ
  blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
  blk-mq: Invert check in blk_mq_update_nr_requests()
  blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
  blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
  blk-mq: Add blk_mq_ops.init_request_no_hctx()
  scsi: Set blk_mq_ops.init_request_no_hctx
  blk-mq: Use shared tags for shared sbitmap support
  blk-mq: Stop using pointers for blk_mq_tags bitmap tags

 block/bfq-iosched.c      |   4 +-
 block/blk-core.c         |   2 +-
 block/blk-mq-debugfs.c   |   8 +--
 block/blk-mq-sched.c     |  92 +++++++++++++------------
 block/blk-mq-sched.h     |   2 +-
 block/blk-mq-tag.c       | 111 +++++++++++++-----------------
 block/blk-mq-tag.h       |  12 ++--
 block/blk-mq.c           | 144 +++++++++++++++++++++++----------------
 block/blk-mq.h           |   8 +--
 block/kyber-iosched.c    |   4 +-
 block/mq-deadline-main.c |   2 +-
 drivers/block/rbd.c      |   2 +-
 drivers/scsi/scsi_lib.c  |   6 +-
 include/linux/blk-mq.h   |  20 +++---
 include/linux/blkdev.h   |   5 +-
 15 files changed, 218 insertions(+), 204 deletions(-)

-- 
2.26.2

Comments

John Garry Aug. 17, 2021, 7:02 a.m. UTC | #1
On 09/08/2021 15:29, John Garry wrote:

Hi guys,

Any chance to review or test this?

Thanks!

> Currently a full set of static requests is allocated per hw queue per
> tagset when shared sbitmap is used.
> 
> However, only tagset->queue_depth number of requests may be active at
> any given time. As such, only tagset->queue_depth number of static
> requests are required.
> 
> The same goes for using an IO scheduler, which allocates a full set of
> static requests per hw queue per request queue.
> 
> This series changes shared sbitmap support to use a shared set of tags per
> tagset and request queue. Ming suggested something along those lines in
> v1. We'll keep the name "shared sbitmap", as it is familiar. With shared
> tags, the static rqs also become shared, which reduces the number of sets
> of static rqs and so reduces memory usage.
> 
> Patch "blk-mq: Use shared tags for shared sbitmap support" is a bit big
> and could be broken down, but then maintaining the ability to bisect
> becomes harder and each sub-patch would get more convoluted.
> 
> For the megaraid sas driver on my 128-CPU arm64 system with 1x SATA disk,
> we save approx. 300MB(!) [370MB -> 60MB].
> 
> Baseline is 2112f5c1330a (for-5.15/block) loop: Select I/O scheduler ...
> 
> Changes since v1:
> - Switch to use blk_mq_tags for shared sbitmap
> - Add new blk_mq_ops init request callback
> - Add some RB tags (thanks!)
> 
> John Garry (11):
>    blk-mq: Change rqs check in blk_mq_free_rqs()
>    block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ
>    blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
>    blk-mq: Invert check in blk_mq_update_nr_requests()
>    blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
>    blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
>    blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
>    blk-mq: Add blk_mq_ops.init_request_no_hctx()
>    scsi: Set blk_mq_ops.init_request_no_hctx
>    blk-mq: Use shared tags for shared sbitmap support
>    blk-mq: Stop using pointers for blk_mq_tags bitmap tags
> 
>   block/bfq-iosched.c      |   4 +-
>   block/blk-core.c         |   2 +-
>   block/blk-mq-debugfs.c   |   8 +--
>   block/blk-mq-sched.c     |  92 +++++++++++++------------
>   block/blk-mq-sched.h     |   2 +-
>   block/blk-mq-tag.c       | 111 +++++++++++++-----------------
>   block/blk-mq-tag.h       |  12 ++--
>   block/blk-mq.c           | 144 +++++++++++++++++++++++----------------
>   block/blk-mq.h           |   8 +--
>   block/kyber-iosched.c    |   4 +-
>   block/mq-deadline-main.c |   2 +-
>   drivers/block/rbd.c      |   2 +-
>   drivers/scsi/scsi_lib.c  |   6 +-
>   include/linux/blk-mq.h   |  20 +++---
>   include/linux/blkdev.h   |   5 +-
>   15 files changed, 218 insertions(+), 204 deletions(-)
>