[V2,0/9] blk-mq: fix wrong queue mapping for kdump kernel

Message ID 20230726094027.535126-1-ming.lei@redhat.com

Message

Ming Lei July 26, 2023, 9:40 a.m. UTC
Hi,

On arm and ppc64, 'maxcpus=1' is required for the kdump kernel (see
`Documentation/admin-guide/kdump/kdump.rst`), but num_possible_cpus()
still returns all CPUs because 'maxcpus=1' only brings up a single
CPU core during boot.

blk-mq sees a single queue in the kdump kernel, while from the driver's
viewpoint there are still multiple queues. This inconsistency causes the
driver to apply the wrong queue mapping when handling IO, and IO timeouts
are triggered.

This issue is only triggered with managed irqs when there are multiple hw
queues. Some drivers, such as nvme rdma/tcp, take online CPUs into account
for nr_hw_queues and don't have this issue.

Meanwhile, a single queue uses far fewer resources and reduces the
risk of kernel failure.
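
To make the mismatch concrete, here is a minimal userspace sketch of the
idea, not the code from the patches. It assumes the new helper caps the
queue count at 1 in a kdump kernel and otherwise allows one queue per
possible CPU. `struct boot_env`, `max_nr_hw_queues()` and
`driver_nr_io_queues()` are hypothetical names used only for illustration;
in the kernel the inputs would come from is_kdump_kernel() and the
possible-CPU count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for kernel boot state (illustration only). */
struct boot_env {
	bool kdump;                    /* booted as a kdump (crash) kernel? */
	unsigned int nr_possible_cpus; /* what num_possible_cpus() would report */
};

/*
 * Assumed behaviour of the helper: a kdump kernel gets exactly one hw
 * queue, matching what blk-mq uses there; otherwise one per possible CPU.
 */
static unsigned int max_nr_hw_queues(const struct boot_env *env)
{
	return env->kdump ? 1 : env->nr_possible_cpus;
}

/*
 * Driver side: clamp the number of IO queues the hardware offers to the
 * cap above, so the driver and blk-mq agree on the queue count.
 */
static unsigned int driver_nr_io_queues(const struct boot_env *env,
					unsigned int hw_queues)
{
	unsigned int cap = max_nr_hw_queues(env);

	assert(hw_queues > 0);
	return hw_queues < cap ? hw_queues : cap;
}
```

With 8 possible CPUs and hardware offering 16 queues, the sketch yields 8
queues on a normal boot and 1 queue in a kdump kernel, so the driver no
longer sets up more queues than blk-mq will map.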

V2:
	- add the scsi_max_nr_hw_queues() helper to avoid potential build
	failures, because scsi drivers often don't deal with blk-mq directly
	- use scsi_max_nr_hw_queues() for all scsi changes
	- move lpfc's change into the managed irq code path


Thanks,
Ming

Ming Lei (9):
  blk-mq: add blk_mq_max_nr_hw_queues()
  nvme-pci: use blk_mq_max_nr_hw_queues() to calculate io queues
  scsi: core: add helper of scsi_max_nr_hw_queues()
  scsi: lpfc: use blk_mq_max_nr_hw_queues() to calculate io vectors
  scsi: hisi: take blk_mq_max_nr_hw_queues() into account for
    calculating io vectors
  scsi: mpi3mr: take blk_mq_max_nr_hw_queues() into account for
    calculating io vectors
  scsi: megaraid: take blk_mq_max_nr_hw_queues() into account for
    calculating io vectors
  scsi: mpt3sas: take blk_mq_max_nr_hw_queues() into account for
    calculating io vectors
  scsi: pm8001: take blk_mq_max_nr_hw_queues() into account for
    calculating io vectors

 block/blk-mq.c                            | 16 ++++++++++++++++
 drivers/nvme/host/pci.c                   |  2 +-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c    |  3 +++
 drivers/scsi/lpfc/lpfc_init.c             |  2 ++
 drivers/scsi/megaraid/megaraid_sas_base.c |  6 +++++-
 drivers/scsi/mpi3mr/mpi3mr_fw.c           |  3 +++
 drivers/scsi/mpt3sas/mpt3sas_base.c       |  4 ++--
 drivers/scsi/pm8001/pm8001_init.c         |  4 +++-
 include/linux/blk-mq.h                    |  1 +
 include/scsi/scsi_host.h                  |  5 +++++
 10 files changed, 41 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig July 31, 2023, 7:14 a.m. UTC | #1
On Wed, Jul 26, 2023 at 05:40:18PM +0800, Ming Lei wrote:
> Hi,
> 
> On arm and ppc64, 'maxcpus=1' is required for the kdump kernel (see
> `Documentation/admin-guide/kdump/kdump.rst`), but num_possible_cpus()
> still returns all CPUs because 'maxcpus=1' only brings up a single
> CPU core during boot.

Just as said last time:  Please drop the odd is_kdump_kernel checks
in blk_mq_update_queue_map and blk_mq_alloc_tag_set instead of sprinkling
this crap even further.