From patchwork Thu Apr 24 18:19:40 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 884270
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:40 +0200
Subject: [PATCH v6 1/9] lib/group_cpus: let group_cpu_evenly return number initialized masks
X-Mailing-List: linux-scsi@vger.kernel.org
Message-Id: <20250424-isolcpus-io-queues-v6-1-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg, "Michael S. Tsirkin"
Cc: "Martin K. Petersen", Thomas Gleixner, Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long, Ming Lei, Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Mathieu Desnoyers, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner
X-Mailer: b4 0.14.2

group_cpus_evenly() might allocate fewer groups than requested:

    group_cpus_evenly
        __group_cpus_evenly
            alloc_nodes_groups
                # the total number of allocated groups may be less than
                # numgrps when the total number of active CPUs is less
                # than numgrps

In this case the caller will perform an out-of-bounds access, because it assumes the returned array contains numgrps masks. Return the number of groups actually created so the caller can limit the access range accordingly.
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 block/blk-mq-cpumap.c        |  6 +++---
 drivers/virtio/virtio_vdpa.c |  9 +++++----
 fs/fuse/virtio_fs.c          |  6 +++---
 include/linux/group_cpus.h   |  3 ++-
 kernel/irq/affinity.c        |  9 +++++----
 lib/group_cpus.c             | 12 +++++++++---
 6 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 444798c5374f48088b661b519f2638bda8556cf2..269161252add756897fce1b65cae5b2e6aebd647 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -19,9 +19,9 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
 	const struct cpumask *masks;
-	unsigned int queue, cpu;
+	unsigned int queue, cpu, nr_masks;
 
-	masks = group_cpus_evenly(qmap->nr_queues);
+	masks = group_cpus_evenly(qmap->nr_queues, &nr_masks);
 	if (!masks) {
 		for_each_possible_cpu(cpu)
 			qmap->mq_map[cpu] = qmap->queue_offset;
@@ -29,7 +29,7 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 	}
 
 	for (queue = 0; queue < qmap->nr_queues; queue++) {
-		for_each_cpu(cpu, &masks[queue])
+		for_each_cpu(cpu, &masks[queue % nr_masks])
 			qmap->mq_map[cpu] = qmap->queue_offset + queue;
 	}
 	kfree(masks);

diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 1f60c9d5cb1810a6f208c24bb2ac640d537391a0..a7b297dae4890c9d6002744b90fc133bbedb7b44 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -329,20 +329,21 @@ create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 
 	for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
 		unsigned int this_vecs = affd->set_size[i];
+		unsigned int nr_masks;
 		int j;
-		struct cpumask *result = group_cpus_evenly(this_vecs);
+		struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks);
 
 		if (!result) {
 			kfree(masks);
 			return NULL;
 		}
 
-		for (j = 0; j < this_vecs; j++)
+		for (j = 0; j < nr_masks; j++)
 			cpumask_copy(&masks[curvec + j], &result[j]);
 		kfree(result);
 
-		curvec += this_vecs;
-		usedvecs += this_vecs;
+		curvec += nr_masks;
+		usedvecs += nr_masks;
 	}
 
 	/* Fill out vectors at the end that don't need affinity */

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 2c7b24cb67adb2cb329ed545f56f04700aca8b81..7ed43b9ea4f3f8b108f1e0d7050c27267b9941c9 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -862,7 +862,7 @@ static void virtio_fs_requests_done_work(struct work_struct *work)
 static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs)
 {
 	const struct cpumask *mask, *masks;
-	unsigned int q, cpu;
+	unsigned int q, cpu, nr_masks;
 
 	/* First attempt to map using existing transport layer affinities
 	 * e.g. PCIe MSI-X
@@ -882,7 +882,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f
 		return;
 fallback:
 	/* Attempt to map evenly in groups over the CPUs */
-	masks = group_cpus_evenly(fs->num_request_queues);
+	masks = group_cpus_evenly(fs->num_request_queues, &nr_masks);
 	/* If even this fails we default to all CPUs use first request queue */
 	if (!masks) {
 		for_each_possible_cpu(cpu)
@@ -891,7 +891,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f
 	}
 
 	for (q = 0; q < fs->num_request_queues; q++) {
-		for_each_cpu(cpu, &masks[q])
+		for_each_cpu(cpu, &masks[q % nr_masks])
 			fs->mq_map[cpu] = q + VQ_REQUEST;
 	}
 	kfree(masks);

diff --git a/include/linux/group_cpus.h b/include/linux/group_cpus.h
index e42807ec61f6e8cf3787af7daa0d8686edfef0a3..bd5dada6e8606fa6cf8f7babf939e39fd7475c8d 100644
--- a/include/linux/group_cpus.h
+++ b/include/linux/group_cpus.h
@@ -9,6 +9,7 @@
 #include
 #include
 
-struct cpumask *group_cpus_evenly(unsigned int numgrps);
+struct cpumask *group_cpus_evenly(unsigned int numgrps,
+				  unsigned int *nummasks);
 
 #endif

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 44a4eba80315cc098ecfa366ca1d88483641b12a..d2aefab5eb2b929877ced43f48b6268098484bd7 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -70,20 +70,21 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 	 */
 	for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
 		unsigned int this_vecs = affd->set_size[i];
+		unsigned int nr_masks;
 		int j;
-		struct cpumask *result = group_cpus_evenly(this_vecs);
+		struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks);
 
 		if (!result) {
 			kfree(masks);
 			return NULL;
 		}
 
-		for (j = 0; j < this_vecs; j++)
+		for (j = 0; j < nr_masks; j++)
 			cpumask_copy(&masks[curvec + j].mask, &result[j]);
 		kfree(result);
 
-		curvec += this_vecs;
-		usedvecs += this_vecs;
+		curvec += nr_masks;
+		usedvecs += nr_masks;
 	}
 
 	/* Fill out vectors at the end that don't need affinity */

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index ee272c4cefcc13907ce9f211f479615d2e3c9154..016c6578a07616959470b47121459a16a1bc99e5 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -332,9 +332,11 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
 /**
  * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
  * @numgrps: number of groups
+ * @nummasks: number of initialized cpumasks
  *
  * Return: cpumask array if successful, NULL otherwise. And each element
- * includes CPUs assigned to this group
+ * includes CPUs assigned to this group. nummasks contains the number
+ * of initialized masks which can be less than numgrps.
  *
  * Try to put close CPUs from viewpoint of CPU and NUMA locality into
  * same group, and run two-stage grouping:
@@ -344,7 +346,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+struct cpumask *group_cpus_evenly(unsigned int numgrps,
+				  unsigned int *nummasks)
 {
 	unsigned int curgrp = 0, nr_present = 0, nr_others = 0;
 	cpumask_var_t *node_to_cpumask;
@@ -421,10 +424,12 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
 		kfree(masks);
 		return NULL;
 	}
+	*nummasks = nr_present + nr_others;
 	return masks;
 }
 #else /* CONFIG_SMP */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+struct cpumask *group_cpus_evenly(unsigned int numgrps,
+				  unsigned int *nummasks)
 {
 	struct cpumask *masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL);
 
@@ -433,6 +438,7 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
 
 	/* assign all CPUs(cpu 0) to the 1st group only */
 	cpumask_copy(&masks[0], cpu_possible_mask);
+	*nummasks = 1;
 	return masks;
 }
 #endif /* CONFIG_SMP */

From patchwork Thu Apr 24 18:19:41 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 885272
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:41 +0200
Subject: [PATCH v6 2/9] blk-mq: add number of queue calc helper
X-Mailing-List: linux-scsi@vger.kernel.org
Message-Id: <20250424-isolcpus-io-queues-v6-2-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg, "Michael S. Tsirkin"
Cc: "Martin K. Petersen", Thomas Gleixner, Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long, Ming Lei, Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Mathieu Desnoyers, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=io_queue is set. This prevents the isolated CPUs from being disturbed by OS workload.

Add two helper variants that calculate the correct number of queues to use. Two variants are needed because some drivers derive their maximum number of queues from the possible CPU mask, others from the online CPU mask.
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 block/blk-mq-cpumap.c  | 45 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blk-mq.h |  2 ++
 2 files changed, 47 insertions(+)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 269161252add756897fce1b65cae5b2e6aebd647..6e6b3e989a5676186b5a31296a1b94b7602f1542 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -12,10 +12,55 @@
 #include
 #include
 #include
+#include
 
 #include "blk.h"
 #include "blk-mq.h"
 
+static unsigned int blk_mq_num_queues(const struct cpumask *mask,
+				      unsigned int max_queues)
+{
+	unsigned int num;
+
+	if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
+		mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+
+	num = cpumask_weight(mask);
+	return min_not_zero(num, max_queues);
+}
+
+/**
+ * blk_mq_num_possible_queues - Calc nr of queues for multiqueue devices
+ * @max_queues: The maximal number of queues the hardware/driver
+ *		supports. If max_queues is 0, the argument is
+ *		ignored.
+ *
+ * Calculate the number of queues which should be used for a multiqueue
+ * device based on the number of possible cpu. The helper is considering
+ * isolcpus settings.
+ */
+unsigned int blk_mq_num_possible_queues(unsigned int max_queues)
+{
+	return blk_mq_num_queues(cpu_possible_mask, max_queues);
+}
+EXPORT_SYMBOL_GPL(blk_mq_num_possible_queues);
+
+/**
+ * blk_mq_num_online_queues - Calc nr of queues for multiqueue devices
+ * @max_queues: The maximal number of queues the hardware/driver
+ *		supports. If max_queues is 0, the argument is
+ *		ignored.
+ *
+ * Calculate the number of queues which should be used for a multiqueue
+ * device based on the number of online cpus. The helper is considering
+ * isolcpus settings.
+ */
+unsigned int blk_mq_num_online_queues(unsigned int max_queues)
+{
+	return blk_mq_num_queues(cpu_online_mask, max_queues);
+}
+EXPORT_SYMBOL_GPL(blk_mq_num_online_queues);
+
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
 	const struct cpumask *masks;

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8eb9b3310167c36f8a67ee8756a97d1274f8e73b..feed1dcaeef51c8db49d3fe667c64ecc824ce655 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -941,6 +941,8 @@ int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 void blk_mq_unfreeze_queue_non_owner(struct request_queue *q);
 void blk_freeze_queue_start_non_owner(struct request_queue *q);
 
+unsigned int blk_mq_num_possible_queues(unsigned int max_queues);
+unsigned int blk_mq_num_online_queues(unsigned int max_queues);
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap);
 void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap,
 			  struct device *dev, unsigned int offset);

From patchwork Thu Apr 24 18:19:42 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 884269
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:42 +0200
Subject: [PATCH v6 3/9] nvme-pci: use block layer helpers to calculate num of queues
X-Mailing-List: linux-scsi@vger.kernel.org
Message-Id: <20250424-isolcpus-io-queues-v6-3-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg, "Michael S. Tsirkin"
Cc: "Martin K. Petersen", Thomas Gleixner, Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long, Ming Lei, Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Mathieu Desnoyers, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=io_queue is set. This prevents the isolated CPUs from being disturbed by OS workload.

Use the helpers which calculate the correct number of queues when isolcpus is in use.

Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 drivers/nvme/host/pci.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b178d52eac1b7f7286e217226b9b3686d07b7b6c..2b1aa6833a12a5ecf7b293461a115026f97ea94c 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -81,7 +81,7 @@ static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 	int ret;
 
 	ret = kstrtouint(val, 10, &n);
-	if (ret != 0 || n > num_possible_cpus())
+	if (ret != 0 || n > blk_mq_num_possible_queues(0))
 		return -EINVAL;
 	return param_set_uint(val, kp);
 }
@@ -2448,7 +2448,8 @@ static unsigned int nvme_max_io_queues(struct nvme_dev *dev)
 	 */
 	if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS)
 		return 1;
-	return num_possible_cpus() + dev->nr_write_queues + dev->nr_poll_queues;
+	return blk_mq_num_possible_queues(0) + dev->nr_write_queues +
+		dev->nr_poll_queues;
 }
 
 static int nvme_setup_io_queues(struct nvme_dev *dev)

From patchwork Thu Apr 24 18:19:43 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 885271
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:43 +0200
Subject: [PATCH v6 4/9] scsi: use block layer helpers to calculate num of queues
X-Mailing-List: linux-scsi@vger.kernel.org
Message-Id: <20250424-isolcpus-io-queues-v6-4-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg, "Michael S. Tsirkin"
Cc: "Martin K. Petersen", Thomas Gleixner, Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long, Ming Lei, Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Mathieu Desnoyers, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This prevents the isolated CPUs from being disturbed by OS workload.

Use the helpers which calculate the correct number of queues when isolcpus is in use.

Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 15 +++++++++------
 drivers/scsi/qla2xxx/qla_isr.c            | 10 +++++-----
 drivers/scsi/smartpqi/smartpqi_init.c     |  5 ++---
 3 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 28c75865967af36c6390c5ee5767577ec1bcf779..a5f1117f3ddb20da04e0b29fd9d52d47ed1af3d8 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -5962,7 +5962,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance)
 	else
 		instance->iopoll_q_count = 0;
 
-	num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+	num_msix_req = blk_mq_num_online_queues(0) +
+		instance->low_latency_index_start;
 	instance->msix_vectors = min(num_msix_req,
 				     instance->msix_vectors);
 
@@ -5978,7 +5979,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance)
 		/* Disable Balanced IOPS mode and try realloc vectors */
 		instance->perf_mode = MR_LATENCY_PERF_MODE;
 		instance->low_latency_index_start = 1;
-		num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+		num_msix_req = blk_mq_num_online_queues(0) +
+			instance->low_latency_index_start;
 
 		instance->msix_vectors = min(num_msix_req,
 					     instance->msix_vectors);
@@ -6234,7 +6236,7 @@ static int megasas_init_fw(struct megasas_instance *instance)
 		intr_coalescing = (scratch_pad_1 & MR_INTR_COALESCING_SUPPORT_OFFSET) ?
 			true : false;
 		if (intr_coalescing &&
-			(num_online_cpus() >= MR_HIGH_IOPS_QUEUE_COUNT) &&
+			(blk_mq_num_online_queues(0) >= MR_HIGH_IOPS_QUEUE_COUNT) &&
 			(instance->msix_vectors == MEGASAS_MAX_MSIX_QUEUES))
 			instance->perf_mode = MR_BALANCED_PERF_MODE;
 		else
@@ -6278,7 +6280,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
 		else
 			instance->low_latency_index_start = 1;
 
-		num_msix_req = num_online_cpus() + instance->low_latency_index_start;
+		num_msix_req = blk_mq_num_online_queues(0) +
+			instance->low_latency_index_start;
 
 		instance->msix_vectors = min(num_msix_req,
 				instance->msix_vectors);
@@ -6310,8 +6313,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
 	megasas_setup_reply_map(instance);
 
 	dev_info(&instance->pdev->dev,
-		"current msix/online cpus\t: (%d/%d)\n",
-		instance->msix_vectors, (unsigned int)num_online_cpus());
+		"current msix/max num queues\t: (%d/%u)\n",
+		instance->msix_vectors, blk_mq_num_online_queues(0));
 	dev_info(&instance->pdev->dev, "RDPQ mode\t: (%s)\n",
 		instance->is_rdpq ? "enabled" : "disabled");

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index fe98c76e9be32ff03a1960f366f0d700d1168383..c4c6b5c6658c0734f7ff68bcc31b33dde87296dd 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -4533,13 +4533,13 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 	if (USER_CTRL_IRQ(ha) || !ha->mqiobase) {
 		/* user wants to control IRQ setting for target mode */
 		ret = pci_alloc_irq_vectors(ha->pdev, min_vecs,
-		    min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)),
-		    PCI_IRQ_MSIX);
+		    blk_mq_num_online_queues(ha->msix_count) + min_vecs,
+		    PCI_IRQ_MSIX);
 	} else
 		ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs,
-		    min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)),
-		    PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
-		    &desc);
+		    blk_mq_num_online_queues(ha->msix_count) + min_vecs,
+		    PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
+		    &desc);
 
 	if (ret < 0) {
 		ql_log(ql_log_fatal, vha, 0x00c7,

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 0da7be40c925807519f5bff8d428a29e5ce454a5..7212cb96d0f9a337578fa2b982afa3ee6d17f4be 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5278,15 +5278,14 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
 	if (reset_devices) {
 		num_queue_groups = 1;
 	} else {
-		int num_cpus;
 		int max_queue_groups;
 
 		max_queue_groups = min(ctrl_info->max_inbound_queues / 2,
 			ctrl_info->max_outbound_queues - 1);
 		max_queue_groups = min(max_queue_groups, PQI_MAX_QUEUE_GROUPS);
 
-		num_cpus = num_online_cpus();
-		num_queue_groups = min(num_cpus, ctrl_info->max_msix_vectors);
+		num_queue_groups =
+			blk_mq_num_online_queues(ctrl_info->max_msix_vectors);
 		num_queue_groups = min(num_queue_groups, max_queue_groups);
 	}

From patchwork Thu Apr 24 18:19:44 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 884268
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:44 +0200
Subject: [PATCH v6 5/9] virtio: blk/scsi: use block layer helpers to calculate num of queues
X-Mailing-List: linux-scsi@vger.kernel.org
Message-Id: <20250424-isolcpus-io-queues-v6-5-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg, "Michael S. Tsirkin"
Cc: "Martin K. Petersen", Thomas Gleixner, Costa Shulyupin, Juri Lelli, Valentin Schneider, Waiman Long, Ming Lei, Frederic Weisbecker, Mel Gorman, Hannes Reinecke, Mathieu Desnoyers, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner
X-Mailer: b4 0.14.2

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=io_queue is set. This prevents the isolated CPUs from being disturbed by OS workload.

Use the helpers which calculate the correct number of queues when isolcpus is in use.

Reviewed-by: Christoph Hellwig
Acked-by: Michael S. Tsirkin
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 drivers/block/virtio_blk.c | 5 ++---
 drivers/scsi/virtio_scsi.c | 1 +
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7cffea01d868c6dcfe6734d3c89c1709fec07956..975036e8ddef5d622bab623843826ac26a0aa63d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -976,9 +976,8 @@ static int init_vq(struct virtio_blk *vblk)
 		return -EINVAL;
 	}
 
-	num_vqs = min_t(unsigned int,
-			min_not_zero(num_request_queues, nr_cpu_ids),
-			num_vqs);
+	num_vqs = blk_mq_num_possible_queues(
+			min_not_zero(num_request_queues, num_vqs));
 
 	num_poll_vqs = min_t(unsigned int, poll_queues, num_vqs - 1);

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 21ce3e9401929cd273fde08b0944e8b47e1e66cc..96a69edddbe5555574fc8fed1ba7c82a99df4472 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -919,6 +919,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
 	/* We need to know how many queues before we allocate. */
 	num_queues = virtscsi_config_get(vdev, num_queues) ? : 1;
 	num_queues = min_t(unsigned int, nr_cpu_ids, num_queues);
+	num_queues = blk_mq_num_possible_queues(num_queues);
 
 	num_targets = virtscsi_config_get(vdev, max_target) + 1;

From patchwork Thu Apr 24 18:19:45 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 885270
bh=ESLdMXYBM3B8MU09r7uQESEbFY24U9of5fSwsUyPBL4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=WSddQFb9HoKtNj6oN8xQVX7nVnG+gaRSYoD7m8LEVY3dlkMKfMZkSXpjhcca8ycRg 03cQHbU7111SE9ATw/9KS29owS6bvX/iWpOcrwwWOwbLO6sJN2RV9Ja+iPw9cVgijj UeMPYH1J7Oon7hLiTMvqbiTeF/tBqbjvz4gibIgBqfPP7qtjteQMbIMFmBG5ZmQycU xf9bZDpIJ4yu8iQ/2OTfWPk/OK0m0kdItX+XqhGlOYRZKh8ycg8JYZTUvUVZL78lVN Yn24hf+C8wGB7Y2Z5CRVBOgTiqpLEzmakSCboCh7qsT2KF2l73scWgmAEUQWO+WVS4 BY+yOrg3h2f+Q== From: Daniel Wagner Date: Thu, 24 Apr 2025 20:19:45 +0200 Subject: [PATCH v6 6/9] isolation: introduce io_queue isolcpus type Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250424-isolcpus-io-queues-v6-6-9a53a870ca1f@kernel.org> References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org> In-Reply-To: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , "Michael S. Tsirkin" Cc: "Martin K. Petersen" , Thomas Gleixner , Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , Mathieu Desnoyers , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, GR-QLogic-Storage-Upstream@marvell.com, Daniel Wagner X-Mailer: b4 0.14.2 Multiqueue drivers spreading IO queues on all CPUs for optimal performance. The drivers are not aware of the CPU isolated requirement and will spread all queues ignoring the isolcpus configuration. Introduce a new isolcpus mask which allows the user to define on which CPUs IO queues should be placed. This is similar to the managed_irq but for drivers which do not use the managed IRQ infrastructure. 
Signed-off-by: Daniel Wagner --- include/linux/sched/isolation.h | 1 + kernel/sched/isolation.c | 7 +++++++ 2 files changed, 8 insertions(+) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h index d8501f4709b583b8a1c91574446382f093bccdb1..6b6ae9c5b2f61a93c649a98ea27482b932627fca 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -9,6 +9,7 @@ enum hk_type { HK_TYPE_DOMAIN, HK_TYPE_MANAGED_IRQ, + HK_TYPE_IO_QUEUE, HK_TYPE_KERNEL_NOISE, HK_TYPE_MAX, diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 81bc8b329ef17cd3a3f5ae0a20ca02af3a1a69bc..687b11f900e31ab656e25cae263f15f6d8f46a9a 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -11,6 +11,7 @@ enum hk_flags { HK_FLAG_DOMAIN = BIT(HK_TYPE_DOMAIN), HK_FLAG_MANAGED_IRQ = BIT(HK_TYPE_MANAGED_IRQ), + HK_FLAG_IO_QUEUE = BIT(HK_TYPE_IO_QUEUE), HK_FLAG_KERNEL_NOISE = BIT(HK_TYPE_KERNEL_NOISE), }; @@ -224,6 +225,12 @@ static int __init housekeeping_isolcpus_setup(char *str) continue; } + if (!strncmp(str, "io_queue,", 9)) { + str += 9; + flags |= HK_FLAG_IO_QUEUE; + continue; + } + /* * Skip unknown sub-parameter and validate that it is not * containing an invalid character. 
From patchwork Thu Apr 24 18:19:46 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 884267
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:46 +0200
Subject: [PATCH v6 7/9] lib/group_cpus: honor housekeeping config when grouping CPUs
Message-Id: <20250424-isolcpus-io-queues-v6-7-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>

group_cpus_evenly distributes all present CPUs into groups. This ignores
the isolcpus configuration and assigns isolated CPUs to the groups.

Make group_cpus_evenly aware of the isolcpus configuration and use the
housekeeping CPU mask as the base for distributing the available CPUs
into groups.

Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
Signed-off-by: Daniel Wagner
---
 lib/group_cpus.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index 016c6578a07616959470b47121459a16a1bc99e5..707997bca55344b18f63ccfa539ba77a89d8acb6 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_SMP

@@ -330,7 +331,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
 }
 
 /**
- * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * group_possible_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
  * @numgrps: number of groups
  * @nummasks: number of initialized cpumasks
  *
@@ -346,8 +347,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int numgrps,
-				  unsigned int *nummasks)
+static struct cpumask *group_possible_cpus_evenly(unsigned int numgrps,
+						  unsigned int *nummasks)
 {
 	unsigned int curgrp = 0, nr_present = 0, nr_others = 0;
 	cpumask_var_t *node_to_cpumask;
@@ -427,6 +428,81 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps,
 	*nummasks = nr_present + nr_others;
 	return masks;
 }
+
+/**
+ * group_mask_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ * @cpu_mask: CPUs to consider for the grouping
+ * @nummasks: number of initialized cpumasks
+ *
+ * Return: cpumask array if successful, NULL otherwise. And each element
+ * includes CPUs assigned to this group.
+ *
+ * Try to put close CPUs from viewpoint of CPU and NUMA locality into
+ * same group. Allocate present CPUs on these groups evenly.
+ */
+static struct cpumask *group_mask_cpus_evenly(unsigned int numgrps,
+					      const struct cpumask *cpu_mask,
+					      unsigned int *nummasks)
+{
+	cpumask_var_t *node_to_cpumask;
+	cpumask_var_t nmsk;
+	int ret = -ENOMEM;
+	struct cpumask *masks = NULL;
+
+	if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL))
+		return NULL;
+
+	node_to_cpumask = alloc_node_to_cpumask();
+	if (!node_to_cpumask)
+		goto fail_nmsk;
+
+	masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL);
+	if (!masks)
+		goto fail_node_to_cpumask;
+
+	build_node_to_cpumask(node_to_cpumask);
+
+	ret = __group_cpus_evenly(0, numgrps, node_to_cpumask, cpu_mask, nmsk,
+				  masks);
+
+fail_node_to_cpumask:
+	free_node_to_cpumask(node_to_cpumask);
+
+fail_nmsk:
+	free_cpumask_var(nmsk);
+	if (ret < 0) {
+		kfree(masks);
+		return NULL;
+	}
+	*nummasks = ret;
+	return masks;
+}
+
+/**
+ * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality
+ * @numgrps: number of groups
+ * @nummasks: number of initialized cpumasks
+ *
+ * Return: cpumask array if successful, NULL otherwise.
+ *
+ * group_possible_cpus_evenly() is used for distributing the CPUs on all
+ * possible CPUs in absence of the isolcpus command line argument.
+ * group_mask_cpus_evenly() is used when the isolcpus command line
+ * argument is used with the io_queue option. In this case only the
+ * housekeeping CPUs are considered.
+ */
+struct cpumask *group_cpus_evenly(unsigned int numgrps,
+				  unsigned int *nummasks)
+{
+	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
+		return group_mask_cpus_evenly(numgrps,
+				housekeeping_cpumask(HK_TYPE_IO_QUEUE),
+				nummasks);
+	}
+
+	return group_possible_cpus_evenly(numgrps, nummasks);
+}
 #else /* CONFIG_SMP */
 struct cpumask *group_cpus_evenly(unsigned int numgrps,
 				  unsigned int *nummasks)

From patchwork Thu Apr 24 18:19:47 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 885269
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:47 +0200
Subject: [PATCH v6 8/9] blk-mq: use hk cpus only when isolcpus=io_queue is enabled
Message-Id: <20250424-isolcpus-io-queues-v6-8-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>

When isolcpus=io_queue is enabled, all hardware queues should run on the
housekeeping CPUs only. Thus, ignore the affinity mask provided by the
driver. Also, we can't use blk_mq_map_queues because it will map all
CPUs to the first hctx unless the CPU is the same one the hctx has its
affinity set to. E.g. with 8 CPUs and an isolcpus=io_queue,2-3,6-7
config:

queue mapping for /dev/nvme0n1
        hctx0: default 2 3 4 6 7
        hctx1: default 5
        hctx2: default 0
        hctx3: default 1

PCI name is 00:05.0: nvme0n1
        irq 57 affinity 0-1 effective 1 is_managed:0 nvme0q0
        irq 58 affinity 4 effective 4 is_managed:1 nvme0q1
        irq 59 affinity 5 effective 5 is_managed:1 nvme0q2
        irq 60 affinity 0 effective 0 is_managed:1 nvme0q3
        irq 61 affinity 1 effective 1 is_managed:1 nvme0q4

whereas with blk_mq_hk_map_queues we get

queue mapping for /dev/nvme0n1
        hctx0: default 2 4
        hctx1: default 3 5
        hctx2: default 0 6
        hctx3: default 1 7

PCI name is 00:05.0: nvme0n1
        irq 56 affinity 0-1 effective 1 is_managed:0 nvme0q0
        irq 61 affinity 4 effective 4 is_managed:1 nvme0q1
        irq 62 affinity 5 effective 5 is_managed:1 nvme0q2
        irq 63 affinity 0 effective 0 is_managed:1 nvme0q3
        irq 64 affinity 1 effective 1 is_managed:1 nvme0q4

Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Signed-off-by: Daniel Wagner
---
 block/blk-mq-cpumap.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 6e6b3e989a5676186b5a31296a1b94b7602f1542..2d678d1db2b5196fc2b2ce5678fdb0cb6bad26e0 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -22,8 +22,8 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,
 {
 	unsigned int num;
 
-	if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
-		mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
+	if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
+		mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
 
 	num = cpumask_weight(mask);
 	return min_not_zero(num, max_queues);
@@ -61,11 +61,73 @@ unsigned int blk_mq_num_online_queues(unsigned int max_queues)
 }
 EXPORT_SYMBOL_GPL(blk_mq_num_online_queues);
 
+/*
+ * blk_mq_map_hk_queues - Create housekeeping CPU to hardware queue mapping
+ * @qmap:	CPU to hardware queue map
+ *
+ * Create a housekeeping CPU to hardware queue mapping in @qmap. If the
+ * isolcpus feature is enabled and blk_mq_map_hk_queues returns true,
+ * @qmap contains a valid configuration honoring the io_queue
+ * configuration. If the isolcpus feature is disabled this function
+ * returns false.
+ */
+static bool blk_mq_map_hk_queues(struct blk_mq_queue_map *qmap)
+{
+	struct cpumask *hk_masks;
+	cpumask_var_t isol_mask;
+	unsigned int queue, cpu, nr_masks;
+
+	if (!housekeeping_enabled(HK_TYPE_IO_QUEUE))
+		return false;
+
+	/* map housekeeping cpus to matching hardware context */
+	hk_masks = group_cpus_evenly(qmap->nr_queues, &nr_masks);
+	if (!hk_masks)
+		goto fallback;
+
+	for (queue = 0; queue < qmap->nr_queues; queue++) {
+		for_each_cpu(cpu, &hk_masks[queue % nr_masks])
+			qmap->mq_map[cpu] = qmap->queue_offset + queue;
+	}
+
+	kfree(hk_masks);
+
+	/* map isolcpus to hardware context */
+	if (!alloc_cpumask_var(&isol_mask, GFP_KERNEL))
+		goto fallback;
+
+	queue = 0;
+	cpumask_andnot(isol_mask,
+		       cpu_possible_mask,
+		       housekeeping_cpumask(HK_TYPE_IO_QUEUE));
+
+	for_each_cpu(cpu, isol_mask) {
+		qmap->mq_map[cpu] = qmap->queue_offset + queue;
+		queue = (queue + 1) % qmap->nr_queues;
+	}
+
+	free_cpumask_var(isol_mask);
+
+	return true;
+
+fallback:
+	/* map all cpus to hardware context ignoring any affinity */
+	queue = 0;
+	for_each_possible_cpu(cpu) {
+		qmap->mq_map[cpu] = qmap->queue_offset + queue;
+		queue = (queue + 1) % qmap->nr_queues;
+	}
+	return true;
+}
+
 void blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
 	const struct cpumask *masks;
 	unsigned int queue, cpu, nr_masks;
 
+	if (blk_mq_map_hk_queues(qmap))
+		return;
+
 	masks = group_cpus_evenly(qmap->nr_queues, &nr_masks);
 	if (!masks) {
 		for_each_possible_cpu(cpu)
@@ -120,6 +182,9 @@ void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap,
 	if (!dev->bus->irq_get_affinity)
 		goto fallback;
 
+	if (blk_mq_map_hk_queues(qmap))
+		return;
+
 	for (queue = 0; queue < qmap->nr_queues; queue++) {
 		mask = dev->bus->irq_get_affinity(dev, queue + offset);
 		if (!mask)

From patchwork Thu Apr 24 18:19:48 2025
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 884266
From: Daniel Wagner
Date: Thu, 24 Apr 2025 20:19:48 +0200
Subject: [PATCH v6 9/9] blk-mq: prevent offlining hk CPU with associated online isolated CPUs
Message-Id: <20250424-isolcpus-io-queues-v6-9-9a53a870ca1f@kernel.org>
References: <20250424-isolcpus-io-queues-v6-0-9a53a870ca1f@kernel.org>

When isolcpus=io_queue is enabled and the last housekeeping CPU for a
given hctx would go offline, there would be no CPU left to handle the
IOs. To prevent IO stalls, do not allow offlining housekeeping CPUs
which are still serving isolated CPUs.
Signed-off-by: Daniel Wagner
Reviewed-by: Hannes Reinecke
---
 block/blk-mq.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c2697db591091200cdb9f6e082e472b829701e4c..aff17673b773583dfb2b01cb2f5f010c456bd834 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3627,6 +3627,48 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx)
 	return data.has_rq;
 }
 
+static bool blk_mq_hctx_check_isolcpus_online(struct blk_mq_hw_ctx *hctx,
+					      unsigned int cpu)
+{
+	const struct cpumask *hk_mask;
+	int i;
+
+	if (!housekeeping_enabled(HK_TYPE_IO_QUEUE))
+		return true;
+
+	hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
+
+	for (i = 0; i < hctx->nr_ctx; i++) {
+		struct blk_mq_ctx *ctx = hctx->ctxs[i];
+
+		if (ctx->cpu == cpu)
+			continue;
+
+		/*
+		 * Check if this context has at least one online
+		 * housekeeping CPU; in this case the hardware context is
+		 * usable.
+		 */
+		if (cpumask_test_cpu(ctx->cpu, hk_mask) &&
+		    cpu_online(ctx->cpu))
+			break;
+
+		/*
+		 * The context doesn't have any online housekeeping CPUs
+		 * but there might be an online isolated CPU mapped to
+		 * it.
+		 */
+		if (cpu_is_offline(ctx->cpu))
+			continue;
+
+		pr_warn("%s: trying to offline hctx%d but there is still an online isolcpu CPU %d mapped to it\n",
+			hctx->queue->disk->disk_name,
+			hctx->queue_num, ctx->cpu);
+		return true;
+	}
+
+	return false;
+}
+
 static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
 		unsigned int this_cpu)
 {
@@ -3647,7 +3689,7 @@ static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
 
 		/* this hctx has at least one online CPU */
 		if (this_cpu != cpu)
-			return true;
+			return blk_mq_hctx_check_isolcpus_online(hctx, this_cpu);
 	}
 
 	return false;
@@ -3659,7 +3701,7 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
 			struct blk_mq_hw_ctx, cpuhp_online);
 
 	if (blk_mq_hctx_has_online_cpu(hctx, cpu))
-		return 0;
+		return -EINVAL;
 
 	/*
 	 * Prevent new request from being allocated on the current hctx.