Message ID | 20230121042436.2661843-4-yury.norov@gmail.com |
---|---|
State | Accepted |
Commit | cd7f55359c90a4108e6528e326b8623fce1ad72a |
Series | sched: cpumask: improve on cpumask_local_spread() locality |
On 2023-01-20 at 20:24:30 -0800, Yury Norov wrote:
> The function finds Nth set CPU in a given cpumask starting from a given
> node.
>
> Leveraging the fact that each hop in sched_domains_numa_masks includes the
> same or greater number of CPUs than the previous one, we can use binary
> search on hops instead of linear walk, which makes the overall complexity
> of O(log n) in terms of number of cpumask_weight() calls.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Acked-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca>
> ---
>  include/linux/topology.h |  8 ++++++
>  kernel/sched/topology.c  | 57 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 65 insertions(+)
>
[snip]
> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth next cpu
> + * closest to @cpu from @cpumask.
Just a minor question: the @cpu below is used as the index, right?
What does "close to @cpu" mean above?
> + * cpumask: cpumask to find a cpu from
> + * cpu: Nth cpu to find
Maybe also add a description for @node?

thanks,
Chenyu

On Fri, 20 Jan 2023 20:24:30 -0800 Yury Norov wrote:
> The function finds Nth set CPU in a given cpumask starting from a given
> node.
>
> Leveraging the fact that each hop in sched_domains_numa_masks includes the
> same or greater number of CPUs than the previous one, we can use binary
> search on hops instead of linear walk, which makes the overall complexity
> of O(log n) in terms of number of cpumask_weight() calls.

Valentin, would you be willing to give us a SoB or Review tag for this
one? We'd like to take the whole series via networking, if that's okay.

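To make the complexity argument quoted above concrete, here is a small standalone userspace sketch (not part of the patch; the hop_weight[] values are invented for illustration) of why non-decreasing per-hop CPU counts let a binary search find the first hop that covers the Nth CPU:

/*
 * Toy illustration: per-hop CPU counts are non-decreasing, so the first
 * hop whose weight exceeds N can be located with bsearch() instead of a
 * linear walk. hop_weight[] is made up for the example.
 */
#include <stdio.h>
#include <stdlib.h>

static const int hop_weight[] = { 4, 8, 16, 32 };	/* CPUs reachable within hop 0..3 */

static int hop_cmp(const void *key, const void *elem)
{
	int n = *(const int *)key;
	const int *w = elem;

	if (*w <= n)			/* Nth CPU lies beyond this hop */
		return 1;
	if (w == hop_weight || *(w - 1) <= n)
		return 0;		/* first hop whose weight exceeds n */
	return -1;			/* an earlier hop already suffices */
}

int main(void)
{
	int n = 10;			/* looking for the 11th closest CPU */
	const int *hop = bsearch(&n, hop_weight,
				 sizeof(hop_weight) / sizeof(hop_weight[0]),
				 sizeof(hop_weight[0]), hop_cmp);

	if (!hop)
		return 1;		/* n exceeds the largest hop: nothing found */

	printf("CPU index %d is first covered at hop %td\n", n, hop - hop_weight);
	return 0;
}

The patch below applies the same idea to sched_domains_numa_masks, computing the per-hop weights on the fly with cpumask_weight_and().
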
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 4564faafd0e1..72f264575698 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -245,5 +245,13 @@ static inline const struct cpumask *cpu_cpu_mask(int cpu)
 	return cpumask_of_node(cpu_to_node(cpu));
 }
 
+#ifdef CONFIG_NUMA
+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node);
+#else
+static __always_inline int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
+{
+	return cpumask_nth(cpu, cpus);
+}
+#endif /* CONFIG_NUMA */
 
 #endif /* _LINUX_TOPOLOGY_H */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 8739c2a5a54e..2bf89186a10f 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -3,6 +3,8 @@
  * Scheduler topology setup/handling methods
  */
 
+#include <linux/bsearch.h>
+
 DEFINE_MUTEX(sched_domains_mutex);
 
 /* Protected by sched_domains_mutex: */
@@ -2067,6 +2069,61 @@ int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
 	return found;
 }
 
+struct __cmp_key {
+	const struct cpumask *cpus;
+	struct cpumask ***masks;
+	int node;
+	int cpu;
+	int w;
+};
+
+static int hop_cmp(const void *a, const void *b)
+{
+	struct cpumask **prev_hop = *((struct cpumask ***)b - 1);
+	struct cpumask **cur_hop = *(struct cpumask ***)b;
+	struct __cmp_key *k = (struct __cmp_key *)a;
+
+	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
+		return 1;
+
+	k->w = (b == k->masks) ? 0 : cpumask_weight_and(k->cpus, prev_hop[k->node]);
+	if (k->w <= k->cpu)
+		return 0;
+
+	return -1;
+}
+
+/*
+ * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth next cpu
+ * closest to @cpu from @cpumask.
+ * cpumask: cpumask to find a cpu from
+ * cpu: Nth cpu to find
+ *
+ * returns: cpu, or nr_cpu_ids when nothing found.
+ */
+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
+{
+	struct __cmp_key k = { .cpus = cpus, .node = node, .cpu = cpu };
+	struct cpumask ***hop_masks;
+	int hop, ret = nr_cpu_ids;
+
+	rcu_read_lock();
+
+	k.masks = rcu_dereference(sched_domains_numa_masks);
+	if (!k.masks)
+		goto unlock;
+
+	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
+	hop = hop_masks - k.masks;
+
+	ret = hop ?
+		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
+		cpumask_nth_and(cpu, cpus, k.masks[0][node]);
+unlock:
+	rcu_read_unlock();
+	return ret;
+}
+EXPORT_SYMBOL_GPL(sched_numa_find_nth_cpu);
 #endif /* CONFIG_NUMA */
 
 static int __sdt_alloc(const struct cpumask *cpu_map)
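For context, a hedged usage sketch (the function and parameter names here are illustrative, not from this patch) of how a caller such as a network driver might use the new helper to pick the Nth online CPU ordered by NUMA distance from a device's node, which is the kind of spreading the cpumask_local_spread() rework in this series is aimed at:

/*
 * Hypothetical caller sketch (not part of the patch): pick the Nth online
 * CPU, closest NUMA hops from dev_node first, e.g. when assigning IRQ
 * affinity hints to queues.
 */
#include <linux/cpumask.h>
#include <linux/topology.h>

static int example_pick_queue_cpu(int queue_idx, int dev_node)
{
	int cpu;

	/* Nth set CPU in cpu_online_mask, ordered by hop distance from dev_node */
	cpu = sched_numa_find_nth_cpu(cpu_online_mask, queue_idx, dev_node);
	if (cpu >= nr_cpu_ids)
		cpu = cpumask_first(cpu_online_mask);	/* nothing found: fall back */

	return cpu;
}

On !CONFIG_NUMA builds the static inline fallback in topology.h reduces this to cpumask_nth(cpu, cpus), so callers need no #ifdef.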