[v2,1/1] mm: only display online cpus of the numa node

Message ID 1506678805-15392-2-git-send-email-thunder.leizhen@huawei.com
State Accepted
Commit 064f0e9302af4f4ab5e9dca03a5a77d6bebfd35e
Series mm: only display online cpus of the numa node

Commit Message

Zhen Lei Sept. 29, 2017, 9:53 a.m. UTC
When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
and displays cpumask_of_node for each node), I got different results on
x86 and arm64. For each NUMA node, the former displayed only online CPUs,
while the latter displayed all possible CPUs. Unfortunately, neither the Linux
documentation nor the numactl manual describes this behaviour clearly.

I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
that he preferred to print online cpus because it doesn't really make much
sense to bind anything on offline nodes.
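
For illustration only (not part of the patch): a minimal userspace sketch of
how a tool might read the per-node CPU set discussed above. It assumes the
sibling sysfs file nodeX/cpulist (the range-list form served by the same
node_read_cpumap() helper shown in the diff) and a system with fewer than 64
CPUs. The names parse_cpulist() and node_cpus() are hypothetical and are not
taken from numactl.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a sysfs cpulist string such as "0-3,8-11\n"
 * into a 64-bit mask. Returns 0 on success, -1 on malformed input or on
 * CPU numbers >= 64 (a limit of this sketch, not of the kernel).
 */
int parse_cpulist(const char *s, uint64_t *mask)
{
	assert(s != NULL && mask != NULL);
	*mask = 0;

	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long lo = strtoul(s, &end, 10);
		unsigned long hi = lo;

		if (end == s)
			return -1;	/* no digits where a CPU number was expected */
		if (*end == '-') {	/* a range "lo-hi" */
			s = end + 1;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo > hi || hi >= 64)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		if (*end == ',')
			end++;
		s = end;
	}
	return 0;
}

/* Hypothetical helper: read the cpulist of one node from sysfs. */
int node_cpus(int node, uint64_t *mask)
{
	char path[64], buf[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/cpulist", node);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_cpulist(buf, mask);
}
```

A caller would invoke node_cpus(0, &mask) and treat each set bit as a CPU of
node 0. Whether offline CPUs appear in that mask is exactly the
architecture-dependent behaviour this patch makes uniform.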

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>

---
 drivers/base/node.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.5.0

Comments

Will Deacon Oct. 2, 2017, 10:38 a.m. UTC | #1
[+akpm]

Hi Thunder,

On Fri, Sep 29, 2017 at 05:53:25PM +0800, Zhen Lei wrote:
> When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
> and display cpumask_of_node for each node), but I got different result on
> X86 and arm64. For each numa node, the former only displayed online CPUs,
> and the latter displayed all possible CPUs. Unfortunately, both Linux
> documentation and numactl manual have not described it clear.
> 
> I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
> that he preferred to print online cpus because it doesn't really make much
> sense to bind anything on offline nodes.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  drivers/base/node.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Which tree is this intended to go through? I'm happy to take it via arm64,
but I don't want to tread on anybody's toes in linux-next and it looks like
there are already queued changes to this file via Andrew's tree.

Will

> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 3855902..aae2402 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -27,13 +27,21 @@ static struct bus_type node_subsys = {
>  
>  static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
>  {
> +	ssize_t n;
> +	cpumask_var_t mask;
>  	struct node *node_dev = to_node(dev);
> -	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
>  
>  	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
>  	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
>  
> -	return cpumap_print_to_pagebuf(list, buf, mask);
> +	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> +		return 0;
> +
> +	cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
> +	n = cpumap_print_to_pagebuf(list, buf, mask);
> +	free_cpumask_var(mask);
> +
> +	return n;
>  }
>  
>  static inline ssize_t node_read_cpumask(struct device *dev,
> -- 
> 2.5.0
> 
>

Andrew Morton Oct. 2, 2017, 9:54 p.m. UTC | #2
On Mon, 2 Oct 2017 11:38:07 +0100 Will Deacon <will.deacon@arm.com> wrote:

> > When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
> > and display cpumask_of_node for each node), but I got different result on
> > X86 and arm64. For each numa node, the former only displayed online CPUs,
> > and the latter displayed all possible CPUs. Unfortunately, both Linux
> > documentation and numactl manual have not described it clear.
> > 
> > I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
> > that he preferred to print online cpus because it doesn't really make much
> > sense to bind anything on offline nodes.
> > 
> > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> > Acked-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  drivers/base/node.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> Which tree is this intended to go through? I'm happy to take it via arm64,
> but I don't want to tread on anybody's toes in linux-next and it looks like
> there are already queued changes to this file via Andrew's tree.

I grabbed it.  I suppose there's some small risk of userspace breakage
so I suggest it be a 4.15-rc1 thing?
Michael Ellerman Oct. 3, 2017, 1:22 a.m. UTC | #3
Zhen Lei <thunder.leizhen@huawei.com> writes:

> When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
> and display cpumask_of_node for each node), but I got different result on
> X86 and arm64. For each numa node, the former only displayed online CPUs,
> and the latter displayed all possible CPUs. Unfortunately, both Linux
> documentation and numactl manual have not described it clear.

FWIW powerpc happens to implement the x86 behaviour, online CPUs only.

cheers
Will Deacon Oct. 3, 2017, 1:47 p.m. UTC | #4
On Mon, Oct 02, 2017 at 02:54:46PM -0700, Andrew Morton wrote:
> On Mon, 2 Oct 2017 11:38:07 +0100 Will Deacon <will.deacon@arm.com> wrote:
> 
> > > When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
> > > and display cpumask_of_node for each node), but I got different result on
> > > X86 and arm64. For each numa node, the former only displayed online CPUs,
> > > and the latter displayed all possible CPUs. Unfortunately, both Linux
> > > documentation and numactl manual have not described it clear.
> > > 
> > > I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
> > > that he preferred to print online cpus because it doesn't really make much
> > > sense to bind anything on offline nodes.
> > > 
> > > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  drivers/base/node.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > Which tree is this intended to go through? I'm happy to take it via arm64,
> > but I don't want to tread on anybody's toes in linux-next and it looks like
> > there are already queued changes to this file via Andrew's tree.
> 
> I grabbed it.  I suppose there's some small risk of userspace breakage
> so I suggest it be a 4.15-rc1 thing?

To be honest, I suspect the vast majority (if not all) code that reads this
file was developed for x86, so having the same behaviour for arm64 sounds
like something we should do ASAP before people try to special case with
things like #ifdef __aarch64__.

I'd rather have this in 4.14 if possible.

Cheers,

Will
Michal Hocko Oct. 3, 2017, 1:56 p.m. UTC | #5
On Tue 03-10-17 14:47:26, Will Deacon wrote:
> On Mon, Oct 02, 2017 at 02:54:46PM -0700, Andrew Morton wrote:
> > On Mon, 2 Oct 2017 11:38:07 +0100 Will Deacon <will.deacon@arm.com> wrote:
> > 
> > > > When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
> > > > and display cpumask_of_node for each node), but I got different result on
> > > > X86 and arm64. For each numa node, the former only displayed online CPUs,
> > > > and the latter displayed all possible CPUs. Unfortunately, both Linux
> > > > documentation and numactl manual have not described it clear.
> > > > 
> > > > I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
> > > > that he preferred to print online cpus because it doesn't really make much
> > > > sense to bind anything on offline nodes.
> > > > 
> > > > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> > > > Acked-by: Michal Hocko <mhocko@suse.com>
> > > > ---
> > > >  drivers/base/node.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > Which tree is this intended to go through? I'm happy to take it via arm64,
> > > but I don't want to tread on anybody's toes in linux-next and it looks like
> > > there are already queued changes to this file via Andrew's tree.
> > 
> > I grabbed it.  I suppose there's some small risk of userspace breakage
> > so I suggest it be a 4.15-rc1 thing?
> 
> To be honest, I suspect the vast majority (if not all) code that reads this
> file was developed for x86, so having the same behaviour for arm64 sounds
> like something we should do ASAP before people try to special case with
> things like #ifdef __aarch64__.
> 
> I'd rather have this in 4.14 if possible.

Agreed!

-- 
Michal Hocko
SUSE Labs
Zhen Lei Oct. 9, 2017, 6:06 a.m. UTC | #6
On 2017/10/3 21:56, Michal Hocko wrote:
> On Tue 03-10-17 14:47:26, Will Deacon wrote:
>> On Mon, Oct 02, 2017 at 02:54:46PM -0700, Andrew Morton wrote:
>>> On Mon, 2 Oct 2017 11:38:07 +0100 Will Deacon <will.deacon@arm.com> wrote:
>>>
>>>>> When I executed numactl -H(which read /sys/devices/system/node/nodeX/cpumap
>>>>> and display cpumask_of_node for each node), but I got different result on
>>>>> X86 and arm64. For each numa node, the former only displayed online CPUs,
>>>>> and the latter displayed all possible CPUs. Unfortunately, both Linux
>>>>> documentation and numactl manual have not described it clear.
>>>>>
>>>>> I sent a mail to ask for help, and Michal Hocko <mhocko@kernel.org> replied
>>>>> that he preferred to print online cpus because it doesn't really make much
>>>>> sense to bind anything on offline nodes.
>>>>>
>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>>> ---
>>>>>  drivers/base/node.c | 12 ++++++++++--
>>>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> Which tree is this intended to go through? I'm happy to take it via arm64,
>>>> but I don't want to tread on anybody's toes in linux-next and it looks like
>>>> there are already queued changes to this file via Andrew's tree.
>>>
>>> I grabbed it.  I suppose there's some small risk of userspace breakage
>>> so I suggest it be a 4.15-rc1 thing?
>>
>> To be honest, I suspect the vast majority (if not all) code that reads this
>> file was developed for x86, so having the same behaviour for arm64 sounds
>> like something we should do ASAP before people try to special case with
>> things like #ifdef __aarch64__.
>>
>> I'd rather have this in 4.14 if possible.
> 
> Agreed!
> 

+1

-- 
Thanks!
BestRegards

Patch

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 3855902..aae2402 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -27,13 +27,21 @@ static struct bus_type node_subsys = {
 
 static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
 {
+	ssize_t n;
+	cpumask_var_t mask;
 	struct node *node_dev = to_node(dev);
-	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
 
 	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
 	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
 
-	return cpumap_print_to_pagebuf(list, buf, mask);
+	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
+		return 0;
+
+	cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
+	n = cpumap_print_to_pagebuf(list, buf, mask);
+	free_cpumask_var(mask);
+
+	return n;
 }
 
 static inline ssize_t node_read_cpumask(struct device *dev,
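
For illustration only (not part of the patch): a userspace analogue of the
core step above, where the node's CPU mask is ANDed with the online mask
before it is printed. This is a sketch under stated assumptions: masks are
plain 64-bit words rather than struct cpumask, and online_cpus_of_node() and
format_cpulist() are hypothetical names, not kernel or numactl APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Analogue of cpumask_and(mask, cpumask_of_node(nid), cpu_online_mask). */
uint64_t online_cpus_of_node(uint64_t node_mask, uint64_t online_mask)
{
	return node_mask & online_mask;
}

/*
 * Format a mask as a range list such as "0-1,4-5". Returns the number of
 * characters written (excluding the terminating NUL), or -1 if buf is too
 * small.
 */
int format_cpulist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (int cpu = 0; cpu < 64; cpu++) {
		int first, n;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;

		/* Extend the run while the next CPU is also set. */
		first = cpu;
		while (cpu + 1 < 64 && (mask & (UINT64_C(1) << (cpu + 1))))
			cpu++;

		if (first == cpu)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", first);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", first, cpu);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

With a node mask of 0xFF (CPUs 0-7) and an online mask of 0x33, the
intersection formats as "0-1,4-5". Offline CPUs 2, 3, 6 and 7 drop out, which
is the behaviour the patch gives the sysfs file on every architecture.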