Message ID | ZUoRY33AAHMc5ThW@shell.armlinux.org.uk |
---|---|
Headers | show |
Series | Initial cleanups for vCPU hotplug | expand |
Hi Russell, On 11/7/23 18:29, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Three of the five ACPI architectures create sysfs entries using > register_cpu() for present CPUs, whereas arm64, riscv and all > GENERIC_CPU_DEVICES do this for possible CPUs. > > Registering a CPU is what causes them to show up in sysfs. > > It makes very little sense to register all possible CPUs. Registering > a CPU is what triggers the udev notifications allowing user-space to > react to newly added CPUs. > > To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change > it to use for_each_present_cpu(). Making the ACPI architectures use > GENERIC_CPU_DEVICES is a pre-requisite step to centralise their > cpu_register() logic, before moving it into the ACPI processor driver. > When ACPI is disabled this work would be done by > cpu_dev_register_generic(). What do you actually mean about when ACPI is disabled this work would be done by cpu_dev_register_generic()? Is the work means register the cpu? I'm not quite understand that, and how about when ACPI is enabled, which function do this work? > > Of the ACPI architectures that register possible CPUs, arm64 and riscv > do not support making possible CPUs present as they use the weak 'always > fails' version of arch_register_cpu(). > > Only two of the eight architectures that use GENERIC_CPU_DEVICES have a > distinction between present and possible CPUs. > > The following architectures use GENERIC_CPU_DEVICES but are not SMP, > so possible == present: > * m68k > * microblaze > * nios2 > > The following architectures use GENERIC_CPU_DEVICES and consider > possible == present: > * csky: setup_smp() > * processor_probe() sets possible for all CPUs and present for all CPUs > except the boot cpu, which will have been done by > init/main.c::start_kernel(). > > um appears to be a subarchitecture of x86. > > The remaining architecture using GENERIC_CPU_DEVICES are: > * openrisc and hexagon: > where smp_init_cpus() makes all CPUs < NR_CPUS possible, > whereas smp_prepare_cpus() only makes CPUs < setup_max_cpus present. > > After this change, openrisc and hexagon systems that use the max_cpus > command line argument would not see the other CPUs present in sysfs. > This should not be a problem as these CPUs can't bre brought online as ^ nit: can't be > _cpu_up() checks cpu_present(). > > After this change, only CPUs which are present appear in sysfs. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Thanks, Shaoqin > --- > drivers/base/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index 9ea22e165acd..34b48f660b6b 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -533,7 +533,7 @@ static void __init cpu_dev_register_generic(void) > #ifdef CONFIG_GENERIC_CPU_DEVICES > int i; > > - for_each_possible_cpu(i) { > + for_each_present_cpu(i) { > if (register_cpu(&per_cpu(cpu_devices, i), i)) > panic("Failed to register CPU device"); > }
On Thu, Nov 09, 2023 at 06:09:32PM +0800, Shaoqin Huang wrote: > Hi Russell, > > On 11/7/23 18:29, Russell King (Oracle) wrote: > > From: James Morse <james.morse@arm.com> > > > > Three of the five ACPI architectures create sysfs entries using > > register_cpu() for present CPUs, whereas arm64, riscv and all > > GENERIC_CPU_DEVICES do this for possible CPUs. > > > > Registering a CPU is what causes them to show up in sysfs. > > > > It makes very little sense to register all possible CPUs. Registering > > a CPU is what triggers the udev notifications allowing user-space to > > react to newly added CPUs. > > > > To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change > > it to use for_each_present_cpu(). Making the ACPI architectures use > > GENERIC_CPU_DEVICES is a pre-requisite step to centralise their > > cpu_register() logic, before moving it into the ACPI processor driver. > > When ACPI is disabled this work would be done by > > cpu_dev_register_generic(). > > What do you actually mean about when ACPI is disabled this work would be Firstly, please note that "you" is not appropriate here. This is James' commit message, not mine. > done by cpu_dev_register_generic()? Is the work means register the cpu? When ACPI is disabled _and_ CONFIG_GENERIC_CPU_DEVICES is enabled, then cpu_dev_register_generic() will call arch_register_cpu() for each present CPU after this commit, rather than for each _possible_ CPU (which is the actual code change here.) > I'm not quite understand that, and how about when ACPI is enabled, which > function do this work? This is what happens later in the series. "drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden" adds a test for CONFIG_GENERIC_CPU_DEVICES, so this will only be used with architectures using GENERIC_CPU_DEVICES. Then in: "ACPI: processor: Register all CPUs from acpi_processor_get_info()" which is not part of this series, this adds a call to arch_register_cpu() in the ACPI code, and disables this path via a test for !acpi_disabled. Essentially, this path gets used to register the present CPUs when firmware (ACPI) isn't going to be registering the present CPUs. I've changed this to: "It makes very little sense to register all possible CPUs. Registering a CPU is what triggers the udev notifications allowing user-space to react to newly added CPUs. "To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change it to use for_each_present_cpu(). "Making the ACPI architectures use GENERIC_CPU_DEVICES is a pre-requisite step to centralise their register_cpu() logic, before moving it into the ACPI processor driver. When we add support for register CPUs from ACPI in a later patch, we will avoid registering CPUs in this path." which I hope makes it clearer. > > After this change, openrisc and hexagon systems that use the max_cpus > > command line argument would not see the other CPUs present in sysfs. > > This should not be a problem as these CPUs can't bre brought online as > ^ nit: can't be Thanks, I'll fix that.
On 11/7/23 18:29, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Add arch_unregister_cpu() to allow the ACPI machinery to call > unregister_cpu(). This is enough for arm64, riscv and loongarch, but > needs to be overridden by x86 and ia64 who need to do more work. > > CC: Jean-Philippe Brucker <jean-philippe@linaro.org> > Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> > --- > Changes since v1: > * Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu > Changes since RFC v2: > * Move earlier in the series > --- > drivers/base/cpu.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index 579064fda97b..58bb86091b34 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu) > { > return register_cpu(&per_cpu(cpu_devices, cpu), cpu); > } > -#endif > + > +#ifdef CONFIG_HOTPLUG_CPU > +void __weak arch_unregister_cpu(int num) > +{ > + unregister_cpu(&per_cpu(cpu_devices, num)); > +} > +#endif /* CONFIG_HOTPLUG_CPU */ > +#endif /* CONFIG_GENERIC_CPU_DEVICES */ > > static void __init cpu_dev_register_generic(void) > {
On 11/9/23 18:29, Russell King (Oracle) wrote: > On Thu, Nov 09, 2023 at 06:09:32PM +0800, Shaoqin Huang wrote: >> Hi Russell, >> >> On 11/7/23 18:29, Russell King (Oracle) wrote: >>> From: James Morse <james.morse@arm.com> >>> >>> Three of the five ACPI architectures create sysfs entries using >>> register_cpu() for present CPUs, whereas arm64, riscv and all >>> GENERIC_CPU_DEVICES do this for possible CPUs. >>> >>> Registering a CPU is what causes them to show up in sysfs. >>> >>> It makes very little sense to register all possible CPUs. Registering >>> a CPU is what triggers the udev notifications allowing user-space to >>> react to newly added CPUs. >>> >>> To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change >>> it to use for_each_present_cpu(). Making the ACPI architectures use >>> GENERIC_CPU_DEVICES is a pre-requisite step to centralise their >>> cpu_register() logic, before moving it into the ACPI processor driver. >>> When ACPI is disabled this work would be done by >>> cpu_dev_register_generic(). >> >> What do you actually mean about when ACPI is disabled this work would be > > Firstly, please note that "you" is not appropriate here. This is James' > commit message, not mine. > Oh, Sorry for that. >> done by cpu_dev_register_generic()? Is the work means register the cpu? > > When ACPI is disabled _and_ CONFIG_GENERIC_CPU_DEVICES is enabled, then > cpu_dev_register_generic() will call arch_register_cpu() for each present > CPU after this commit, rather than for each _possible_ CPU (which is the > actual code change here.) > >> I'm not quite understand that, and how about when ACPI is enabled, which >> function do this work? > > This is what happens later in the series. > > "drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden" > adds a test for CONFIG_GENERIC_CPU_DEVICES, so this will only be used > with architectures using GENERIC_CPU_DEVICES. Then in: > > "ACPI: processor: Register all CPUs from acpi_processor_get_info()" > which is not part of this series, this adds a call to arch_register_cpu() > in the ACPI code, and disables this path via a test for !acpi_disabled. > > Essentially, this path gets used to register the present CPUs when > firmware (ACPI) isn't going to be registering the present CPUs. > > I've changed this to: > > "It makes very little sense to register all possible CPUs. Registering > a CPU is what triggers the udev notifications allowing user-space to > react to newly added CPUs. > > "To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change > it to use for_each_present_cpu(). > > "Making the ACPI architectures use GENERIC_CPU_DEVICES is a pre-requisite > step to centralise their register_cpu() logic, before moving it into the > ACPI processor driver. When we add support for register CPUs from ACPI > in a later patch, we will avoid registering CPUs in this path." > > which I hope makes it clearer. > Thanks for your great explanation. Change commit message to this makes me understand well. Thanks, Shaoqin >>> After this change, openrisc and hexagon systems that use the max_cpus >>> command line argument would not see the other CPUs present in sysfs. >>> This should not be a problem as these CPUs can't bre brought online as >> ^ nit: can't be > > Thanks, I'll fix that. >
On 11/7/23 18:30, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > loongarch, mips, parisc, riscv and sh all print a warning if > register_cpu() returns an error. Architectures that use > GENERIC_CPU_DEVICES call panic() instead. > > Errors in this path indicate something is wrong with the firmware > description of the platform, but the kernel is able to keep running. > > Downgrade this to a warning to make it easier to debug this issue. > > This will allow architectures that switching over to GENERIC_CPU_DEVICES > to drop their warning, but keep the existing behaviour. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> > --- > drivers/base/cpu.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index 221ffbeb1c9b..82b6a76125f5 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -551,14 +551,15 @@ void __weak arch_unregister_cpu(int num) > > static void __init cpu_dev_register_generic(void) > { > - int i; > + int i, ret; > > if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES)) > return; > > for_each_present_cpu(i) { > - if (arch_register_cpu(i)) > - panic("Failed to register CPU device"); > + ret = arch_register_cpu(i); > + if (ret) > + pr_warn("register_cpu %d failed (%d)\n", i, ret); > } > } >
On 11/7/23 20:29, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > register_cpu_capacity_sysctl() adds a property to sysfs that describes > the CPUs capacity. This is done from a subsys_initcall() that assumes > all possible CPUs are registered. > > With CPU hotplug, possible CPUs aren't registered until they become > present, (or for arm64 enabled). This leads to messages during boot: > | register_cpu_capacity_sysctl: too early to get CPU1 device! > and once these CPUs are added to the system, the file is missing. > > Move this to a cpuhp callback, so that the file is created once > CPUs are brought online. This covers CPUs that are added late by > mechanisms like hotplug. > One observable difference is the file is now missing for offline CPUs. > > Signed-off-by: James Morse <james.morse@arm.com> > --- > If the offline CPUs thing is a problem for the tools that consume > this value, we'd need to move cpu_capacity to be part of cpu.c's > common_cpu_attr_groups. > --- > drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------ > 1 file changed, 26 insertions(+), 12 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:29, Russell King (Oracle) wrote: > arch_register_cpu() and arch_unregister_cpu() are not used by anything > that can be a module - they are used by drivers/base/cpu.c and > drivers/acpi/acpi_processor.c, neither of which can be a module. > > Remove the exports. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > arch/x86/kernel/topology.c | 2 -- > 1 file changed, 2 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:29, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Three of the five ACPI architectures create sysfs entries using > register_cpu() for present CPUs, whereas arm64, riscv and all > GENERIC_CPU_DEVICES do this for possible CPUs. > > Registering a CPU is what causes them to show up in sysfs. > > It makes very little sense to register all possible CPUs. Registering > a CPU is what triggers the udev notifications allowing user-space to > react to newly added CPUs. > > To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change > it to use for_each_present_cpu(). Making the ACPI architectures use > GENERIC_CPU_DEVICES is a pre-requisite step to centralise their > cpu_register() logic, before moving it into the ACPI processor driver. > When ACPI is disabled this work would be done by > cpu_dev_register_generic(). > > Of the ACPI architectures that register possible CPUs, arm64 and riscv > do not support making possible CPUs present as they use the weak 'always > fails' version of arch_register_cpu(). > > Only two of the eight architectures that use GENERIC_CPU_DEVICES have a > distinction between present and possible CPUs. > > The following architectures use GENERIC_CPU_DEVICES but are not SMP, > so possible == present: > * m68k > * microblaze > * nios2 > > The following architectures use GENERIC_CPU_DEVICES and consider > possible == present: > * csky: setup_smp() > * processor_probe() sets possible for all CPUs and present for all CPUs > except the boot cpu, which will have been done by > init/main.c::start_kernel(). > > um appears to be a subarchitecture of x86. > > The remaining architecture using GENERIC_CPU_DEVICES are: > * openrisc and hexagon: > where smp_init_cpus() makes all CPUs < NR_CPUS possible, > whereas smp_prepare_cpus() only makes CPUs < setup_max_cpus present. > > After this change, openrisc and hexagon systems that use the max_cpus > command line argument would not see the other CPUs present in sysfs. > This should not be a problem as these CPUs can't bre brought online as > _cpu_up() checks cpu_present(). > > After this change, only CPUs which are present appear in sysfs. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/base/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:29, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Add arch_unregister_cpu() to allow the ACPI machinery to call > unregister_cpu(). This is enough for arm64, riscv and loongarch, but > needs to be overridden by x86 and ia64 who need to do more work. > > CC: Jean-Philippe Brucker <jean-philippe@linaro.org> > Signed-off-by: James Morse <james.morse@arm.com> > --- > Changes since v1: > * Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu > Changes since RFC v2: > * Move earlier in the series > --- > drivers/base/cpu.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:30, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > NUMA systems require the node descriptions to be ready before CPUs are > registered. This is so that the node symlinks can be created in sysfs. > > Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs > are registered by arch code, instead of cpu_dev_init(). > > Move cpu_dev_init() after node_dev_init() so that NUMA architectures > can use GENERIC_CPU_DEVICES. > > Signed-off-by: James Morse <james.morse@arm.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > Note: Jonathan's comment still needs addressing - see > https://lore.kernel.org/r/20230914121612.00006ac7@Huawei.com > --- > drivers/base/init.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > With Jonathan's comments addressed: Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:30, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > loongarch, mips, parisc, riscv and sh all print a warning if > register_cpu() returns an error. Architectures that use > GENERIC_CPU_DEVICES call panic() instead. > > Errors in this path indicate something is wrong with the firmware > description of the platform, but the kernel is able to keep running. > > Downgrade this to a warning to make it easier to debug this issue. > > This will allow architectures that switching over to GENERIC_CPU_DEVICES > to drop their warning, but keep the existing behaviour. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/base/cpu.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:30, Russell King (Oracle) wrote: > Since "drivers: base: Move cpu_dev_init() after node_dev_init()", we > can remove some redundant code. > > node_dev_init() will walk through the nodes calling register_one_node() > on each. This will trickle down to __register_one_node() which walks > all present CPUs, calling register_cpu_under_node() on each. > > register_cpu_under_node() will call get_cpu_device(cpu) for each, which > will return NULL until the CPU is registered using register_cpu(). This > now happens _after_ node_dev_init(). > > Therefore, calling register_cpu_under_node() from __register_one_node() > becomes a no-op, and can be removed. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/base/node.c | 7 ------- > 1 file changed, 7 deletions(-) > __register_one_node() can be called in memory hot add path either. In that path, a new NUMA node can be presented and becomes online. Does this become a problem after the logic of associating CPU with newly added NUMA node? Thanks, Gavin > diff --git a/drivers/base/node.c b/drivers/base/node.c > index 493d533f8375..4d5ac7cf8757 100644 > --- a/drivers/base/node.c > +++ b/drivers/base/node.c > @@ -867,7 +867,6 @@ void register_memory_blocks_under_node(int nid, unsigned long start_pfn, > int __register_one_node(int nid) > { > int error; > - int cpu; > > node_devices[nid] = kzalloc(sizeof(struct node), GFP_KERNEL); > if (!node_devices[nid]) > @@ -875,12 +874,6 @@ int __register_one_node(int nid) > > error = register_node(node_devices[nid], nid); > > - /* link cpu under this node */ > - for_each_present_cpu(cpu) { > - if (cpu_to_node(cpu) == nid) > - register_cpu_under_node(cpu, nid); > - } > - > INIT_LIST_HEAD(&node_devices[nid]->access_list); > node_init_caches(nid); >
On 11/7/23 20:30, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be > overridden by the arch code, switch over to this to allow common code > to choose when the register_cpu() call is made. > > x86's struct cpus come from struct x86_cpu, which has no other members > or users. Remove this and use the version defined by common code. > > This is an intermediate step to the logic being moved to drivers/acpi, > where GENERIC_CPU_DEVICES will do the work when booting with acpi=off. > > This patch also has the effect of moving the registration of CPUs from > subsys to driver core initialisation, prior to any initcalls running. > > Signed-off-by: James Morse <james.morse@arm.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > ---- > Changes since RFC: > * Fixed the second copy of arch_register_cpu() used for non-hotplug > Changes since RFC v2: > * Remove duplicate of the weak generic arch_register_cpu(), spotted > by Jonathan Cameron. Add note about initialisation order change. > Changes since RFC v3: > * Adapt to removal of EXPORT_SYMBOL()s > --- > arch/x86/Kconfig | 1 + > arch/x86/include/asm/cpu.h | 4 ---- > arch/x86/kernel/topology.c | 27 ++++----------------------- > 3 files changed, 5 insertions(+), 27 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:30, Russell King (Oracle) wrote: > Convert x86 to use the arch_cpu_is_hotpluggable() helper rather than > arch_register_cpu(). > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > arch/x86/kernel/topology.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On 11/7/23 20:31, Russell King (Oracle) wrote: > From: James Morse <james.morse@arm.com> > > Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be > overridden by the arch code, switch over to this to allow common code > to choose when the register_cpu() call is made. > > This allows topology_init() to be removed. > > This is an intermediate step to the logic being moved to drivers/acpi, > where GENERIC_CPU_DEVICES will do the work when booting with acpi=off. > > This patch also has the effect of moving the registration of CPUs from > subsys to driver core initialisation, prior to any initcalls running. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > Changes since RFC v2: > * Add note about initialisation order change. > --- > arch/riscv/Kconfig | 1 + > arch/riscv/kernel/setup.c | 19 ++++--------------- > 2 files changed, 5 insertions(+), 15 deletions(-) > Reviewed-by: Gavin Shan <gshan@redhat.com>
On Mon, Nov 13, 2023 at 10:58:46AM +1000, Gavin Shan wrote: > > > On 11/7/23 20:30, Russell King (Oracle) wrote: > > From: James Morse <james.morse@arm.com> > > > > NUMA systems require the node descriptions to be ready before CPUs are > > registered. This is so that the node symlinks can be created in sysfs. > > > > Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs > > are registered by arch code, instead of cpu_dev_init(). > > > > Move cpu_dev_init() after node_dev_init() so that NUMA architectures > > can use GENERIC_CPU_DEVICES. > > > > Signed-off-by: James Morse <james.morse@arm.com> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > Note: Jonathan's comment still needs addressing - see > > https://lore.kernel.org/r/20230914121612.00006ac7@Huawei.com > > --- > > drivers/base/init.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > With Jonathan's comments addressed: That needs James' input, which is why I made the note on the patch.
On Mon, Nov 13, 2023 at 02:04:32PM +1000, Gavin Shan wrote: > On 11/7/23 20:30, Russell King (Oracle) wrote: > > Since "drivers: base: Move cpu_dev_init() after node_dev_init()", we > > can remove some redundant code. > > > > node_dev_init() will walk through the nodes calling register_one_node() > > on each. This will trickle down to __register_one_node() which walks > > all present CPUs, calling register_cpu_under_node() on each. > > > > register_cpu_under_node() will call get_cpu_device(cpu) for each, which > > will return NULL until the CPU is registered using register_cpu(). This > > now happens _after_ node_dev_init(). > > > > Therefore, calling register_cpu_under_node() from __register_one_node() > > becomes a no-op, and can be removed. > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > drivers/base/node.c | 7 ------- > > 1 file changed, 7 deletions(-) > > > > __register_one_node() can be called in memory hot add path either. In that path, > a new NUMA node can be presented and becomes online. Does this become a problem > after the logic of associating CPU with newly added NUMA node? I guess this is where ordering matters. As mentioned in the commit message, register_cpu_under_node() does this: if (!node_online(nid)) return 0; obj = get_cpu_device(cpu); if (!obj) return 0; get_cpu_device() will return NULL if the CPU is not possible or is out of range, or register_cpu() has not yet been called for this CPU, and register_cpu() will call register_cpu_under_node(). I guess it is possible for a CPU it be present, but the node its associated with would not be online, which means we end up with register_cpu_under_node() returning on !node_online(nid) but we've populated the CPU devices (thus get_cpu_device(cpu) would return non-NULL). Then when the numa node comes online, we do still need to call this path, so this change is incorrect. It came about trying to address Jonathan's comment for this patch: https://lore.kernel.org/r/20230913163823.7880-7-james.morse@arm.com I think my response to Jonathan is still correct - but didn't need a code change. I'm dropping this patch.
On Tue, Nov 07, 2023 at 10:29:59AM +0000, Russell King wrote: > From: James Morse <james.morse@arm.com> > > Add arch_unregister_cpu() to allow the ACPI machinery to call > unregister_cpu(). This is enough for arm64, riscv and loongarch, but > needs to be overridden by x86 and ia64 who need to do more work. > > CC: Jean-Philippe Brucker <jean-philippe@linaro.org> > Signed-off-by: James Morse <james.morse@arm.com> > --- > Changes since v1: > * Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu > Changes since RFC v2: > * Move earlier in the series > --- > drivers/base/cpu.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c > index 579064fda97b..58bb86091b34 100644 > --- a/drivers/base/cpu.c > +++ b/drivers/base/cpu.c > @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu) > { > return register_cpu(&per_cpu(cpu_devices, cpu), cpu); > } > -#endif > + > +#ifdef CONFIG_HOTPLUG_CPU > +void __weak arch_unregister_cpu(int num) > +{ > + unregister_cpu(&per_cpu(cpu_devices, num)); > +} > +#endif /* CONFIG_HOTPLUG_CPU */ I have previously asked the question whether we should provide a stub weak function for the !HOTPLUG_CPU case for this, which would alleviate the concerns around if (IS_ENABLED()) in some of the later hotplug vCPU patches... which failed to get _any_ responses. So, I'm now going to deem the comment I received about if (IS_ENABLED()) potentially causing issues to be unimportant, and thus there's no need for a stub weak function. If we start getting compile errors, then we can address the issue at that point. So far, however, the kernel build bot has not identified that this as an issue... and it's been chewing on this entire patch set for well over a month now.
On Tue, 21 Nov 2023 13:27:08 +0000 "Russell King (Oracle)" <linux@armlinux.org.uk> wrote: > On Mon, Nov 13, 2023 at 08:00:19AM +0000, Russell King (Oracle) wrote: > > On Mon, Nov 13, 2023 at 10:58:46AM +1000, Gavin Shan wrote: > > > > > > > > > On 11/7/23 20:30, Russell King (Oracle) wrote: > > > > From: James Morse <james.morse@arm.com> > > > > > > > > NUMA systems require the node descriptions to be ready before CPUs are > > > > registered. This is so that the node symlinks can be created in sysfs. > > > > > > > > Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs > > > > are registered by arch code, instead of cpu_dev_init(). > > > > > > > > Move cpu_dev_init() after node_dev_init() so that NUMA architectures > > > > can use GENERIC_CPU_DEVICES. > > > > > > > > Signed-off-by: James Morse <james.morse@arm.com> > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > > > --- > > > > Note: Jonathan's comment still needs addressing - see > > > > https://lore.kernel.org/r/20230914121612.00006ac7@Huawei.com > > > > --- > > > > drivers/base/init.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > With Jonathan's comments addressed: > > > > That needs James' input, which is why I made the note on the patch. > > I'm going to be posting the series without RFC soon, and it will be > with Jonathan's comment unaddressed - because as I've said several > times it needs James' input and we have sadly not yet received that. > > Short of waiting until James can respond, I don't think there are > any other alternatives. In the interests of expediency I'm fine with that. (To be honest I'd forgotten I even made that comment ;) Jonathan > > I do hope we can get this queued up for v6.8 though. >
On Tue, 07 Nov 2023 10:29:23 +0000 Russell King <rmk+kernel@armlinux.org.uk> wrote: > From: James Morse <james.morse@arm.com> > > register_cpu_capacity_sysctl() adds a property to sysfs that describes > the CPUs capacity. This is done from a subsys_initcall() that assumes > all possible CPUs are registered. > > With CPU hotplug, possible CPUs aren't registered until they become > present, (or for arm64 enabled). This leads to messages during boot: > | register_cpu_capacity_sysctl: too early to get CPU1 device! > and once these CPUs are added to the system, the file is missing. > > Move this to a cpuhp callback, so that the file is created once > CPUs are brought online. This covers CPUs that are added late by > mechanisms like hotplug. > One observable difference is the file is now missing for offline CPUs. > > Signed-off-by: James Morse <james.morse@arm.com> > --- > If the offline CPUs thing is a problem for the tools that consume > this value, we'd need to move cpu_capacity to be part of cpu.c's > common_cpu_attr_groups. I'm not keen on squirting sysfs files in from code so might be nice to do that anyway and use is_visible() / sysfs_update_group() but that would be a job for another day if at all. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
On Tue, 07 Nov 2023 10:29:33 +0000 "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote: > arch_register_cpu() and arch_unregister_cpu() are not used by anything > that can be a module - they are used by drivers/base/cpu.c and > drivers/acpi/acpi_processor.c, neither of which can be a module. > > Remove the exports. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Would be 'novel' to have CPUs registered by a module. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > + > > +#ifdef CONFIG_HOTPLUG_CPU > > +void __weak arch_unregister_cpu(int num) > > +{ > > + unregister_cpu(&per_cpu(cpu_devices, num)); > > +} > > +#endif /* CONFIG_HOTPLUG_CPU */ > > I have previously asked the question whether we should provide a > stub weak function for the !HOTPLUG_CPU case for this, which would > alleviate the concerns around if (IS_ENABLED()) in some of the later > hotplug vCPU patches... which failed to get _any_ responses. > > So, I'm now going to deem the comment I received about if (IS_ENABLED()) > potentially causing issues to be unimportant, and thus there's no > need for a stub weak function. If we start getting compile errors, > then we can address the issue at that point. So far, however, the > kernel build bot has not identified that this as an issue... and it's > been chewing on this entire patch set for well over a month now. > Make sense to fix this only if it's a real problem. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
On Tue, 28 Nov 2023 13:55:36 +0000 Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote: > On Tue, 21 Nov 2023 13:27:08 +0000 > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote: > > > On Mon, Nov 13, 2023 at 08:00:19AM +0000, Russell King (Oracle) wrote: > > > On Mon, Nov 13, 2023 at 10:58:46AM +1000, Gavin Shan wrote: > > > > > > > > > > > > On 11/7/23 20:30, Russell King (Oracle) wrote: > > > > > From: James Morse <james.morse@arm.com> > > > > > > > > > > NUMA systems require the node descriptions to be ready before CPUs are > > > > > registered. This is so that the node symlinks can be created in sysfs. > > > > > > > > > > Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs > > > > > are registered by arch code, instead of cpu_dev_init(). > > > > > > > > > > Move cpu_dev_init() after node_dev_init() so that NUMA architectures > > > > > can use GENERIC_CPU_DEVICES. > > > > > > > > > > Signed-off-by: James Morse <james.morse@arm.com> > > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > > > > --- > > > > > Note: Jonathan's comment still needs addressing - see > > > > > https://lore.kernel.org/r/20230914121612.00006ac7@Huawei.com > > > > > --- > > > > > drivers/base/init.c | 2 +- > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > With Jonathan's comments addressed: > > > > > > That needs James' input, which is why I made the note on the patch. > > > > I'm going to be posting the series without RFC soon, and it will be > > with Jonathan's comment unaddressed - because as I've said several > > times it needs James' input and we have sadly not yet received that. > > > > Short of waiting until James can respond, I don't think there are > > any other alternatives. > > In the interests of expediency I'm fine with that. (To be honest I'd > forgotten I even made that comment ;) > Given what I was looking for was a 'nice to have' extra bit of info in the patch description and I'm fine with the actual change even without that: Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Jonathan > > > > > I do hope we can get this queued up for v6.8 though. *fingers crossed* ! > > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Tue, 07 Nov 2023 10:30:20 +0000 Russell King <rmk+kernel@armlinux.org.uk> wrote: > From: James Morse <james.morse@arm.com> > > loongarch, mips, parisc, riscv and sh all print a warning if > register_cpu() returns an error. Architectures that use > GENERIC_CPU_DEVICES call panic() instead. > > Errors in this path indicate something is wrong with the firmware > description of the platform, but the kernel is able to keep running. > > Downgrade this to a warning to make it easier to debug this issue. > > This will allow architectures that switching over to GENERIC_CPU_DEVICES > to drop their warning, but keep the existing behaviour. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> I guess there may be paths later that were never exposed because of this panic, but any such should be fixed rather than relying on this big hammer. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
On Tue, 07 Nov 2023 10:30:35 +0000 Russell King <rmk+kernel@armlinux.org.uk> wrote: > From: James Morse <james.morse@arm.com> > > Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be > overridden by the arch code, switch over to this to allow common code > to choose when the register_cpu() call is made. > > x86's struct cpus come from struct x86_cpu, which has no other members > or users. Remove this and use the version defined by common code. > > This is an intermediate step to the logic being moved to drivers/acpi, > where GENERIC_CPU_DEVICES will do the work when booting with acpi=off. > > This patch also has the effect of moving the registration of CPUs from > subsys to driver core initialisation, prior to any initcalls running. > > Signed-off-by: James Morse <james.morse@arm.com> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> In perfect world, I'd have liked the structure squash done as a precursor as then the patch would have been a little less noisy. However, sometimes it's just not worth the effort. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
On Tue, 07 Nov 2023 10:30:45 +0000 "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote: > Convert x86 to use the arch_cpu_is_hotpluggable() helper rather than > arch_register_cpu(). > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> As with earlier set of related changes, could squash this down to avoid churn and use Co-developed or similar. Up to you though. Maybe a forwards reference to this being a later change in the patch 15 description might be good though! > --- > arch/x86/kernel/topology.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c > index 211863cb5b81..d42c28b8bfd8 100644 > --- a/arch/x86/kernel/topology.c > +++ b/arch/x86/kernel/topology.c > @@ -36,11 +36,8 @@ > #include <asm/cpu.h> > > #ifdef CONFIG_HOTPLUG_CPU > -int arch_register_cpu(int cpu) > +bool arch_cpu_is_hotpluggable(int cpu) > { > - struct cpu *c = per_cpu_ptr(&cpu_devices, cpu); > - > - c->hotpluggable = cpu > 0; > - return register_cpu(c, cpu); > + return cpu > 0; > } > #endif /* CONFIG_HOTPLUG_CPU */
On Tue, Nov 28, 2023 at 02:37:22PM +0000, Jonathan Cameron wrote: > On Tue, 07 Nov 2023 10:29:23 +0000 > Russell King <rmk+kernel@armlinux.org.uk> wrote: > > > From: James Morse <james.morse@arm.com> > > > > register_cpu_capacity_sysctl() adds a property to sysfs that describes > > the CPUs capacity. This is done from a subsys_initcall() that assumes > > all possible CPUs are registered. > > > > With CPU hotplug, possible CPUs aren't registered until they become > > present, (or for arm64 enabled). This leads to messages during boot: > > | register_cpu_capacity_sysctl: too early to get CPU1 device! > > and once these CPUs are added to the system, the file is missing. > > > > Move this to a cpuhp callback, so that the file is created once > > CPUs are brought online. This covers CPUs that are added late by > > mechanisms like hotplug. > > One observable difference is the file is now missing for offline CPUs. > > > > Signed-off-by: James Morse <james.morse@arm.com> > > --- > > If the offline CPUs thing is a problem for the tools that consume > > this value, we'd need to move cpu_capacity to be part of cpu.c's > > common_cpu_attr_groups. > > I'm not keen on squirting sysfs files in from code so > might be nice to do that anyway and use is_visible() / sysfs_update_group() > but that would be a job for another day if at all. I'm doing my best, but it's really not helped by the dysfunctional nature of some parts of the kernel community. I have now decided that this is not possible to implement. So while it's a nice idea, I don't think we'll ever see it. As I mentioned on the 14th November, complete with a patch (and got no response from anyone): > Looking into doing this, the easy bit is adding the attribute group > with an appropriate .is_visible dependent on cpu_present(), but we > need to be able to call sysfs_update_groups() when the state of the > .is_visible() changes. > > Given the comment in sysfs_update_groups() about "if an error occurs", > rather than making this part of common_cpu_attr_groups, would it be > better that it's part of its own set of groups, thus limiting the > damage from a possible error? I suspect, however, that any error at > that point means that the system is rather fatally wounded. > > This is what I have so far to implement your idea, less the necessary > sysfs_update_groups() call when we need to change the visibility of > the attributes.
On Mon, Nov 13, 2023 at 10:58:46AM +1000, Gavin Shan wrote: > > > On 11/7/23 20:30, Russell King (Oracle) wrote: > > From: James Morse <james.morse@arm.com> > > > > NUMA systems require the node descriptions to be ready before CPUs are > > registered. This is so that the node symlinks can be created in sysfs. > > > > Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs > > are registered by arch code, instead of cpu_dev_init(). > > > > Move cpu_dev_init() after node_dev_init() so that NUMA architectures > > can use GENERIC_CPU_DEVICES. > > > > Signed-off-by: James Morse <james.morse@arm.com> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > Note: Jonathan's comment still needs addressing - see > > https://lore.kernel.org/r/20230914121612.00006ac7@Huawei.com > > --- > > drivers/base/init.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > With Jonathan's comments addressed: > > Reviewed-by: Gavin Shan <gshan@redhat.com> Can I assume, given Jonathan's reply later in this sub-thread, that you are happy for me to add your r-b without the referred comment having been addressed - Jonathan says tit was a "nice to have" and he's fine without the requested change. See https://lore.kernel.org/r/20231128150017.000069eb@Huawei.com Thanks.