[v10,0/7] Make ghes_edac a proper module

Message ID 20221018082214.569504-1-justin.he@arm.com

Jia He Oct. 18, 2022, 8:22 a.m. UTC
Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in
apci_init()") introduced a bug: ghes_edac_register() is now invoked
before edac_init(). Because the "edac" bus has not even been registered
at that point, the memory controller shows up in sysfs as /devices/mc0
instead of /sys/devices/system/edac/mc/mc0 on an Ampere eMag server.

The solution is to make ghes_edac a proper module.
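
To make the ordering problem concrete, here is a minimal user-space
sketch. It is only a model under stated assumptions: model_edac_init()
and model_register_mc() are hypothetical names invented for
illustration, not kernel functions, and the fallback path simply mirrors
the two sysfs locations quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model state: has the "edac" bus been registered yet? */
static bool edac_bus_registered;

/* Stands in for edac_init(), which registers the "edac" bus. */
static void model_edac_init(void)
{
	edac_bus_registered = true;
}

/*
 * Stands in for registering memory controller 'idx'. If the bus is not
 * there yet, the device has no parent and ends up at the top level.
 */
static const char *model_register_mc(char *buf, size_t len, int idx)
{
	assert(buf != NULL && len > 0);

	if (edac_bus_registered)
		snprintf(buf, len, "/sys/devices/system/edac/mc/mc%d", idx);
	else
		snprintf(buf, len, "/devices/mc%d", idx);

	return buf;
}
```

Calling model_register_mc() before model_edac_init() yields the
misplaced node, and calling it afterwards yields the expected one. A
loadable module is initialized only after the EDAC core it depends on,
which is why turning ghes_edac into a proper module removes the race.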

Changelog:
v10:
 - add the RCU_INITIALIZER and use the unrcu_pointer wrap for victim
v9: https://lore.kernel.org/lkml/20221017130140.420986-1-justin.he@arm.com/
 - drop the unrcu_pointer patch 06 of v8
 - add Ard's xchg_release patch to use a better memory barrier
v8: https://lore.kernel.org/lkml/20221010023559.69655-1-justin.he@arm.com/
 - merge v7 two force_enable and ghes_get_devices() patches into one
 - make force_enable static
v7: https://lore.kernel.org/lkml/20220929023726.73727-1-justin.he@arm.com/
 - remove the ghes_edac_preferred and ghes_present (suggested by Borislav)
 - adjust the patch splitting, no major functional changes
 - remove the r-b tag in those changed patches
v6: https://www.spinics.net/lists/kernel/msg4511453.html
 - no code changes from v5 patches
 - add the Reviewed-by and Acked-by tags from Toshi
 - describe the removal of ghes_edac_force_enable checking in Patch 05
v5: https://www.spinics.net/lists/kernel/msg4502787.html
 - add the Reviewed-by tag from Toshi for patch 04 and 06
 - refine the commit msg
 - remove the unconditional set of ghes_edac_force_enable on Arm
v4: https://lore.kernel.org/lkml/20220831074027.13849-6-justin.he@arm.com/
 - move the kernel boot option to ghes module parameter
 - collapse the ghes_present and ghes_edac_preferred into one patch
v3: https://lore.kernel.org/lkml/20220822154048.188253-1-justin.he@arm.com/
 - refine the commit logs
 - introduce ghes preferred and present flag (by Toshi)
 - move force_load to setup parameter
 - add the ghes_edac_preferred() check for x86/Arm edac drivers
v2: https://lore.kernel.org/lkml/20220817143458.335938-1-justin.he@arm.com/
 - add acked-by tag of Patch 1 from Ard
 - split the notifier patch
 - add 2 patches to get regular drivers selected when ghes edac is not loaded
 - fix an errno in igen6 driver
 - add a patch to fix the sparse warning of ghes
 - refine the commit logs
v1: https://lore.kernel.org/lkml/20220811091713.10427-1-justin.he@arm.com/

Ard Biesheuvel (1):
  apei/ghes: Use xchg_release() for updating new cache slot instead of
    cmpxchg()

Jia He (6):
  efi/cper: export several helpers for ghes_edac to use
  EDAC/ghes: Add a notifier for reporting memory errors
  EDAC/ghes: Prepare to make ghes_edac a proper module
  EDAC/ghes: Make ghes_edac a proper module to remove the dependency on
    ghes
  EDAC: Add the ghes_get_devices() check for chipset-specific edac
    drivers
  EDAC/igen6: Return consistent errno when another edac driver is
    enabled

 drivers/acpi/apei/ghes.c       | 111 +++++++++++++++++++++++++--------
 drivers/edac/Kconfig           |   4 +-
 drivers/edac/amd64_edac.c      |   3 +
 drivers/edac/armada_xp_edac.c  |   3 +
 drivers/edac/edac_module.h     |   1 +
 drivers/edac/ghes_edac.c       |  90 +++++++++++++++-----------
 drivers/edac/i10nm_base.c      |   3 +
 drivers/edac/igen6_edac.c      |   5 +-
 drivers/edac/layerscape_edac.c |   3 +
 drivers/edac/pnd2_edac.c       |   3 +
 drivers/edac/sb_edac.c         |   3 +
 drivers/edac/skx_base.c        |   3 +
 drivers/edac/thunderx_edac.c   |   3 +
 drivers/edac/xgene_edac.c      |   3 +
 drivers/firmware/efi/cper.c    |   3 +
 include/acpi/ghes.h            |  34 +++-------
 16 files changed, 187 insertions(+), 88 deletions(-)
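
The ghes_get_devices() check added to the chipset-specific drivers can
be pictured with the user-space sketch below. It is an illustration
only: the model_* names and struct model_dev_list are hypothetical, and
returning -EBUSY is an assumption about which "consistent errno" the
drivers use, not something stated in this cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the set of devices owned by ghes_edac. */
struct model_dev_list {
	int count;
};

/* NULL means ghes_edac has not claimed the memory controllers. */
static struct model_dev_list *ghes_devices;

/* Models ghes_edac taking ownership of the platform's devices. */
static void model_ghes_claim(struct model_dev_list *list)
{
	assert(list != NULL);
	ghes_devices = list;
}

/* Models the ghes_get_devices() query named in the series. */
static struct model_dev_list *model_ghes_get_devices(void)
{
	return ghes_devices;
}

/* Models the early check in a chipset-specific EDAC driver's init. */
static int model_chipset_edac_init(void)
{
	if (model_ghes_get_devices())
		return -EBUSY;	/* assumed errno: ghes_edac already owns it */

	return 0;		/* free to register the chipset driver */
}
```

In this model a chipset driver loads normally until model_ghes_claim()
has been called, after which its init backs off with the same error in
every driver.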

Borislav Petkov Oct. 25, 2022, 2:08 p.m. UTC | #1
On Tue, Oct 18, 2022 at 08:22:07AM +0000, Jia He wrote:
> Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in
> apci_init()") introduced a bug that ghes_edac_register() would be invoked
> before edac_init(). Because at that time, the bus "edac" hasn't been even
> registered, this created sysfs /devices/mc0 instead of
> /sys/devices/system/edac/mc/mc0 on an Ampere eMag server.
> 
> The solution is to make ghes_edac a proper module.
> 
> Changelog:
> v10:

All queued, thanks for the effort.

It'll appear in Linux next soon.

Jia He Oct. 26, 2022, 1:06 a.m. UTC | #2
> -----Original Message-----
> From: Borislav Petkov <bp@alien8.de>
> Sent: Tuesday, October 25, 2022 10:08 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Ard Biesheuvel <ardb@kernel.org>; Len Brown <lenb@kernel.org>; Tony
> Luck <tony.luck@intel.com>; Mauro Carvalho Chehab
> <mchehab@kernel.org>; Robert Richter <rric@kernel.org>; Robert Moore
> <robert.moore@intel.com>; Qiuxu Zhuo <qiuxu.zhuo@intel.com>; Yazen
> Ghannam <yazen.ghannam@amd.com>; Jan Luebbe <jlu@pengutronix.de>;
> Khuong Dinh <khuong@os.amperecomputing.com>; Kani Toshi
> <toshi.kani@hpe.com>; James Morse <James.Morse@arm.com>;
> linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-edac@vger.kernel.org; devel@acpica.org; Rafael J . Wysocki
> <rafael@kernel.org>; Shuai Xue <xueshuai@linux.alibaba.com>; Jarkko
> Sakkinen <jarkko@kernel.org>; linux-efi@vger.kernel.org; nd <nd@arm.com>;
> Peter Zijlstra <peterz@infradead.org>
> Subject: Re: [PATCH v10 0/7] Make ghes_edac a proper module
> 
> On Tue, Oct 18, 2022 at 08:22:07AM +0000, Jia He wrote:
> > Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in
> > apci_init()") introduced a bug that ghes_edac_register() would be
> > invoked before edac_init(). Because at that time, the bus "edac"
> > hasn't been even registered, this created sysfs /devices/mc0 instead
> > of
> > /sys/devices/system/edac/mc/mc0 on an Ampere eMag server.
> >
> > The solution is to make ghes_edac a proper module.
> >
> > Changelog:
> > v10:
> 
> All queued, thanks for the effort.
> 
> It'll appear in Linux next soon.
> 
Thanks for your help and patience 😊


--
Cheers,
Justin (Jia He)