Message ID | 20250115123149.3324733-1-tanxiaofei@huawei.com |
---|---|
State | New |
Series | [v2] acpi: Fix HED module initialization order when it is built-in |
On Wed, Jan 15, 2025 at 1:38 PM Xiaofei Tan <tanxiaofei@huawei.com> wrote:
>
> When the module HED is built-in, the init order is determined by
> Makefile order. That order violates expectations. Because the module
> HED init is behind evged. RAS records can't be handled in the
> special time window that evged has initialized while HED not.
> If the number of such RAS records is more than the APEI HEST error
> source number, the HEST resources could be occupied all, and then
> could affect subsequent RAS error reporting.
>
> If build HED as a module, the problem remains. To solve this problem
> completely, change the ACPI_HED from tristate to bool.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
> ---
>  drivers/acpi/Kconfig  | 2 +-
>  drivers/acpi/Makefile | 8 +++++++-
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index d81b55f5068c..7f10aa38269d 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -452,7 +452,7 @@ config ACPI_SBS
>  	  the modules will be called sbs and sbshc.
>
>  config ACPI_HED
> -	tristate "Hardware Error Device"
> +	bool "Hardware Error Device"
>  	help
>  	  This driver supports the Hardware Error Device (PNP0C33),
>  	  which is used to report some hardware errors notified via
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index 40208a0f5dfb..b50d1baeb71f 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -15,6 +15,13 @@ endif
>
>  obj-$(CONFIG_ACPI) += tables.o
>
> +#
> +# The hed.o needs to be in front of evged.o to avoid the problem that
> +# RAS errors cannot be handled in the special time window of startup
> +# phase that evged has initialized while hed not.
> +#
> +obj-$(CONFIG_ACPI_HED) += hed.o
> +

I'm not sure why you are insisting on this Makefile ordering change.

It would be much more robust to run the hed driver init at a different
initcall level than evged.

If there is a problem with this approach, it needs to be mentioned in
the changelog or in the comment above.

>  #
>  # ACPI Core Subsystem (Interpreter)
>  #
> @@ -95,7 +102,6 @@ obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
>  obj-$(CONFIG_ACPI_BATTERY) += battery.o
>  obj-$(CONFIG_ACPI_SBS) += sbshc.o
>  obj-$(CONFIG_ACPI_SBS) += sbs.o
> -obj-$(CONFIG_ACPI_HED) += hed.o
>  obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o
>  obj-$(CONFIG_ACPI_BGRT) += bgrt.o
>  obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o
> --
On 2025/1/15 23:51, Rafael J. Wysocki wrote:
> On Wed, Jan 15, 2025 at 1:38 PM Xiaofei Tan <tanxiaofei@huawei.com> wrote:
>> When the module HED is built-in, the init order is determined by
>> Makefile order. That order violates expectations. Because the module
>> HED init is behind evged. RAS records can't be handled in the
>> special time window that evged has initialized while HED not.
>> If the number of such RAS records is more than the APEI HEST error
>> source number, the HEST resources could be occupied all, and then
>> could affect subsequent RAS error reporting.
>>
>> If build HED as a module, the problem remains. To solve this problem
>> completely, change the ACPI_HED from tristate to bool.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
>> ---
>>  drivers/acpi/Kconfig  | 2 +-
>>  drivers/acpi/Makefile | 8 +++++++-
>>  2 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
>> index d81b55f5068c..7f10aa38269d 100644
>> --- a/drivers/acpi/Kconfig
>> +++ b/drivers/acpi/Kconfig
>> @@ -452,7 +452,7 @@ config ACPI_SBS
>>  	  the modules will be called sbs and sbshc.
>>
>>  config ACPI_HED
>> -	tristate "Hardware Error Device"
>> +	bool "Hardware Error Device"
>>  	help
>>  	  This driver supports the Hardware Error Device (PNP0C33),
>>  	  which is used to report some hardware errors notified via
>> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
>> index 40208a0f5dfb..b50d1baeb71f 100644
>> --- a/drivers/acpi/Makefile
>> +++ b/drivers/acpi/Makefile
>> @@ -15,6 +15,13 @@ endif
>>
>>  obj-$(CONFIG_ACPI) += tables.o
>>
>> +#
>> +# The hed.o needs to be in front of evged.o to avoid the problem that
>> +# RAS errors cannot be handled in the special time window of startup
>> +# phase that evged has initialized while hed not.
>> +#
>> +obj-$(CONFIG_ACPI_HED) += hed.o
>> +
> I'm not sure why you are insisting on this Makefile ordering change.
>
> It would be much more robust to run the hed driver init at a different
> initcall level than evged.
>
> If there is a problem with this approach, it needs to be mentioned in
> the changelog or in the comment above.

Hi Rafael,
The approach of changing the initcall level can work too.
Will send v3 patch later, thanks.

>>  #
>>  # ACPI Core Subsystem (Interpreter)
>>  #
>> @@ -95,7 +102,6 @@ obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
>>  obj-$(CONFIG_ACPI_BATTERY) += battery.o
>>  obj-$(CONFIG_ACPI_SBS) += sbshc.o
>>  obj-$(CONFIG_ACPI_SBS) += sbs.o
>> -obj-$(CONFIG_ACPI_HED) += hed.o
>>  obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o
>>  obj-$(CONFIG_ACPI_BGRT) += bgrt.o
>>  obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o
>> --
> .
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index d81b55f5068c..7f10aa38269d 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -452,7 +452,7 @@ config ACPI_SBS
 	  the modules will be called sbs and sbshc.

 config ACPI_HED
-	tristate "Hardware Error Device"
+	bool "Hardware Error Device"
 	help
 	  This driver supports the Hardware Error Device (PNP0C33),
 	  which is used to report some hardware errors notified via
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 40208a0f5dfb..b50d1baeb71f 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -15,6 +15,13 @@ endif

 obj-$(CONFIG_ACPI) += tables.o

+#
+# The hed.o needs to be in front of evged.o to avoid the problem that
+# RAS errors cannot be handled in the special time window of startup
+# phase that evged has initialized while hed not.
+#
+obj-$(CONFIG_ACPI_HED) += hed.o
+
 #
 # ACPI Core Subsystem (Interpreter)
 #
@@ -95,7 +102,6 @@ obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
 obj-$(CONFIG_ACPI_BATTERY) += battery.o
 obj-$(CONFIG_ACPI_SBS) += sbshc.o
 obj-$(CONFIG_ACPI_SBS) += sbs.o
-obj-$(CONFIG_ACPI_HED) += hed.o
 obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o
 obj-$(CONFIG_ACPI_BGRT) += bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o
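
The review comment in this thread suggests running the hed driver init at a different initcall level than evged, rather than relying on Makefile order. The following is a minimal userspace sketch of why that could be more robust. It is not kernel code and it is not the v3 patch. It assumes the usual rule for built-in code: initcalls run level by level, and in link (Makefile) order within one level. The struct `initcall`, the helpers `run_position` and `runs_before`, and the choice of level 4 for hed are hypothetical, picked only for illustration; the numbers 4 and 6 mirror the kernel's subsys and device initcall levels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: levels 0..7, lower levels run first. */
#define MAX_LEVEL 7

struct initcall {
	int level;        /* e.g. 4 = subsys, 6 = device (assumed mapping) */
	const char *name; /* table order stands in for link order */
};

/*
 * Return the position at which `name` would run, or -1 if it is absent.
 * Entries are visited by ascending level; within one level the table
 * (link) order decides.
 */
static int run_position(const struct initcall *calls, size_t n,
			const char *name)
{
	int pos = 0;

	for (int level = 0; level <= MAX_LEVEL; level++) {
		for (size_t i = 0; i < n; i++) {
			if (calls[i].level != level)
				continue;
			if (strcmp(calls[i].name, name) == 0)
				return pos;
			pos++;
		}
	}
	return -1;
}

/* Return 1 if `first` runs before `second`, else 0. Both must exist. */
static int runs_before(const struct initcall *calls, size_t n,
		       const char *first, const char *second)
{
	int a = run_position(calls, n, first);
	int b = run_position(calls, n, second);

	assert(a >= 0 && b >= 0);
	return a < b;
}
```

In this model, a table that lists evged before hed with both at level 6 makes evged run first, which corresponds to the window described in the changelog. Moving hed to a lower level (4 in this sketch) makes hed run first no matter where it sits in the table, so the result no longer depends on link order.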