[v3] acpi: Fix HED module initialization order when it is built-in

Message ID 20250117022957.25227-1-tanxiaofei@huawei.com
State New

Commit Message

Xiaofei Tan Jan. 17, 2025, 2:29 a.m. UTC
When HED is built in, its init function runs after EVGED's: both drivers
are at the same initcall level, so their order is determined by Makefile
(link) order. That order violates expectations, because RAS records
cannot be handled in the window in which EVGED has been initialized but
HED has not.

If the number of such RAS records exceeds the number of APEI HEST error
sources, all HEST resources could be occupied, which could then affect
subsequent RAS error reporting.

Change the initcall level of HED to subsys_initcall to fix the issue. If
HED is built as a module, the problem remains. To solve it completely,
change ACPI_HED from tristate to bool.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
 drivers/acpi/Kconfig | 2 +-
 drivers/acpi/hed.c   | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
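
The ordering argument in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct initcall`, `boot_order()`, `runs_before()` and the two tables are hypothetical names, and array order stands in for Makefile (link) order. It assumes both drivers originally sit at the device initcall level (6) and that subsys_initcall corresponds to level 4, which runs earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two of the kernel's initcall levels; lower levels run earlier at boot. */
enum initcall_level { LEVEL_SUBSYS = 4, LEVEL_DEVICE = 6 };

struct initcall {
	const char *name;
	enum initcall_level level;
};

/*
 * Write the names in simulated boot order into out[]: levels ascending,
 * and within one level the order of the input array, which stands in
 * for link (Makefile) order.
 */
static void boot_order(const struct initcall *calls, size_t n,
		       const char **out)
{
	size_t k = 0;

	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				out[k++] = calls[i].name;
	assert(k == n);	/* every entry used a level in 0..7 */
}

/* Return 1 if initcall a runs before initcall b in the simulated boot. */
static int runs_before(const struct initcall *calls, size_t n,
		       const char *a, const char *b)
{
	const char *order[16];

	assert(n <= 16);
	boot_order(calls, n, order);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(order[i], a) == 0)
			return 1;
		if (strcmp(order[i], b) == 0)
			return 0;
	}
	return 0;
}

/* Before the patch: same level, EVGED linked first, so EVGED wins. */
static const struct initcall before_patch[] = {
	{ "evged", LEVEL_DEVICE },
	{ "hed",   LEVEL_DEVICE },
};

/* After the patch: same link order, but HED moved to the subsys level. */
static const struct initcall after_patch[] = {
	{ "evged", LEVEL_DEVICE },
	{ "hed",   LEVEL_SUBSYS },
};
```

In this model, `runs_before(before_patch, 2, "evged", "hed")` is true, while with `after_patch` HED runs first regardless of where it appears in the table. A loadable module has no place in such a table at all, which is the motivation the commit message gives for switching ACPI_HED to bool.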

Comments

Jonathan Cameron Jan. 20, 2025, 11:04 a.m. UTC | #1
On Fri, 17 Jan 2025 10:29:57 +0800
Xiaofei Tan <tanxiaofei@huawei.com> wrote:

> When HED is built in, its init function runs after EVGED's: both drivers
> are at the same initcall level, so their order is determined by Makefile
> (link) order. That order violates expectations, because RAS records
> cannot be handled in the window in which EVGED has been initialized but
> HED has not.
>
> If the number of such RAS records exceeds the number of APEI HEST error
> sources, all HEST resources could be occupied, which could then affect
> subsequent RAS error reporting.
>
> Change the initcall level of HED to subsys_initcall to fix the issue. If
> HED is built as a module, the problem remains. To solve it completely,
> change ACPI_HED from tristate to bool.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Given the change in approach (even though I reviewed this internally),
you should probably have dropped my RB. Anyhow, consider this me
giving it again on list.

Thanks,

Jonathan

> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
> ---
>  drivers/acpi/Kconfig | 2 +-
>  drivers/acpi/hed.c   | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index d81b55f5068c..7f10aa38269d 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -452,7 +452,7 @@ config ACPI_SBS
>  	  the modules will be called sbs and sbshc.
>  
>  config ACPI_HED
> -	tristate "Hardware Error Device"
> +	bool "Hardware Error Device"
>  	help
>  	  This driver supports the Hardware Error Device (PNP0C33),
>  	  which is used to report some hardware errors notified via
> diff --git a/drivers/acpi/hed.c b/drivers/acpi/hed.c
> index 7652515a6be1..677dfcce2990 100644
> --- a/drivers/acpi/hed.c
> +++ b/drivers/acpi/hed.c
> @@ -81,6 +81,7 @@ static struct acpi_driver acpi_hed_driver = {
>  	},
>  };
>  module_acpi_driver(acpi_hed_driver);
> +subsys_initcall(acpi_hed_driver_init);
>  
>  MODULE_AUTHOR("Huang Ying");
>  MODULE_DESCRIPTION("ACPI Hardware Error Device Driver");
Xiaofei Tan Jan. 21, 2025, 2:23 a.m. UTC | #2
On 2025/1/20 19:04, Jonathan Cameron wrote:
> On Fri, 17 Jan 2025 10:29:57 +0800
> Xiaofei Tan <tanxiaofei@huawei.com> wrote:
>
>> When HED is built in, its init function runs after EVGED's: both drivers
>> are at the same initcall level, so their order is determined by Makefile
>> (link) order. That order violates expectations, because RAS records
>> cannot be handled in the window in which EVGED has been initialized but
>> HED has not.
>>
>> If the number of such RAS records exceeds the number of APEI HEST error
>> sources, all HEST resources could be occupied, which could then affect
>> subsequent RAS error reporting.
>>
>> Change the initcall level of HED to subsys_initcall to fix the issue. If
>> HED is built as a module, the problem remains. To solve it completely,
>> change ACPI_HED from tristate to bool.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Given the change in approach (even though I reviewed this internally),
> you should probably have dropped my RB. Anyhow, consider this me
> giving it again on list.
OK. thanks.

Patch

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index d81b55f5068c..7f10aa38269d 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -452,7 +452,7 @@  config ACPI_SBS
 	  the modules will be called sbs and sbshc.
 
 config ACPI_HED
-	tristate "Hardware Error Device"
+	bool "Hardware Error Device"
 	help
 	  This driver supports the Hardware Error Device (PNP0C33),
 	  which is used to report some hardware errors notified via
diff --git a/drivers/acpi/hed.c b/drivers/acpi/hed.c
index 7652515a6be1..677dfcce2990 100644
--- a/drivers/acpi/hed.c
+++ b/drivers/acpi/hed.c
@@ -81,6 +81,7 @@  static struct acpi_driver acpi_hed_driver = {
 	},
 };
 module_acpi_driver(acpi_hed_driver);
+subsys_initcall(acpi_hed_driver_init);
 
 MODULE_AUTHOR("Huang Ying");
 MODULE_DESCRIPTION("ACPI Hardware Error Device Driver");
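
For readers wondering where `acpi_hed_driver_init` comes from: in the kernel, module_acpi_driver() expands through module_driver(), which pastes `_init` onto the driver variable's name to generate the init function. The user-space sketch below mimics only that token pasting. `demo_module_driver`, `struct demo_driver`, `demo_register`, `demo_run` and the `"hed-demo"` name are hypothetical stand-ins, and the pointer slot merely stands in for the kernel's module_init() registration; this is not the kernel's actual macro.

```c
#include <assert.h>

/* Hypothetical stand-in for struct acpi_driver. */
struct demo_driver {
	const char *name;
	int registered;
};

/* Hypothetical stand-in for the real registration function. */
static int demo_register(struct demo_driver *drv)
{
	drv->registered++;
	return 0;
}

typedef int (*demo_initcall_t)(void);

/*
 * Generate <driver>_init() by token pasting and record it in a pointer
 * slot, loosely mirroring how the kernel macro defines the init
 * function and then hands it to module_init().
 */
#define demo_module_driver(drv, reg)					\
static int drv##_init(void)						\
{									\
	return reg(&(drv));						\
}									\
static const demo_initcall_t drv##_module_init_slot = drv##_init

static struct demo_driver acpi_hed_driver = { .name = "hed-demo" };

/* Expands to a definition of acpi_hed_driver_init(). */
demo_module_driver(acpi_hed_driver, demo_register);

/* Example use: the generated function is callable by its pasted name. */
static int demo_run(void)
{
	int ret = acpi_hed_driver_init();

	assert(acpi_hed_driver.registered >= 1);
	return ret;
}
```

Because the macro derives the function name from the variable name, a later line in the same file can refer to `acpi_hed_driver_init` even though no such function is written out by hand.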