diff mbox series

[3/3] ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure

Message ID 20250607033228.1475625-4-dan.j.williams@intel.com
State New
Series CXL: ACPI: faux: Fix cxl_core.ko module load regression

Commit Message

Dan Williams June 7, 2025, 3:32 a.m. UTC
CXL has a symbol dependency on einj_core.ko, so if einj_init() fails then
cxl_core.ko fails to load. Prior to the faux_device_create() conversion,
einj_probe() failures were tracked by the einj_initialized flag without
failing einj_init().

Revert to that behavior and always succeed einj_init() given there is no
way, and no pressing need, to discern faux device-create vs device-probe
failures.

This situation arose because CXL knows proper kernel named objects to
trigger errors against, but acpi-einj knows how to perform the error
injection. The injection mechanism is shared with non-CXL use cases. The
result is CXL now has a module dependency on einj-core.ko, and init/probe
failures are handled at runtime.

Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Ben Cheatham <Benjamin.Cheatham@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/apei/einj-core.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Comments

Jonathan Cameron June 9, 2025, 4:05 p.m. UTC | #1
On Mon, 9 Jun 2025 12:42:53 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> On Mon, Jun 09, 2025 at 11:17:58AM +0100, Jonathan Cameron wrote:
> > On Fri, 6 Jun 2025 20:32:28 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > CXL has a symbol dependency on einj_core.ko, so if einj_init() fails then
> > > cxl_core.ko fails to load. Prior to the faux_device_create() conversion,
> > > einj_probe() failures were tracked by the einj_initialized flag without
> > > failing einj_init().
> > > 
> > > Revert to that behavior and always succeed einj_init() given there is no
> > > way, and no pressing need, to discern faux device-create vs device-probe
> > > failures.
> > > 
> > > This situation arose because CXL knows proper kernel named objects to
> > > trigger errors against, but acpi-einj knows how to perform the error
> > > injection. The injection mechanism is shared with non-CXL use cases. The
> > > result is CXL now has a module dependency on einj-core.ko, and init/probe
> > > failures are handled at runtime.
> > > 
> > > Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface")
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > > Cc: Ben Cheatham <Benjamin.Cheatham@amd.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  drivers/acpi/apei/einj-core.c | 9 +++------
> > >  1 file changed, 3 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
> > > index fea11a35eea3..9b041415a9d0 100644
> > > --- a/drivers/acpi/apei/einj-core.c
> > > +++ b/drivers/acpi/apei/einj-core.c
> > > @@ -883,19 +883,16 @@ static int __init einj_init(void)
> > >  	}
> > >  
> > >  	einj_dev = faux_device_create("acpi-einj", NULL, &einj_device_ops);
> > > -	if (!einj_dev)
> > > -		return -ENODEV;
> > >  
> > > -	einj_initialized = true;
> > > +	if (einj_dev)
> > > +		einj_initialized = true;
> > >  
> > >  	return 0;
> > >  }
> > >  
> > >  static void __exit einj_exit(void)
> > >  {
> > > -	if (einj_initialized)
> > > -		faux_device_destroy(einj_dev);
> > > -
> > > +	faux_device_destroy(einj_dev);  
> > 
> > Hi Dan,
> > 
> > This bit is sort of fine though not really related, because
> > faux_device_destroy() checks
> > 
> > void faux_device_destroy(struct faux_device *faux_dev)
> > {
> > 	struct device *dev = &faux_dev->dev;
> > 
> > 	if (!faux_dev)
> > 		return;
> > 
> > Though that check is after a dereference of faux_dev
> > which doesn't look right to me.  Might be fine because
> > of how the kernel is built (I can't remember where we ended
> > up on topic of compilers making undefined behavior based
> > optimizations).  Still not that nice from a logical point of view!  
> 
> I think this is fine as we just put "0 + offset of dev" into dev, and
> didn't do anything with that (i.e. no actual read of that memory
> location happened).  The compiler shouldn't be doing anything that could
> happen after the return before we check for a valid pointer here, right?

Hmm. I did some digging. Seems that was debated 10 years ago without
a huge amount of clarity on the answer beyond all sane people telling
compiler folk not to use this in optimizations :)

Comes down to whether any dereference of NULL is UB whether or not
the compiler can just do a simple offset calculation.

Anyhow, whilst fine, it's still a little ugly to my eyes :(
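
For illustration only, a minimal userspace sketch of the ordering that would
sidestep the question: test the pointer first, and only then form the address
of the embedded member. The struct and function names here are hypothetical
stand-ins, not the real faux bus definitions, and the "teardown" is reduced to
a counter decrement.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct faux_device. */
struct fake_device {
	int refcount;
};

struct fake_faux_device {
	long pad;			/* pushes dev to a non-zero offset */
	struct fake_device dev;
};

/* Form &faux_dev->dev only after the NULL check has passed. */
static struct fake_device *fake_faux_to_dev(struct fake_faux_device *faux_dev)
{
	if (!faux_dev)
		return NULL;

	return &faux_dev->dev;
}

/* Returns 1 if a device was torn down, 0 if there was nothing to do. */
static int fake_faux_device_destroy(struct fake_faux_device *faux_dev)
{
	struct fake_device *dev = fake_faux_to_dev(faux_dev);

	if (!dev)
		return 0;

	dev->refcount--;	/* stand-in for the real teardown work */
	return 1;
}
```

With this ordering no expression involving a NULL faux_dev is ever evaluated,
so there is nothing for the UB debate to apply to.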

Jonathan



> 
> thanks,
> 
> greg k-h
>
Dan Williams June 10, 2025, 5:22 p.m. UTC | #2
Jonathan Cameron wrote:
[..]
> Hmm. I did some digging. Seems that was debated 10 years ago without
> a huge amount of clarity on the answer beyond all sane people telling
> compiler folk not to use this in optimizations :)
> 
> Comes down to whether any dereference of NULL is UB whether or not
> the compiler can just do a simple offset calculation.
> 
> Anyhow, whilst fine, it's still a little ugly to my eyes :(

I recall we had this conversation with Dan Carpenter on a smatch patch
and resolved that while it looks "interesting" it does no harm.

For this patch I am not motivated to spin it because even if the
compiler took advantage of the NULL check to drop UB work, that would
only mean dropping the assignment.

Otherwise, this conversion lines up with the intent of both
einj_initialized and faux_device_destroy() whereby faux_device_destroy()
is already prepared for the case where faux_device_create() fails.
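
As a rough userspace model of the resulting flow (all fake_* names are
hypothetical, and fake_inject() merely stands in for whatever runtime entry
point consults the flag): init always returns 0, the flag records whether the
device exists, callers get -ENODEV at runtime when it does not, and exit calls
destroy unconditionally because destroy tolerates NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the faux device. */
struct fake_dev {
	int live;
};

static struct fake_dev fake_storage;
static struct fake_dev *fake_dev_ptr;
static bool fake_initialized;

/* Models a create call that may fail and return NULL. */
static struct fake_dev *fake_create(bool succeed)
{
	if (!succeed)
		return NULL;

	fake_storage.live = 1;
	return &fake_storage;
}

/* Models a destroy call that is a no-op for NULL. */
static void fake_destroy(struct fake_dev *dev)
{
	if (!dev)
		return;

	dev->live = 0;
}

/* Init never fails; it only records whether the device came up. */
static int fake_init(bool create_succeeds)
{
	fake_dev_ptr = fake_create(create_succeeds);

	if (fake_dev_ptr)
		fake_initialized = true;

	return 0;
}

/* Runtime entry point: the failure surfaces here instead of at init. */
static int fake_inject(void)
{
	if (!fake_initialized)
		return -ENODEV;

	return 0;
}

/* Exit needs no flag check because fake_destroy() accepts NULL. */
static void fake_exit(void)
{
	fake_destroy(fake_dev_ptr);
	fake_dev_ptr = NULL;
	fake_initialized = false;	/* reset so the model can be re-run */
}
```

In this model a dependent module can still load after a failed create; it
simply sees -ENODEV when it tries to use the injection path.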

Patch

diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index fea11a35eea3..9b041415a9d0 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -883,19 +883,16 @@  static int __init einj_init(void)
 	}
 
 	einj_dev = faux_device_create("acpi-einj", NULL, &einj_device_ops);
-	if (!einj_dev)
-		return -ENODEV;
 
-	einj_initialized = true;
+	if (einj_dev)
+		einj_initialized = true;
 
 	return 0;
 }
 
 static void __exit einj_exit(void)
 {
-	if (einj_initialized)
-		faux_device_destroy(einj_dev);
-
+	faux_device_destroy(einj_dev);
 }
 
 module_init(einj_init);