Message ID | 20221013064459.121933-2-u.kleine-koenig@pengutronix.de |
---|---|
State | New |
Series | [1/2] ACPI: APEI: Drop unsetting driver data on remove |
On Thu, Oct 13, 2022 at 8:53 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> If the remove callback failed, it leaves some unfreed resources behind
> that will never be cleared. I didn't manage to understand the driver
> good enough to judge how critical that really is.
>
> This patch is part of an effort to change the remove callbacks for
> platform devices to return void in the hope this will prevent the wrong
> assumption that returning an error code from .remove() is proper error
> handling.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
>  drivers/acpi/apei/ghes.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> Hello,
>
> on a side note: The remove callback calls (in some cases) free_irq() for
> a shared interrupt. A requirement in this case is to disable the
> device's interrupt beforehand. It's not obvious (to me that is) that
> said irq is disabled here. This is another opportunity for ugly things
> to happen.
>
> Best regards
> Uwe
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 307fbb97a116..78d2e4df74ee 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -1393,7 +1393,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
>  	return rc;
>  }
>
> -static int ghes_remove(struct platform_device *ghes_dev)
> +static int _ghes_remove(struct platform_device *ghes_dev)
>  {
>  	int rc;
>  	struct ghes *ghes;
> @@ -1447,6 +1447,21 @@ static int ghes_remove(struct platform_device *ghes_dev)
>  	return 0;
>  }
>
> +static int ghes_remove(struct platform_device *ghes_dev)
> +{
> +	/*
> +	 * If _ghes_remove() returns an error, we're in trouble. Some of the
> +	 * cleanup was skipped then and this will never be catched up. So some
> +	 * resources will stay around, maybe even used although the platform
> +	 * device will be gone.
> +	 */
> +	int err = _ghes_remove(ghes_dev);
> +
> +	WARN_ON(err);
> +
> +	return 0;
> +}

No, no, no, we don't cut corners like this.

It is not very hard to observe that the only case in which
ghes_remove() can return an error is when it calls
apei_sdei_unregister_ghes(). So if you look at that one, you'll
notice that it only propagates the return value of
sdei_unregister_ghes() (apart from the trivial "no support" case).

Now, sdei_unregister_ghes() really does things that shouldn't fail
(because how can firmware refuse to disable an event when
unregistering it), so that's where the warning should be emitted (in
case they fail nevertheless).

Also I don't think that dumping the stack is worth it, because the
code path is known.

> +
>  static struct platform_driver ghes_platform_driver = {
>  	.driver = {
>  		.name = "GHES",
> --
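The shape of the change suggested in the reply above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not kernel code and not the eventual patch: every name in it (fw_call_t, unregister_event_sketch(), remove_sketch(), fw_ok(), fw_fail(), warnings_emitted) is hypothetical, and fprintf() merely stands in for a one-line kernel log message without a stack dump. The idea it illustrates is that the failure is reported at the point where it happens, so the remove path has nothing to propagate and never skips the rest of its cleanup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a firmware call; nonzero means failure. */
typedef int (*fw_call_t)(unsigned int event);

/* Counts emitted warnings; exists only so the sketch can be checked. */
static int warnings_emitted;

/*
 * Models the low-level unregister helper: a failure is reported right
 * where it occurs (one log line, no stack dump) and is not propagated.
 */
static void unregister_event_sketch(fw_call_t fw_disable, unsigned int event)
{
	int err;

	assert(fw_disable != NULL);

	err = fw_disable(event);
	if (err) {
		fprintf(stderr, "failed to disable event %u: %d\n", event, err);
		warnings_emitted++;
	}
}

/* Models a remove callback that cannot fail and always finishes cleanup. */
static void remove_sketch(fw_call_t fw_disable, unsigned int event,
			  int *resources_freed)
{
	unregister_event_sketch(fw_disable, event);
	*resources_freed = 1;	/* remaining cleanup is never skipped */
}

/* Two hypothetical firmware behaviours: success and failure. */
static int fw_ok(unsigned int event)
{
	(void)event;
	return 0;
}

static int fw_fail(unsigned int event)
{
	(void)event;
	return -5;
}
```

Calling remove_sketch() with fw_fail() logs one message and still marks the resources as freed, whereas the wrapper in the quoted patch only learns about the error after part of the cleanup has already been skipped.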
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 307fbb97a116..78d2e4df74ee 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1393,7 +1393,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	return rc;
 }
 
-static int ghes_remove(struct platform_device *ghes_dev)
+static int _ghes_remove(struct platform_device *ghes_dev)
 {
 	int rc;
 	struct ghes *ghes;
@@ -1447,6 +1447,21 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	return 0;
 }
 
+static int ghes_remove(struct platform_device *ghes_dev)
+{
+	/*
+	 * If _ghes_remove() returns an error, we're in trouble. Some of the
+	 * cleanup was skipped then and this will never be catched up. So some
+	 * resources will stay around, maybe even used although the platform
+	 * device will be gone.
+	 */
+	int err = _ghes_remove(ghes_dev);
+
+	WARN_ON(err);
+
+	return 0;
+}
+
 static struct platform_driver ghes_platform_driver = {
 	.driver = {
 		.name = "GHES",
If the remove callback failed, it leaves some unfreed resources behind
that will never be cleared. I didn't manage to understand the driver
good enough to judge how critical that really is.

This patch is part of an effort to change the remove callbacks for
platform devices to return void in the hope this will prevent the wrong
assumption that returning an error code from .remove() is proper error
handling.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 drivers/acpi/apei/ghes.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Hello,

on a side note: The remove callback calls (in some cases) free_irq() for
a shared interrupt. A requirement in this case is to disable the
device's interrupt beforehand. It's not obvious (to me that is) that
said irq is disabled here. This is another opportunity for ugly things
to happen.

Best regards
Uwe