Message ID | 20201119161604.2633521-1-u.kleine-koenig@pengutronix.de |
---|---|
State | Superseded |
Series | [v2,1/3] spi: fix resource leak for drivers without .remove callback |
Hi Uwe,

On 19.11.2020 17:16, Uwe Kleine-König wrote:
> The eventual goal is to get rid of the callbacks in struct
> device_driver. Other than not using driver callbacks there should be no
> side effect of this patch.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

This patch landed recently in linux-next as commit 9db34ee64ce4 ("spi:
Use bus_type functions for probe, remove and shutdown").

It causes a regression on some of my test boards:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
Mem abort info:
  ESR = 0x96000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000000318ed000
[0000000000000018] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac
brcmutil cfg80211 crct10dif_ce s3fwrn5_i2c s3fwrn5 nci nfc s5p_mfc
s5p_jpeg hci_uart btqca btbc
buf2_dma_contig videobuf2_memops videobuf2_v4l2 bluetooth
videobuf2_common videodev panfrost gpu_sched ecdh_generic mc ecc rfkill
ip_tables x_tables ipv6
CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.10.0-rc5-next-20201124+ #9771
Hardware name: Samsung TM2E board (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
pc : spi_shutdown+0x10/0x38
lr : device_shutdown+0x10c/0x350
sp : ffff80001311bc70
...
Call trace:
 spi_shutdown+0x10/0x38
 kernel_restart_prepare+0x34/0x40
 kernel_restart+0x14/0x88
 __do_sys_reboot+0x148/0x248
 __arm64_sys_reboot+0x1c/0x28
 el0_svc_common.constprop.3+0x74/0x198
 do_el0_svc+0x20/0x98
 el0_sync_handler+0x140/0x1a8
 el0_sync+0x140/0x180
Code: f9403402 d1008041 f100005f 9a9f1021 (f9400c21)
---[ end trace 266c07205a2d632e ]---
Kernel panic - not syncing: Oops: Fatal exception
Kernel Offset: disabled
CPU features: 0x0240022,65006087
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

> ---
>  drivers/spi/spi.c | 33 ++++++++++++++++-----------------
>  1 file changed, 16 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 5becf6c2c409..e8c0a000ee19 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -374,16 +374,7 @@ static int spi_uevent(struct device *dev, struct kobj_uevent_env *env)
>  	return add_uevent_var(env, "MODALIAS=%s%s", SPI_MODULE_PREFIX, spi->modalias);
>  }
>
> -struct bus_type spi_bus_type = {
> -	.name = "spi",
> -	.dev_groups = spi_dev_groups,
> -	.match = spi_match_device,
> -	.uevent = spi_uevent,
> -};
> -EXPORT_SYMBOL_GPL(spi_bus_type);
> -
> -
> -static int spi_drv_probe(struct device *dev)
> +static int spi_probe(struct device *dev)
>  {
>  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
>  	struct spi_device *spi = to_spi_device(dev);
> @@ -414,7 +405,7 @@ static int spi_drv_probe(struct device *dev)
>  	return ret;
>  }
>
> -static int spi_drv_remove(struct device *dev)
> +static int spi_remove(struct device *dev)
>  {
>  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
>  	int ret = 0;
> @@ -426,13 +417,25 @@ static int spi_drv_remove(struct device *dev)
>  	return ret;
>  }
>
> -static void spi_drv_shutdown(struct device *dev)
> +static void spi_shutdown(struct device *dev)
>  {
>  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
>
> -	sdrv->shutdown(to_spi_device(dev));
> +	if (sdrv->shutdown)
> +		sdrv->shutdown(to_spi_device(dev));
>  }

In the above function dev->driver might be NULL, so its use in
to_spi_driver() and sdrv->shutdown leads to NULL pointer dereference. I
didn't check the details, but a simple check for NULL dev->driver and
return is enough to fix this issue. I can send such fix if you want.

> +struct bus_type spi_bus_type = {
> +	.name = "spi",
> +	.dev_groups = spi_dev_groups,
> +	.match = spi_match_device,
> +	.uevent = spi_uevent,
> +	.probe = spi_probe,
> +	.remove = spi_remove,
> +	.shutdown = spi_shutdown,
> +};
> +EXPORT_SYMBOL_GPL(spi_bus_type);
> +
>  /**
>   * __spi_register_driver - register a SPI driver
>   * @owner: owner module of the driver to register
> @@ -445,10 +448,6 @@ int __spi_register_driver(struct module *owner, struct spi_driver *sdrv)
>  {
>  	sdrv->driver.owner = owner;
>  	sdrv->driver.bus = &spi_bus_type;
> -	sdrv->driver.probe = spi_drv_probe;
> -	sdrv->driver.remove = spi_drv_remove;
> -	if (sdrv->shutdown)
> -		sdrv->driver.shutdown = spi_drv_shutdown;
>  	return driver_register(&sdrv->driver);
>  }
>  EXPORT_SYMBOL_GPL(__spi_register_driver);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Hello Marek,

On Tue, Nov 24, 2020 at 01:03:25PM +0100, Marek Szyprowski wrote:
> On 19.11.2020 17:16, Uwe Kleine-König wrote:
> > The eventual goal is to get rid of the callbacks in struct
> > device_driver. Other than not using driver callbacks there should be no
> > side effect of this patch.
> >
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>
> This patch landed recently in linux-next as commit 9db34ee64ce4 ("spi:
> Use bus_type functions for probe, remove and shutdown").
>
> It causes a regression on some of my test boards:
>
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000018
> Mem abort info:
>   ESR = 0x96000004
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
> Data abort info:
>   ISV = 0, ISS = 0x00000004
>   CM = 0, WnR = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=00000000318ed000
> [0000000000000018] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac
> brcmutil cfg80211 crct10dif_ce s3fwrn5_i2c s3fwrn5 nci nfc s5p_mfc
> s5p_jpeg hci_uart btqca btbc
> buf2_dma_contig videobuf2_memops videobuf2_v4l2 bluetooth
> videobuf2_common videodev panfrost gpu_sched ecdh_generic mc ecc rfkill
> ip_tables x_tables ipv6
> CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted
> 5.10.0-rc5-next-20201124+ #9771
> Hardware name: Samsung TM2E board (DT)
> pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> pc : spi_shutdown+0x10/0x38
> lr : device_shutdown+0x10c/0x350
> sp : ffff80001311bc70
> ...
> Call trace:
>  spi_shutdown+0x10/0x38
>  kernel_restart_prepare+0x34/0x40
>  kernel_restart+0x14/0x88
>  __do_sys_reboot+0x148/0x248
>  __arm64_sys_reboot+0x1c/0x28
>  el0_svc_common.constprop.3+0x74/0x198
>  do_el0_svc+0x20/0x98
>  el0_sync_handler+0x140/0x1a8
>  el0_sync+0x140/0x180
> Code: f9403402 d1008041 f100005f 9a9f1021 (f9400c21)
> ---[ end trace 266c07205a2d632e ]---
> Kernel panic - not syncing: Oops: Fatal exception
> Kernel Offset: disabled
> CPU features: 0x0240022,65006087
> Memory Limit: none
> ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
> > ---
> >  drivers/spi/spi.c | 33 ++++++++++++++++-----------------
> >  1 file changed, 16 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> > index 5becf6c2c409..e8c0a000ee19 100644
> > --- a/drivers/spi/spi.c
> > +++ b/drivers/spi/spi.c
> > @@ -374,16 +374,7 @@ static int spi_uevent(struct device *dev, struct kobj_uevent_env *env)
> >  	return add_uevent_var(env, "MODALIAS=%s%s", SPI_MODULE_PREFIX, spi->modalias);
> >  }
> >
> > -struct bus_type spi_bus_type = {
> > -	.name = "spi",
> > -	.dev_groups = spi_dev_groups,
> > -	.match = spi_match_device,
> > -	.uevent = spi_uevent,
> > -};
> > -EXPORT_SYMBOL_GPL(spi_bus_type);
> > -
> > -
> > -static int spi_drv_probe(struct device *dev)
> > +static int spi_probe(struct device *dev)
> >  {
> >  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
> >  	struct spi_device *spi = to_spi_device(dev);
> > @@ -414,7 +405,7 @@ static int spi_drv_probe(struct device *dev)
> >  	return ret;
> >  }
> >
> > -static int spi_drv_remove(struct device *dev)
> > +static int spi_remove(struct device *dev)
> >  {
> >  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
> >  	int ret = 0;
> > @@ -426,13 +417,25 @@ static int spi_drv_remove(struct device *dev)
> >  	return ret;
> >  }
> >
> > -static void spi_drv_shutdown(struct device *dev)
> > +static void spi_shutdown(struct device *dev)
> >  {
> >  	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
> >
> > -	sdrv->shutdown(to_spi_device(dev));
> > +	if (sdrv->shutdown)
> > +		sdrv->shutdown(to_spi_device(dev));
> >  }
>
> In the above function dev->driver might be NULL, so its use in
> to_spi_driver() and sdrv->shutdown leads to NULL pointer dereference. I
> didn't check the details, but a simple check for NULL dev->driver and
> return is enough to fix this issue. I can send such fix if you want.

Ah, I see. shutdown is called for unbound devices, too. Assuming that
Mark prefers a fix on top instead of an updated patch: Yes, please send
a fix. Otherwise I can do this, too, as I introduced the problem.

Best regards and thanks,
Uwe

-- 
Pengutronix e.K.           | Uwe Kleine-König            |
Industrial Linux Solutions | https://www.pengutronix.de/ |

On Tue, Nov 24, 2020 at 02:01:07PM +0100, Uwe Kleine-König wrote:
> On Tue, Nov 24, 2020 at 01:03:25PM +0100, Marek Szyprowski wrote:

> > > +	if (sdrv->shutdown)
> > > +		sdrv->shutdown(to_spi_device(dev));
> > >  }

> > In the above function dev->driver might be NULL, so its use in
> > to_spi_driver() and sdrv->shutdown leads to NULL pointer dereference. I
> > didn't check the details, but a simple check for NULL dev->driver and
> > return is enough to fix this issue. I can send such fix if you want.

> Ah, I see. shutdown is called for unbound devices, too. Assuming that
> Mark prefers a fix on top instead of an updated patch: Yes, please send
> a fix. Otherwise I can do this, too, as I introduced the problem.

Yes, please send an incremental fix (in general in a situation like this
I'd just send a fix as part of the original report, it's quicker if the
fix is OK).
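
The check Marek proposes can be illustrated with a small userspace model. This is only a sketch of the idea discussed above, not the fix that was eventually applied: the structures are simplified stand-ins for the kernel ones, and `spi_shutdown_sketch()`, `demo_shutdown()` and `shutdown_calls` are hypothetical names introduced for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions, not the real definitions). */
struct device_driver {
	const char *name;
};

struct device {
	struct device_driver *driver;	/* NULL while no driver is bound */
};

struct spi_device {
	struct device dev;
};

struct spi_driver {
	void (*shutdown)(struct spi_device *spi);
	struct device_driver driver;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct spi_driver *to_spi_driver(struct device_driver *drv)
{
	return drv ? container_of(drv, struct spi_driver, driver) : NULL;
}

static struct spi_device *to_spi_device(struct device *dev)
{
	return dev ? container_of(dev, struct spi_device, dev) : NULL;
}

/* Hypothetical driver callback that only counts how often it ran. */
static int shutdown_calls;

static void demo_shutdown(struct spi_device *spi)
{
	(void)spi;
	shutdown_calls++;
}

/*
 * Bus-level shutdown as discussed in the thread: it is invoked for every
 * device on the bus, bound or not, so dev->driver has to be checked
 * before the driver's callback table is touched.
 */
static void spi_shutdown_sketch(struct device *dev)
{
	if (dev->driver) {
		const struct spi_driver *sdrv = to_spi_driver(dev->driver);

		if (sdrv->shutdown)
			sdrv->shutdown(to_spi_device(dev));
	}
}
```

Calling `spi_shutdown_sketch()` on a device whose `driver` pointer is NULL returns without touching any callback, while a device bound to a driver with a `.shutdown` callback still has it invoked.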

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 0cab239d8e7f..5becf6c2c409 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -405,9 +405,11 @@ static int spi_drv_probe(struct device *dev)
 	if (ret)
 		return ret;
 
-	ret = sdrv->probe(spi);
-	if (ret)
-		dev_pm_domain_detach(dev, true);
+	if (sdrv->probe) {
+		ret = sdrv->probe(spi);
+		if (ret)
+			dev_pm_domain_detach(dev, true);
+	}
 
 	return ret;
 }
@@ -415,9 +417,10 @@ static int spi_drv_probe(struct device *dev)
 static int spi_drv_remove(struct device *dev)
 {
 	const struct spi_driver *sdrv = to_spi_driver(dev->driver);
-	int ret;
+	int ret = 0;
 
-	ret = sdrv->remove(to_spi_device(dev));
+	if (sdrv->remove)
+		ret = sdrv->remove(to_spi_device(dev));
 	dev_pm_domain_detach(dev, true);
 
 	return ret;
@@ -442,10 +445,8 @@ int __spi_register_driver(struct module *owner, struct spi_driver *sdrv)
 {
 	sdrv->driver.owner = owner;
 	sdrv->driver.bus = &spi_bus_type;
-	if (sdrv->probe)
-		sdrv->driver.probe = spi_drv_probe;
-	if (sdrv->remove)
-		sdrv->driver.remove = spi_drv_remove;
+	sdrv->driver.probe = spi_drv_probe;
+	sdrv->driver.remove = spi_drv_remove;
 	if (sdrv->shutdown)
 		sdrv->driver.shutdown = spi_drv_shutdown;
 	return driver_register(&sdrv->driver);

Consider an SPI driver with a .probe but without a .remove callback (e.g.
rtc-ds1347). The function spi_drv_probe() is called to bind a device and
so dev_pm_domain_attach() is called. As there is no remove callback,
spi_drv_remove() isn't called at unbind time, so the call to
dev_pm_domain_detach() is missed and the PM domain stays active.

To fix this always use both spi_drv_probe() and spi_drv_remove() and
make them handle the respective callback not being set. This has the
side effect that for a (hypothetical) driver that has neither .probe nor
.remove the clk and pm domain setup is done.

Fixes: 33cf00e57082 ("spi: attach/detach SPI device to the ACPI power domain")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Changes since (implicit) v1:

 - make use of spi_drv_probe and spi_drv_remove unconditionally.

 drivers/spi/spi.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

base-commit: 09162bc32c880a791c6c0668ce0745cf7958f576
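
The imbalance described in the commit message can be modelled in userspace. The following is a rough sketch under simplifying assumptions: `pm_attach_sketch()` and `pm_detach_sketch()` are hypothetical stand-ins for dev_pm_domain_attach() and dev_pm_domain_detach(), the structures are reduced to the fields that matter here, and `pm_attached` is an illustrative counter rather than anything the kernel tracks in this form.

```c
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (assumptions for illustration only). */
struct spi_device {
	int pm_attached;	/* > 0 while the modelled PM domain is attached */
};

struct spi_driver {
	int (*probe)(struct spi_device *spi);
	int (*remove)(struct spi_device *spi);
};

/* Hypothetical stand-ins for dev_pm_domain_attach() / dev_pm_domain_detach(). */
static int pm_attach_sketch(struct spi_device *spi)
{
	spi->pm_attached++;
	return 0;
}

static void pm_detach_sketch(struct spi_device *spi)
{
	spi->pm_attached--;
}

/* Always-installed probe wrapper: attach first, then call .probe if it is set. */
static int spi_drv_probe_sketch(const struct spi_driver *sdrv, struct spi_device *spi)
{
	int ret = pm_attach_sketch(spi);

	if (ret)
		return ret;

	if (sdrv->probe) {
		ret = sdrv->probe(spi);
		if (ret)
			pm_detach_sketch(spi);	/* undo the attach on probe failure */
	}

	return ret;
}

/* Always-installed remove wrapper: call .remove if it is set, then always detach. */
static int spi_drv_remove_sketch(const struct spi_driver *sdrv, struct spi_device *spi)
{
	int ret = 0;

	if (sdrv->remove)
		ret = sdrv->remove(spi);
	pm_detach_sketch(spi);

	return ret;
}

/* Hypothetical driver callbacks used to exercise the wrappers. */
static int probe_ok_sketch(struct spi_device *spi)
{
	(void)spi;
	return 0;
}

static int probe_fail_sketch(struct spi_device *spi)
{
	(void)spi;
	return -1;
}
```

With both wrappers used unconditionally, a driver that sets only `.probe` still gets the detach on unbind, so the attach count returns to zero after a bind/unbind cycle.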