
Revert "leds: led-core: Fix refcount leak in of_led_get()"

Message ID 20240625-led-class-device-leak-v1-1-9eb4436310c2@bootlin.com
State Superseded
Series Revert "leds: led-core: Fix refcount leak in of_led_get()"

Commit Message

Luca Ceresoli June 25, 2024, 7:26 a.m. UTC
This reverts commit da1afe8e6099980fe1e2fd7436dca284af9d3f29.

Commit 699a8c7c4bd3 ("leds: Add of_led_get() and led_put()"), introduced in
5.5, added of_led_get() and led_put() but missed a put_device() in
led_put(), thus creating a leak in case the consumer device is removed.

Device removal was arguably uncommon, so the leak apparently went
unnoticed until 2022. In January 2023 two different patches got merged to
fix the same bug:

 - commit da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
 - commit 445110941eb9 ("leds: led-class: Add missing put_device() to led_put()")

They fix the bug in two different ways, which creates no patch conflicts,
and both were merged in v6.2. The result is that there is now one more
put_device() than get_device(), instead of one fewer.
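
The balance described above can be checked with a small userspace model.
This is only an illustrative sketch, not kernel code: net_led_refs() is a
hypothetical helper, and it assumes the only reference taken during the
lookup is the one held on the device returned by
class_find_device_by_of_node().

```c
#include <stdbool.h>

/*
 * Userspace model only, not kernel code. Returns the net number of
 * references left on the LED device after a consumer has done one
 * of_led_get() + led_put() cycle.
 */
int net_led_refs(bool put_in_of_led_get, bool put_in_led_put)
{
	int refs = 0;

	refs += 1;		/* class_find_device_by_of_node() takes a reference */
	if (put_in_of_led_get)
		refs -= 1;	/* put_device() added by da1afe8e6099 */
	if (put_in_led_put)
		refs -= 1;	/* put_device() added by 445110941eb9 */

	return refs;
}
```

With neither fix the result is +1 (the original leak), with both fixes it
is -1 (the over-put described here), and with exactly one of them it is 0.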

Device removal is arguably still uncommon, so this apparently went
unnoticed as well until now. But it blew up here while I was working with
device tree overlay insertion and removal. The symptom is an apparently
unrelated list of oopses on device removal, with reasons:

  kernfs: can not remove 'uevent', no directory
  kernfs: can not remove 'brightness', no directory
  kernfs: can not remove 'max_brightness', no directory
  ...

Here sysfs fails to remove the attribute files because the device name
changed, and with it the sysfs path. The name changed because the device
name string got corrupted: it was freed too early and its memory reused.

Different symptoms could appear in different use cases.

Fix by removing one of the two fixes.

The choice was to remove commit da1afe8e6099 because:

 * it is calling put_device() inside of_led_get() just after getting the
   device, thus it is basically not refcounting the LED device at all
   during its entire lifetime
 * it does not add a corresponding put_device() in led_get(), so it fixes
   only the OF case

The other fix (445110941eb9) is adding the put_device() in led_put() so it
covers the entire lifetime, and it works even in the non-DT case.
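
The difference between the two fixes can be sketched with a toy userspace
model. Everything here is hypothetical (struct mock_dev, mock_get(),
mock_put(), alive_while_held()) and only mimics the counting, under the
assumption that the LED provider holds one reference of its own and drops
it while the consumer still holds the LED.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct device: a counter and a "freed" flag. */
struct mock_dev {
	int refcount;
	bool released;
};

static void mock_get(struct mock_dev *d)
{
	d->refcount++;
}

static void mock_put(struct mock_dev *d)
{
	if (--d->refcount == 0)
		d->released = true;	/* a real release callback would free memory */
}

/*
 * Returns true if the device is still alive while the consumer holds it,
 * after the provider has dropped its own reference.
 */
bool alive_while_held(bool early_put)
{
	struct mock_dev dev = { .refcount = 1, .released = false };

	mock_get(&dev);			/* lookup done in of_led_get() */
	if (early_put)
		mock_put(&dev);		/* da1afe8e6099 style: dropped right away */

	mock_put(&dev);			/* provider goes away first */

	bool alive = !dev.released;

	if (!early_put)
		mock_put(&dev);		/* 445110941eb9 style: dropped in led_put() */

	return alive;
}
```

In this model alive_while_held(true) is false: the consumer is left with a
pointer to a released device. alive_while_held(false) is true, because the
reference is held until led_put().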

Fixes: da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
Co-developed-by: Hervé Codina <herve.codina@bootlin.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
---
 drivers/leds/led-class.c | 1 -
 1 file changed, 1 deletion(-)


---
base-commit: 28ef3e64d0a22f6a29a1ea489293715a29623e52
change-id: 20240625-led-class-device-leak-6637a2821678

Best regards,

Comments

Herve Codina June 25, 2024, 8:07 a.m. UTC | #1
Hi Luca,

On Tue, 25 Jun 2024 09:26:52 +0200
Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:

> This reverts commit da1afe8e6099980fe1e2fd7436dca284af9d3f29.
> 
> Commit 699a8c7c4bd3 ("leds: Add of_led_get() and led_put()"), introduced in
> 5.5, added of_led_get() and led_put() but missed a put_device() in
> led_put(), thus creating a leak in case the consumer device is removed.
> 
> Device removal was arguably uncommon, so the leak apparently went
> unnoticed until 2022. In January 2023 two different patches got merged to
> fix the same bug:
> 
>  - commit da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
>  - commit 445110941eb9 ("leds: led-class: Add missing put_device() to led_put()")
> 
> They fix the bug in two different ways, which creates no patch conflicts,
> and both were merged in v6.2. The result is that there is now one more
> put_device() than get_device(), instead of one fewer.
> 
> Device removal is arguably still uncommon, so this apparently went
> unnoticed as well until now. But it blew up here while I was working with
> device tree overlay insertion and removal. The symptom is an apparently
> unrelated list of oopses on device removal, with reasons:
> 
>   kernfs: can not remove 'uevent', no directory
>   kernfs: can not remove 'brightness', no directory
>   kernfs: can not remove 'max_brightness', no directory
>   ...
> 
> Here sysfs fails to remove the attribute files because the device name
> changed, and with it the sysfs path. The name changed because the device
> name string got corrupted: it was freed too early and its memory reused.
> 
> Different symptoms could appear in different use cases.
> 
> Fix by removing one of the two fixes.
> 
> The choice was to remove commit da1afe8e6099 because:
> 
>  * it is calling put_device() inside of_led_get() just after getting the
>    device, thus it is basically not refcounting the LED device at all
>    during its entire lifetime
>  * it does not add a corresponding put_device() in led_get(), so it fixes
>    only the OF case
> 
> The other fix (445110941eb9) is adding the put_device() in led_put() so it
> covers the entire lifetime, and it works even in the non-DT case.
> 
> Fixes: da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
> Co-developed-by: Hervé Codina <herve.codina@bootlin.com>
> Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>

As there is a Co-developer, you have to add his/her Signed-off-by:
https://elixir.bootlin.com/linux/v6.10-rc5/source/Documentation/process/submitting-patches.rst#L494

So feel free to:
  a) Add Signed-off-by: Hervé Codina <herve.codina@bootlin.com>
or
  b) Remove Co-developed-by: Hervé Codina <herve.codina@bootlin.com>

Even though I participated in that fix, I will not be upset if you remove
the Co-developed-by :)

Best regards,
Hervé

Patch

diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index 24fcff682b24..b23d2138cd83 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -258,7 +258,6 @@  struct led_classdev *of_led_get(struct device_node *np, int index)
 
 	led_dev = class_find_device_by_of_node(&leds_class, led_node);
 	of_node_put(led_node);
-	put_device(led_dev);
 
 	return led_module_get(led_dev);
 }