Message ID | 20221202095455.83356-1-brgl@bgdev.pl |
---|---|
State | New |
Headers | show |
Series | [GIT,PULL] gpio: fixes for v6.1-rc8 | expand |
On Fri, Dec 2, 2022 at 10:54 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > Linus, > > Here's the last round of fixes for the upcoming release. The two resource leak > fixes are self-explanatory. The two character device commits need some more > backstory: > > I recently listened to Laurent Pinchart's talk from this year's LPC[1] where he > discussed an issue with many subsystems that export device nodes to user-space > where one can open the device file, unbind the underlying device from the > driver, then call any of the relevant system calls and observe the kernel crash > with a NULL-pointer derefence. > > I verified that the problem exists with the GPIO subsystem as well. The reason > for that is: when a GPIO chip is removed, we drop the chip's data from the > GPIO device and set the relevant pointer to NULL but don't check it in syscall > callbacks in the character device's code. That's fixed by the first patch. > > However that fix alone leaves the character device vulnerable to a race > condition in which the GPIO chip is removed when a syscall is in progress. > To avoid that, we have a second fix that uses a lock to protect the device's > structure from being disabled before all syscall callbacks returned. > > Laurent blamed the issue on devres but I believe the problem comes from the > fact that many driver developers are simply unaware that resources exported > to user-space need to survive the driver unbind and only be released when the > device's reference count goes down to 0. Devres' docs are pretty clear on that: > the resources get freed on driver unbind. Resources that should survive it, > must not be managed. This is however a topic for a separate discussion which > I intend to start soon. > > Please pull, > Bartosz Golaszewski > > [1] https://www.youtube.com/watch?v=kW8LHWlJPTU > > The following changes since commit b7b275e60bcd5f89771e865a8239325f86d9927d: > > Linux 6.1-rc7 (2022-11-27 13:31:48 -0800) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v6.1-rc8 > > for you to fetch changes up to 98d8b93c617139aeaf745c1573c02d86830f25d1: > > gpiolib: protect the GPIO device against being dropped while in use by user-space (2022-12-01 19:17:47 +0100) > > ---------------------------------------------------------------- > gpio fixes for v6.1-rc8 > > - fix a memory leak in gpiochip_setup_dev() in core gpiolib > - fix a reference leak in gpio-amd8111 > - fix a problem with user-space being able to trigger a NULL-pointer > dereference ovet the GPIO character device > > ---------------------------------------------------------------- > Bartosz Golaszewski (2): > gpiolib: cdev: fix NULL-pointer dereferences > gpiolib: protect the GPIO device against being dropped while in use by user-space > > Xiongfeng Wang (1): > gpio: amd8111: Fix PCI device reference count leak > > Zeng Heng (1): > gpiolib: fix memory leak in gpiochip_setup_dev() > > drivers/gpio/gpio-amd8111.c | 4 + > drivers/gpio/gpiolib-cdev.c | 193 +++++++++++++++++++++++++++++++++++++++----- > drivers/gpio/gpiolib.c | 46 +++++++---- > drivers/gpio/gpiolib.h | 5 ++ > 4 files changed, 210 insertions(+), 38 deletions(-) Linus, Please don't pull this just yet, there's a build issue that was reported to me with the last commit in that PR. I need to fix it and redo the PR. Bartosz
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Linus, Here's the last round of fixes for the upcoming release. The two resource leak fixes are self-explanatory. The two character device commits need some more backstory: I recently listened to Laurent Pinchart's talk from this year's LPC[1] where he discussed an issue with many subsystems that export device nodes to user-space where one can open the device file, unbind the underlying device from the driver, then call any of the relevant system calls and observe the kernel crash with a NULL-pointer derefence. I verified that the problem exists with the GPIO subsystem as well. The reason for that is: when a GPIO chip is removed, we drop the chip's data from the GPIO device and set the relevant pointer to NULL but don't check it in syscall callbacks in the character device's code. That's fixed by the first patch. However that fix alone leaves the character device vulnerable to a race condition in which the GPIO chip is removed when a syscall is in progress. To avoid that, we have a second fix that uses a lock to protect the device's structure from being disabled before all syscall callbacks returned. Laurent blamed the issue on devres but I believe the problem comes from the fact that many driver developers are simply unaware that resources exported to user-space need to survive the driver unbind and only be released when the device's reference count goes down to 0. Devres' docs are pretty clear on that: the resources get freed on driver unbind. Resources that should survive it, must not be managed. This is however a topic for a separate discussion which I intend to start soon. Please pull, Bartosz Golaszewski [1] https://www.youtube.com/watch?v=kW8LHWlJPTU The following changes since commit b7b275e60bcd5f89771e865a8239325f86d9927d: Linux 6.1-rc7 (2022-11-27 13:31:48 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v6.1-rc8 for you to fetch changes up to 98d8b93c617139aeaf745c1573c02d86830f25d1: gpiolib: protect the GPIO device against being dropped while in use by user-space (2022-12-01 19:17:47 +0100) ---------------------------------------------------------------- gpio fixes for v6.1-rc8 - fix a memory leak in gpiochip_setup_dev() in core gpiolib - fix a reference leak in gpio-amd8111 - fix a problem with user-space being able to trigger a NULL-pointer dereference ovet the GPIO character device ---------------------------------------------------------------- Bartosz Golaszewski (2): gpiolib: cdev: fix NULL-pointer dereferences gpiolib: protect the GPIO device against being dropped while in use by user-space Xiongfeng Wang (1): gpio: amd8111: Fix PCI device reference count leak Zeng Heng (1): gpiolib: fix memory leak in gpiochip_setup_dev() drivers/gpio/gpio-amd8111.c | 4 + drivers/gpio/gpiolib-cdev.c | 193 +++++++++++++++++++++++++++++++++++++++----- drivers/gpio/gpiolib.c | 46 +++++++---- drivers/gpio/gpiolib.h | 5 ++ 4 files changed, 210 insertions(+), 38 deletions(-)