mbox series

[v4,00/11] gpiolib: work towards removing gpiochip_find()

Message ID 20230927142931.19798-1-brgl@bgdev.pl
Headers show
Series gpiolib: work towards removing gpiochip_find() | expand

Message

Bartosz Golaszewski Sept. 27, 2023, 2:29 p.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

This is a reduced subset of patches from the initial sumbission[1]
limited only to changes inside GPIOLIB. Once this is upstream, we can
then slowly merge patches for other subsystems (like HTE) and then
eventually remove gpiochip_find() entirely.

The GPIO subsystem does not handle hot-unplug events very well. We have
recently patched the user-space part of it so that at least a rouge user
cannot crash the kernel but in-kernel users are still affected by a lot of
issues: from incorrect locking or lack thereof to using structures that are
private to GPIO drivers. Since almost all GPIO controllers can be unbound,
not to mention that we have USB devices registering GPIO expanders as well as
I2C-on-USB HID devices on which I2C GPIO expanders can live, various media
gadgets etc., we really need to make GPIO hotplug/unplug friendly.

Before we can even get to fixing the locking, we need to address a serious
abuse of the GPIO driver API - accessing struct gpio_chip by anyone who isn't
the driver owning this object. This structure is owned by the GPIO provider
and its lifetime is tied to that of that provider. It is destroyed when the
device is unregistered and this may happen at any moment. struct gpio_device
is the opaque, reference counted interface to struct gpio_chip (which is the
low-level implementation) and all access should pass through it.

The end-goal is to make all gpio_device manipulators check the existence of
gdev->chip and then lock it for the duration of any of the calls using SRCU.
Before we can get there, we need to first provide a set of functions that will
replace any gpio_chip functions and convert all in-kernel users.

This series adds several new helpers to the public GPIO API and uses
them across the core GPIO code.

Note that this does not make everything correct just yet. Especially the
GPIOLIB internal users release the reference returned by the lookup function
after getting the descriptor of interest but before requesting it. This will
eventually be addressed. This is not a regression either.

[1] https://lore.kernel.org/lkml/20230905185309.131295-1-brgl@bgdev.pl/T/

v3 -> v4:
- initialize managed pointers when declaring them
- drop unneeded casting
- collect more tags

v2 -> v3:
- use gpio_device_get_chip() consistently
- clarify comments
- fix buggy chip assignment
- check for PTR_ERR() in automatic cleanup
- rearrange code as requested by Andy

v1 -> v2:
- drop all non-GPIOLIB patches
- collect tags
- fix kernel docs

Bartosz Golaszewski (11):
  gpiolib: make gpio_device_get() and gpio_device_put() public
  gpiolib: add support for scope-based management to gpio_device
  gpiolib: provide gpio_device_find()
  gpiolib: provide gpio_device_find_by_label()
  gpiolib: provide gpio_device_get_desc()
  gpiolib: reluctantly provide gpio_device_get_chip()
  gpiolib: replace find_chip_by_name() with gpio_device_find_by_label()
  gpio: of: replace gpiochip_find_* with gpio_device_find_*
  gpio: acpi: replace gpiochip_find() with gpio_device_find()
  gpio: swnode: replace gpiochip_find() with gpio_device_find_by_label()
  gpio: sysfs: drop the mention of gpiochip_find() from sysfs code

 drivers/gpio/gpiolib-acpi.c   |  12 +-
 drivers/gpio/gpiolib-of.c     |  33 +++---
 drivers/gpio/gpiolib-swnode.c |  33 +++---
 drivers/gpio/gpiolib-sysfs.c  |   2 +-
 drivers/gpio/gpiolib.c        | 202 ++++++++++++++++++++++++++--------
 drivers/gpio/gpiolib.h        |  10 --
 include/linux/gpio/driver.h   |  16 +++
 7 files changed, 215 insertions(+), 93 deletions(-)

Comments

Andy Shevchenko Oct. 2, 2023, 9:39 a.m. UTC | #1
On Wed, Sep 27, 2023 at 04:29:21PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> In order to start migrating away from accessing struct gpio_chip by
> users other than their owners, let's first make the reference management
> functions for the opaque struct gpio_device public in the driver.h
> header.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Andy Shevchenko Oct. 2, 2023, 9:42 a.m. UTC | #2
On Wed, Sep 27, 2023 at 04:29:23PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> gpiochip_find() is wrong and its kernel doc is misleading as the
> function doesn't return a reference to the gpio_chip but just a raw
> pointer. The chip itself is not guaranteed to stay alive, in fact it can
> be deleted at any point. Also: other than GPIO drivers themselves,
> nobody else has any business accessing gpio_chip structs.
> 
> Provide a new gpio_device_find() function that returns a real reference
> to the opaque gpio_device structure that is guaranteed to stay alive for
> as long as there are active users of it.

...

>  struct gpio_chip *gpiochip_find(void *data,
>  				int (*match)(struct gpio_chip *gc,

> +struct gpio_device *gpio_device_find(void *data,
> +				     int (*match)(struct gpio_chip *gc,
> +						  void *data))

Why not

typedef int (*gpio_chip_match_fn)(struct gpio_chip *gc, void *data);

?
Andy Shevchenko Oct. 2, 2023, 9:46 a.m. UTC | #3
On Wed, Sep 27, 2023 at 04:29:25PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Getting the GPIO descriptor directly from the gpio_chip struct is
> dangerous as we don't take the reference to the underlying GPIO device.
> In order to start working towards removing gpiochip_get_desc(), let's
> provide a safer variant that works with an existing reference to struct
> gpio_device.

...

> +EXPORT_SYMBOL_GPL(gpiochip_get_desc);

> +struct gpio_desc *
> +gpio_device_get_desc(struct gpio_device *gdev, unsigned int hwnum)

I'm wondering if you move this to be upper than gpiochip_get_desc() and
diff will look better...
Bartosz Golaszewski Oct. 2, 2023, 9:52 a.m. UTC | #4
On Mon, Oct 2, 2023 at 11:42 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Wed, Sep 27, 2023 at 04:29:23PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > gpiochip_find() is wrong and its kernel doc is misleading as the
> > function doesn't return a reference to the gpio_chip but just a raw
> > pointer. The chip itself is not guaranteed to stay alive, in fact it can
> > be deleted at any point. Also: other than GPIO drivers themselves,
> > nobody else has any business accessing gpio_chip structs.
> >
> > Provide a new gpio_device_find() function that returns a real reference
> > to the opaque gpio_device structure that is guaranteed to stay alive for
> > as long as there are active users of it.
>
> ...
>
> >  struct gpio_chip *gpiochip_find(void *data,
> >                               int (*match)(struct gpio_chip *gc,
>
> > +struct gpio_device *gpio_device_find(void *data,
> > +                                  int (*match)(struct gpio_chip *gc,
> > +                                               void *data))
>
> Why not
>
> typedef int (*gpio_chip_match_fn)(struct gpio_chip *gc, void *data);
>

Because gpiochip_find() will go away as soon as we convert all users.

Bart

> ?
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
Bartosz Golaszewski Oct. 2, 2023, 9:54 a.m. UTC | #5
On Mon, Oct 2, 2023 at 11:46 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Wed, Sep 27, 2023 at 04:29:25PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Getting the GPIO descriptor directly from the gpio_chip struct is
> > dangerous as we don't take the reference to the underlying GPIO device.
> > In order to start working towards removing gpiochip_get_desc(), let's
> > provide a safer variant that works with an existing reference to struct
> > gpio_device.
>
> ...
>
> > +EXPORT_SYMBOL_GPL(gpiochip_get_desc);
>
> > +struct gpio_desc *
> > +gpio_device_get_desc(struct gpio_device *gdev, unsigned int hwnum)
>
> I'm wondering if you move this to be upper than gpiochip_get_desc() and
> diff will look better...
>

There's a limit to bikeshedding in my book and "making the diff look
better" is definitely it. :)

Bart
Andy Shevchenko Oct. 2, 2023, 9:57 a.m. UTC | #6
On Mon, Oct 02, 2023 at 11:52:52AM +0200, Bartosz Golaszewski wrote:
> On Mon, Oct 2, 2023 at 11:42 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Wed, Sep 27, 2023 at 04:29:23PM +0200, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

...

> > >  struct gpio_chip *gpiochip_find(void *data,
> > >                               int (*match)(struct gpio_chip *gc,
> >
> > > +struct gpio_device *gpio_device_find(void *data,
> > > +                                  int (*match)(struct gpio_chip *gc,
> > > +                                               void *data))
> >
> > Why not
> >
> > typedef int (*gpio_chip_match_fn)(struct gpio_chip *gc, void *data);
> 
> Because gpiochip_find() will go away as soon as we convert all users.

And gpio_device_find() does not. So, I didn't get this argument.
Andy Shevchenko Oct. 2, 2023, 9:58 a.m. UTC | #7
On Mon, Oct 02, 2023 at 11:54:40AM +0200, Bartosz Golaszewski wrote:
> On Mon, Oct 2, 2023 at 11:46 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Wed, Sep 27, 2023 at 04:29:25PM +0200, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >
> > > Getting the GPIO descriptor directly from the gpio_chip struct is
> > > dangerous as we don't take the reference to the underlying GPIO device.
> > > In order to start working towards removing gpiochip_get_desc(), let's
> > > provide a safer variant that works with an existing reference to struct
> > > gpio_device.

...

> > > +EXPORT_SYMBOL_GPL(gpiochip_get_desc);
> >
> > > +struct gpio_desc *
> > > +gpio_device_get_desc(struct gpio_device *gdev, unsigned int hwnum)
> >
> > I'm wondering if you move this to be upper than gpiochip_get_desc() and
> > diff will look better...
> 
> There's a limit to bikeshedding in my book and "making the diff look
> better" is definitely it. :)

Right, but if you are going to send a new version it might makes sense
to try, no?
Bartosz Golaszewski Oct. 2, 2023, 9:59 a.m. UTC | #8
On Mon, Oct 2, 2023 at 11:57 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Mon, Oct 02, 2023 at 11:52:52AM +0200, Bartosz Golaszewski wrote:
> > On Mon, Oct 2, 2023 at 11:42 AM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > > On Wed, Sep 27, 2023 at 04:29:23PM +0200, Bartosz Golaszewski wrote:
> > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> ...
>
> > > >  struct gpio_chip *gpiochip_find(void *data,
> > > >                               int (*match)(struct gpio_chip *gc,
> > >
> > > > +struct gpio_device *gpio_device_find(void *data,
> > > > +                                  int (*match)(struct gpio_chip *gc,
> > > > +                                               void *data))
> > >
> > > Why not
> > >
> > > typedef int (*gpio_chip_match_fn)(struct gpio_chip *gc, void *data);
> >
> > Because gpiochip_find() will go away as soon as we convert all users.
>
> And gpio_device_find() does not. So, I didn't get this argument.
>

This symbol would only be used in a single place. But whatever, I can
do it as there will be another respin anyway.

Bart
Bartosz Golaszewski Oct. 4, 2023, 11:58 a.m. UTC | #9
On Wed, Sep 27, 2023 at 4:29 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> This is a reduced subset of patches from the initial sumbission[1]
> limited only to changes inside GPIOLIB. Once this is upstream, we can
> then slowly merge patches for other subsystems (like HTE) and then
> eventually remove gpiochip_find() entirely.
>
> The GPIO subsystem does not handle hot-unplug events very well. We have
> recently patched the user-space part of it so that at least a rouge user
> cannot crash the kernel but in-kernel users are still affected by a lot of
> issues: from incorrect locking or lack thereof to using structures that are
> private to GPIO drivers. Since almost all GPIO controllers can be unbound,
> not to mention that we have USB devices registering GPIO expanders as well as
> I2C-on-USB HID devices on which I2C GPIO expanders can live, various media
> gadgets etc., we really need to make GPIO hotplug/unplug friendly.
>
> Before we can even get to fixing the locking, we need to address a serious
> abuse of the GPIO driver API - accessing struct gpio_chip by anyone who isn't
> the driver owning this object. This structure is owned by the GPIO provider
> and its lifetime is tied to that of that provider. It is destroyed when the
> device is unregistered and this may happen at any moment. struct gpio_device
> is the opaque, reference counted interface to struct gpio_chip (which is the
> low-level implementation) and all access should pass through it.
>
> The end-goal is to make all gpio_device manipulators check the existence of
> gdev->chip and then lock it for the duration of any of the calls using SRCU.
> Before we can get there, we need to first provide a set of functions that will
> replace any gpio_chip functions and convert all in-kernel users.
>
> This series adds several new helpers to the public GPIO API and uses
> them across the core GPIO code.
>
> Note that this does not make everything correct just yet. Especially the
> GPIOLIB internal users release the reference returned by the lookup function
> after getting the descriptor of interest but before requesting it. This will
> eventually be addressed. This is not a regression either.
>
> [1] https://lore.kernel.org/lkml/20230905185309.131295-1-brgl@bgdev.pl/T/
>
> v3 -> v4:
> - initialize managed pointers when declaring them
> - drop unneeded casting
> - collect more tags
>
> v2 -> v3:
> - use gpio_device_get_chip() consistently
> - clarify comments
> - fix buggy chip assignment
> - check for PTR_ERR() in automatic cleanup
> - rearrange code as requested by Andy
>
> v1 -> v2:
> - drop all non-GPIOLIB patches
> - collect tags
> - fix kernel docs
>
> Bartosz Golaszewski (11):
>   gpiolib: make gpio_device_get() and gpio_device_put() public
>   gpiolib: add support for scope-based management to gpio_device
>   gpiolib: provide gpio_device_find()
>   gpiolib: provide gpio_device_find_by_label()
>   gpiolib: provide gpio_device_get_desc()
>   gpiolib: reluctantly provide gpio_device_get_chip()
>   gpiolib: replace find_chip_by_name() with gpio_device_find_by_label()
>   gpio: of: replace gpiochip_find_* with gpio_device_find_*
>   gpio: acpi: replace gpiochip_find() with gpio_device_find()
>   gpio: swnode: replace gpiochip_find() with gpio_device_find_by_label()
>   gpio: sysfs: drop the mention of gpiochip_find() from sysfs code
>
>  drivers/gpio/gpiolib-acpi.c   |  12 +-
>  drivers/gpio/gpiolib-of.c     |  33 +++---
>  drivers/gpio/gpiolib-swnode.c |  33 +++---
>  drivers/gpio/gpiolib-sysfs.c  |   2 +-
>  drivers/gpio/gpiolib.c        | 202 ++++++++++++++++++++++++++--------
>  drivers/gpio/gpiolib.h        |  10 --
>  include/linux/gpio/driver.h   |  16 +++
>  7 files changed, 215 insertions(+), 93 deletions(-)
>
> --
> 2.39.2
>

I queued this series in this form. Other than the constness of the
data pointer passed to gpio_device_find() (which - as explained under
the relevant patch - should remain non constant) Andy only had two
cosmetic issues with some patches which I'm choosing to leave out.

Let's give it some time in next before the merge window and hopefully
get the rest of the gpiochip_find() removal done before it.

Bart