diff mbox series

gpiolib: Avoid side effects in gpio_is_visible()

Message ID 20230512042806.3438373-1-chris.packham@alliedtelesis.co.nz
State Superseded
Headers show
Series gpiolib: Avoid side effects in gpio_is_visible() | expand

Commit Message

Chris Packham May 12, 2023, 4:28 a.m. UTC
Calling gpiod_to_irq() creates an irq_desc for the GPIO. This is not
something that should happen when just exporting the GPIO via sysfs. The
effect of this was observed as triggering a warning in
gpiochip_disable_irq() when kexec()ing after exporting a GPIO.

Remove the call to gpiod_to_irq() from gpio_is_visible(). The actual
intended creation of the irq_desc comes via edge_store() when requested
by the user.

Fixes: ebbeba120ab2 ("gpio: sysfs: fix gpio attribute-creation race")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
---

Notes:
    This is technically a v2 of
    https://lore.kernel.org/lkml/20230510001151.3946931-1-chris.packham@alliedtelesis.co.nz/
    but the solution is so different it's probably best to treat it as a new
    patch.

 drivers/gpio/gpiolib-sysfs.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Johan Hovold May 12, 2023, 7:24 a.m. UTC | #1
On Fri, May 12, 2023 at 04:28:06PM +1200, Chris Packham wrote:
> Calling gpiod_to_irq() creates an irq_desc for the GPIO. This is not
> something that should happen when just exporting the GPIO via sysfs. The
> effect of this was observed as triggering a warning in
> gpiochip_disable_irq() when kexec()ing after exporting a GPIO.

You need a better explanation as to why this is an issue. What does the
warning look like for example?

> Remove the call to gpiod_to_irq() from gpio_is_visible(). The actual
> intended creation of the irq_desc comes via edge_store() when requested
> by the user.

And why does that avoid whatever problem you're seeing?

> Fixes: ebbeba120ab2 ("gpio: sysfs: fix gpio attribute-creation race")

This is clearly not the right Fixes tag. The above commit only moved the
creation of the attribute to before registering the sysfs device and
specifically gpiod_to_irq() was used also prior to that commit.

As a matter of fact, back then there was no call to
irq_create_mapping() in gpiod_to_irq() either. That was added years
later by commit

	dc749a09ea5e ("gpiolib: allow gpio irqchip to map irqs dynamically")

> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> ---
> 
> Notes:
>     This is technically a v2 of
>     https://lore.kernel.org/lkml/20230510001151.3946931-1-chris.packham@alliedtelesis.co.nz/
>     but the solution is so different it's probably best to treat it as a new
>     patch.
> 
>  drivers/gpio/gpiolib-sysfs.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
> index 530dfd19d7b5..f859dcd1cbf3 100644
> --- a/drivers/gpio/gpiolib-sysfs.c
> +++ b/drivers/gpio/gpiolib-sysfs.c
> @@ -362,8 +362,6 @@ static umode_t gpio_is_visible(struct kobject *kobj, struct attribute *attr,
>  		if (!show_direction)
>  			mode = 0;
>  	} else if (attr == &dev_attr_edge.attr) {
> -		if (gpiod_to_irq(desc) < 0)
> -			mode = 0;
>  		if (!show_direction && test_bit(FLAG_IS_OUT, &desc->flags))
>  			mode = 0;
>  	}

Johan
Chris Packham May 14, 2023, 10:27 p.m. UTC | #2
On 12/05/23 19:56, Linus Walleij wrote:
> On Fri, May 12, 2023 at 6:28 AM Chris Packham
> <chris.packham@alliedtelesis.co.nz> wrote:
>
>> Calling gpiod_to_irq() creates an irq_desc for the GPIO.
> Normally gpiod_to_irq() should not have side effects, it's just
> a helper function that is there to translate a descriptor to the
> corresponding IRQ number.
>
>> This is not
>> something that should happen when just exporting the GPIO via sysfs. The
>> effect of this was observed as triggering a warning in
>> gpiochip_disable_irq() when kexec()ing after exporting a GPIO.

For clarity this is the problem I see on kexec

WARNING: CPU: 1 PID: 235 at drivers/gpio/gpiolib.c:3440 
gpiochip_disable_irq+0x64/0x6c
Call trace:
  gpiochip_disable_irq+0x64/0x6c
  machine_crash_shutdown+0xac/0x114
  __crash_kexec+0x74/0x154
  panic+0x300/0x370
  sysrq_reset_seq_param_set+0x0/0x8c
  __handle_sysrq+0xb8/0x1b8
  write_sysrq_trigger+0x74/0xa0
  proc_reg_write+0x9c/0xf0
  vfs_write+0xc4/0x298
  ksys_write+0x68/0xf4
  __arm64_sys_write+0x1c/0x28
  invoke_syscall+0x48/0x114
  el0_svc_common.constprop.0+0x44/0xf4
  do_el0_svc+0x3c/0xa8
  el0_svc+0x2c/0x84
  el0t_64_sync_handler+0xbc/0x138
  el0t_64_sync+0x190/0x194

>> Remove the call to gpiod_to_irq() from gpio_is_visible(). The actual
>> intended creation of the irq_desc comes via edge_store() when requested
>> by the user.
>>
>> Fixes: ebbeba120ab2 ("gpio: sysfs: fix gpio attribute-creation race")
>> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> I have a hard time understanding this fix.

The problem (as I far as I've been able to determine). Is that 
gpio_is_visible() uses gpiod_to_irq() to decide whether to provide the 
"edge" attribute. The problem is that instead of just looking up the 
irq_desc from the GPIO pin gpiod_to_irq() actually creates the irq_desc 
(via  gpiochip_to_irq()).

Because gpio_sysfs_request_irq() calls gpiochip_to_irq() anyway to 
create the irq_desc I thought removing gpiochip_to_irq() from 
gpio_is_visible() should be safe.

>
> The problem is rather this see gpiolib.c:
>
> int gpiod_to_irq(const struct gpio_desc *desc)
> {
>          struct gpio_chip *gc;
>          int offset;
>
>          /*
>           * Cannot VALIDATE_DESC() here as gpiod_to_irq() consumer semantics
>           * requires this function to not return zero on an invalid descriptor
>           * but rather a negative error number.
>           */
>          if (!desc || IS_ERR(desc) || !desc->gdev || !desc->gdev->chip)
>                  return -EINVAL;
>
>          gc = desc->gdev->chip;
>          offset = gpio_chip_hwgpio(desc);
>          if (gc->to_irq) {
>                  int retirq = gc->to_irq(gc, offset);
>
> Here gc->to_irq() is called unconditionally.
>
> Since this is using gpiolib_irqchip this ->to_irq() will be
> gpiochip_to_irq() and that finally ends in the call:
>
> return irq_create_mapping(domain, offset);
>
> which seems to be the problem here. Why is this a problem?
> The IRQ mappings are dynamic, meaning they are created
> on-demand, but for this hardware it should be fine to essentially
> just call irq_create_mapping() on all of them as the device
> is created, we just don't do it in order to save space.

In my original case which is a kernel module that exports a GPIO for 
userspace using gpiod_export() it's inappropriate because the GPIO in 
question is configured as an output but gpio_is_visible() still causes 
an irq_desc to be created even though the very next line of code knows 
that it should not make the edge attribute visible.

The manually exporting things via sysfs case is a bit more murky because 
it's not known if the GPIO is an input or output so the code must assume 
both are possible (obviously this is one thing libgpio fixes). I still 
think creating the irq_desc on export is wrong, it seems unnecessary as 
it'll happen if an interrupt is actually requested via edge_store().

> I don't understand why calling irq_create_mapping(domain, offset);
> creates a problem? It's just a mapping between a GPIO and the
> corresponding IRQ. What am I missing here?

The crux of the problem is that the irq_desc is created when it hasn't 
been requested. In some cases we know the GPIO pin is an output so we 
could avoid it, in others we could delay the creation until an interrupt 
is actually requested (which is what I'm attempting to do).

>
> Yours,
> Linus Walleij
Chris Packham May 15, 2023, 9:01 p.m. UTC | #3
On 15/05/23 18:43, andy.shevchenko@gmail.com wrote:
> Sun, May 14, 2023 at 09:57:58PM +0000, Chris Packham kirjoitti:
>> On 12/05/23 19:24, Johan Hovold wrote:
>>> On Fri, May 12, 2023 at 04:28:06PM +1200, Chris Packham wrote:
> ...
>
>>> You need a better explanation as to why this is an issue. What does the
>>> warning look like for example?
>> Ironically I had that in my first attempt to address the issue but was
>> told it was too much detail. So now I've gone too far the other way.
>> I'll include it in the response I'm about to send to LinusW.
> You have been (implicitly) told to reduce the scope of the details to have
> the only important ones, removing the traceback completely wasn't on the
> table.
>
> Citation: "Besides the very noisy traceback in the commit message (read
> https://kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages"

Yes fair point. I just over compensated an thought the explanation of 
warning in gpiochip_disable_irq() was sufficient.
Kent Gibson May 16, 2023, 10:47 p.m. UTC | #4
On Tue, May 16, 2023 at 10:19:14PM +0000, Chris Packham wrote:
> 
> On 17/05/23 01:57, Linus Walleij wrote:
> > On Mon, May 15, 2023 at 12:27 AM Chris Packham
> > <Chris.Packham@alliedtelesis.co.nz> wrote:
> >
> >> In my original case which is a kernel module that exports a GPIO for
> >> userspace using gpiod_export()
> > We should not add new users for that API as it increase the usage
> > of the sysfs ABI but if it's an existing in-tree usecase I buy it.
> >
> >> The crux of the problem is that the irq_desc is created when it hasn't
> >> been requested.
> > The right solution to me seems to be to not use gpiod_export()
> > and not use sysfs TBH.
> 
> That's not really a feasible solution. I'm dealing with application code 
> that has been happily using the sysfs interface for many years.
> 
> I actually did look at getting that code updated to use libgpio earlier 
> this year but found the API was in a state of flux and I wasn't going to 
> recommend re-writing the application code to use libgpio if I knew the 
> API was going to change and we'd have to re-write it again.
> 

Its 'libgpiod'.

> Even now with the 2.0.1 libgpio there doesn't seem to be a way of asking 
> about just GPIO lines in the system, application code would still need 
> to open every /dev/gpiochipN device and ask what lines are on the chip 
> and keep the fds open for the chips that have lines the application 
> cares about but make sure to close the fd for the ones that don't. So 
> now the application code has to care about GPIO chips in addition to the 
> GPIO lines.
> 

Trying to better understand your use case - how does your application
identify lines of interest - just whatever lines pop up in
/sys/class/gpio?

Cheers,
Kent.
Chris Packham May 16, 2023, 11:50 p.m. UTC | #5
Hi Kent,

On 17/05/23 10:47, Kent Gibson wrote:
> On Tue, May 16, 2023 at 10:19:14PM +0000, Chris Packham wrote:
>> On 17/05/23 01:57, Linus Walleij wrote:
>>> On Mon, May 15, 2023 at 12:27 AM Chris Packham
>>> <Chris.Packham@alliedtelesis.co.nz> wrote:
>>>
>>>> In my original case which is a kernel module that exports a GPIO for
>>>> userspace using gpiod_export()
>>> We should not add new users for that API as it increase the usage
>>> of the sysfs ABI but if it's an existing in-tree usecase I buy it.
>>>
>>>> The crux of the problem is that the irq_desc is created when it hasn't
>>>> been requested.
>>> The right solution to me seems to be to not use gpiod_export()
>>> and not use sysfs TBH.
>> That's not really a feasible solution. I'm dealing with application code
>> that has been happily using the sysfs interface for many years.
>>
>> I actually did look at getting that code updated to use libgpio earlier
>> this year but found the API was in a state of flux and I wasn't going to
>> recommend re-writing the application code to use libgpio if I knew the
>> API was going to change and we'd have to re-write it again.
>>
> Its 'libgpiod'.
>
>> Even now with the 2.0.1 libgpio there doesn't seem to be a way of asking
>> about just GPIO lines in the system, application code would still need
>> to open every /dev/gpiochipN device and ask what lines are on the chip
>> and keep the fds open for the chips that have lines the application
>> cares about but make sure to close the fd for the ones that don't. So
>> now the application code has to care about GPIO chips in addition to the
>> GPIO lines.
>>
> Trying to better understand your use case - how does your application
> identify lines of interest - just whatever lines pop up in
> /sys/class/gpio?

Thanks for taking an interest. We actually have 2 applications that make 
use of this functionality

The first is a userspace driver for a Power Over Ethernet Controller+PSE 
chipset (I'll refer to this as an MCU since the thing we talk to is 
really a micro controller with a vendor supplied firmware on it that 
does most of the PoE stuff). Communication to the MCU is based around 
commands sent via i2c. But there are a few extra GPIOs that are used to 
reset the MCU as well as provide a mechanism for quickly dropping power 
on certain events (e.g. if the temperature monitoring detects a problem).

We do have a small kernel module that grabs the GPIOs based on the 
device tree and exports them with a known names (e.g. "poe-reset", 
"poe-dis") that the userspace driver can use. Back when that code was 
written we did consider not exporting the GPIOs and instead having some 
other sysfs/ioctl interface into this kernel module but that seemed more 
work than just calling gpiod_export() for little gain. This is where 
adding the gpio-names property in our .dts would allow libgpiod to do 
something similar.

Having the GPIOs in sysfs is also convenient as we can have a systemd 
ExecStopPost script that can drop power and/or reset the MCU if our 
application crashes. I'm not sure if the GPIO chardev interface deals 
with releasing the GPIO lines if the process that requested them exits 
abnormally (I assume it does) and obviously our ExecStopPost script 
would need updating to use some of the libgpiod applications to do what 
it currently does with a simple 'echo 1 >.../poe-reset'

The second application is a userspace driver for a L3 network switch 
(actually two of them for different silicon vendors). Again this needs 
to deal with resets for PHYs connected to the switch that the kernel has 
no visibility of as well as the GPIOs for the SFP cages. Again we have a 
slightly less simple kernel module that grabs all these GPIOs and 
exports them with known names. This time there are considerably more of 
these GPIOs (our largest system currently has 96 SFP+ ports with 4 GPIOs 
per port) so we're much more reliant on being able to do things like 
`for x in port*tx-dis; do echo 1 >$x; done`

I'm sure both of these applications could be re-written around libgpiod 
but that would incur a significant amount of regression testing on 
existing platforms. And I still consider dealing with GPIO chips an 
extra headache that the applications don't need (particularly with the 
sheer number of them the SFP case).

>
> Cheers,
> Kent.
Chris Packham May 17, 2023, 9:30 p.m. UTC | #6
On 17/05/23 20:54, Andy Shevchenko wrote:
> On Wed, May 17, 2023 at 2:50 AM Chris Packham
> <Chris.Packham@alliedtelesis.co.nz> wrote:
>> On 17/05/23 10:47, Kent Gibson wrote:
> ...
>
>> The first is a userspace driver for a Power Over Ethernet Controller+PSE
>> chipset (I'll refer to this as an MCU since the thing we talk to is
>> really a micro controller with a vendor supplied firmware on it that
>> does most of the PoE stuff). Communication to the MCU is based around
>> commands sent via i2c. But there are a few extra GPIOs that are used to
>> reset the MCU as well as provide a mechanism for quickly dropping power
>> on certain events (e.g. if the temperature monitoring detects a problem).
> Why does the MCU have no in-kernel driver?

There isn't any PoE PSE infrastructure in the kernel. I'm not really 
sure what it'd look like either as the hardware designs are all highly 
customized and often have very specialized requirements. Even the vendor 
reference boards tend to use the i2c userspace interface and punt 
everything to a specialist application.

Of course if anyone is thinking about adding PoE PSE support in-kernel 
I'd be very keen to be involved.

>> We do have a small kernel module that grabs the GPIOs based on the
>> device tree and exports them with a known names (e.g. "poe-reset",
>> "poe-dis") that the userspace driver can use.
> So, besides that you repeat gpio-aggregator functionality, you already
> have a "proxy" driver in the kernel. What prevents you from doing more
> in-kernel?

Yes true. The main issue is that without total support for the class of 
device in the kernel there's little more that you can do other than 
exposing gpios (either as gpio_export_link() or some other bespoke 
interface).

>>   Back when that code was
>> written we did consider not exporting the GPIOs and instead having some
>> other sysfs/ioctl interface into this kernel module but that seemed more
>> work than just calling gpiod_export() for little gain. This is where
>> adding the gpio-names property in our .dts would allow libgpiod to do
>> something similar.
>>
>> Having the GPIOs in sysfs is also convenient as we can have a systemd
>> ExecStopPost script that can drop power and/or reset the MCU if our
>> application crashes.
> I'm a bit lost. What your app is doing and how that is related to the
> (userspace) drivers?

Probably one of the primary things it's doing is bringing the chip out 
of reset by driving the GPIO (we don't want the PoE PSE supplying power 
if nothing is monitoring the temperature of the system). There's also 
some corner cases involving not resetting the PoE chipset on a hot restart.

>
>> I'm not sure if the GPIO chardev interface deals
>> with releasing the GPIO lines if the process that requested them exits
>> abnormally (I assume it does) and obviously our ExecStopPost script
>> would need updating to use some of the libgpiod applications to do what
>> it currently does with a simple 'echo 1 >.../poe-reset'
>>
>> The second application is a userspace driver for a L3 network switch
>> (actually two of them for different silicon vendors). Again this needs
>> to deal with resets for PHYs connected to the switch that the kernel has
>> no visibility of as well as the GPIOs for the SFP cages. Again we have a
>> slightly less simple kernel module that grabs all these GPIOs and
>> exports them with known names. This time there are considerably more of
>> these GPIOs (our largest system currently has 96 SFP+ ports with 4 GPIOs
>> per port) so we're much more reliant on being able to do things like
>> `for x in port*tx-dis; do echo 1 >$x; done`
> Hmm... Have you talked to the net subsystem guys? I know that there is
> a lot going on around SFP cage enumeration for some of the modules
> (Marvell?) and perhaps they can advise something different.

Yes I'm aware of the switchdev work and I'm very enthusiastic about it 
(as an aside I do have a fairly functional switchdev driver for some of 
the older Marvell Poncat2 silicon, never quite got to submitting it 
upstream before we ran out of time on the project).

Again the problem boils down to the fact that we have a userspace switch 
driver (which uses a vendor supplied non-free SDK). So despite the 
kernel having quite good support for SFPs I can't use it without a 
netdev to attach it to.

>> I'm sure both of these applications could be re-written around libgpiod
>> but that would incur a significant amount of regression testing on
>> existing platforms. And I still consider dealing with GPIO chips an
>> extra headache that the applications don't need (particularly with the
>> sheer number of them the SFP case).
> It seems to me that having no in-kernel driver for your stuff is the
> main point of all headache here. But I might be mistaken.

It certainly doesn't help, but I do think that is all orthogonal to the 
fact that gpio_is_visible() changes things rather than just determining 
if an attribute should be exported or not.
Andy Shevchenko May 23, 2023, 4:38 p.m. UTC | #7
Wed, May 17, 2023 at 09:30:51PM +0000, Chris Packham kirjoitti:
> On 17/05/23 20:54, Andy Shevchenko wrote:
> > On Wed, May 17, 2023 at 2:50 AM Chris Packham
> > <Chris.Packham@alliedtelesis.co.nz> wrote:
> >> On 17/05/23 10:47, Kent Gibson wrote:

...

> >> The first is a userspace driver for a Power Over Ethernet Controller+PSE
> >> chipset (I'll refer to this as an MCU since the thing we talk to is
> >> really a micro controller with a vendor supplied firmware on it that
> >> does most of the PoE stuff). Communication to the MCU is based around
> >> commands sent via i2c. But there are a few extra GPIOs that are used to
> >> reset the MCU as well as provide a mechanism for quickly dropping power
> >> on certain events (e.g. if the temperature monitoring detects a problem).
> > Why does the MCU have no in-kernel driver?
> 
> There isn't any PoE PSE infrastructure in the kernel. I'm not really 
> sure what it'd look like either as the hardware designs are all highly 
> customized and often have very specialized requirements. Even the vendor 
> reference boards tend to use the i2c userspace interface and punt 
> everything to a specialist application.
> 
> Of course if anyone is thinking about adding PoE PSE support in-kernel 
> I'd be very keen to be involved.

But what do net subsystem guys know about this? Have you had a chance
to ask them?

> >> We do have a small kernel module that grabs the GPIOs based on the
> >> device tree and exports them with a known names (e.g. "poe-reset",
> >> "poe-dis") that the userspace driver can use.
> > So, besides that you repeat gpio-aggregator functionality, you already
> > have a "proxy" driver in the kernel. What prevents you from doing more
> > in-kernel?
> 
> Yes true. The main issue is that without total support for the class of 
> device in the kernel there's little more that you can do other than 
> exposing gpios (either as gpio_export_link() or some other bespoke 
> interface).
> 
> >>   Back when that code was
> >> written we did consider not exporting the GPIOs and instead having some
> >> other sysfs/ioctl interface into this kernel module but that seemed more
> >> work than just calling gpiod_export() for little gain. This is where
> >> adding the gpio-names property in our .dts would allow libgpiod to do
> >> something similar.
> >>
> >> Having the GPIOs in sysfs is also convenient as we can have a systemd
> >> ExecStopPost script that can drop power and/or reset the MCU if our
> >> application crashes.
> > I'm a bit lost. What your app is doing and how that is related to the
> > (userspace) drivers?
> 
> Probably one of the primary things it's doing is bringing the chip out 
> of reset by driving the GPIO (we don't want the PoE PSE supplying power 
> if nothing is monitoring the temperature of the system). There's also 
> some corner cases involving not resetting the PoE chipset on a hot restart.

So, do I understand correct the following?
There is a PoE PSE which has a proprietary user space driver and to make it
work reliably we have to add a quirk which utilizes the GPIO sysfs?

> >> I'm not sure if the GPIO chardev interface deals
> >> with releasing the GPIO lines if the process that requested them exits
> >> abnormally (I assume it does) and obviously our ExecStopPost script
> >> would need updating to use some of the libgpiod applications to do what
> >> it currently does with a simple 'echo 1 >.../poe-reset'
> >>
> >> The second application is a userspace driver for a L3 network switch
> >> (actually two of them for different silicon vendors). Again this needs
> >> to deal with resets for PHYs connected to the switch that the kernel has
> >> no visibility of as well as the GPIOs for the SFP cages. Again we have a
> >> slightly less simple kernel module that grabs all these GPIOs and
> >> exports them with known names. This time there are considerably more of
> >> these GPIOs (our largest system currently has 96 SFP+ ports with 4 GPIOs
> >> per port) so we're much more reliant on being able to do things like
> >> `for x in port*tx-dis; do echo 1 >$x; done`
> > Hmm... Have you talked to the net subsystem guys? I know that there is
> > a lot going on around SFP cage enumeration for some of the modules
> > (Marvell?) and perhaps they can advise something different.
> 
> Yes I'm aware of the switchdev work and I'm very enthusiastic about it 
> (as an aside I do have a fairly functional switchdev driver for some of 
> the older Marvell Poncat2 silicon, never quite got to submitting it 
> upstream before we ran out of time on the project).
> 
> Again the problem boils down to the fact that we have a userspace switch 
> driver (which uses a vendor supplied non-free SDK). So despite the 
> kernel having quite good support for SFPs I can't use it without a 
> netdev to attach it to.

That user space driver is using what from the kernel? GPIO sysfs?

> >> I'm sure both of these applications could be re-written around libgpiod
> >> but that would incur a significant amount of regression testing on
> >> existing platforms. And I still consider dealing with GPIO chips an
> >> extra headache that the applications don't need (particularly with the
> >> sheer number of them the SFP case).
> > It seems to me that having no in-kernel driver for your stuff is the
> > main point of all headache here. But I might be mistaken.
> 
> It certainly doesn't help, but I do think that is all orthogonal to the 
> fact that gpio_is_visible() changes things rather than just determining 
> if an attribute should be exported or not.

Sorry for being unhelpful here. But without understanding the issue we can't
propose better solutions.
Bartosz Golaszewski May 25, 2023, 2:35 p.m. UTC | #8
On Thu, May 25, 2023 at 11:13 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Thu, May 25, 2023 at 2:53 AM Chris Packham
> <Chris.Packham@alliedtelesis.co.nz> wrote:
> > On 24/05/23 17:41, Kent Gibson wrote:
>
> ...
>
> > It'd also be great if there was some way of ensuring that a line's state
> > is kept after the application has released the request (i.e. the txdis
> > case I mentioned). But that probably needs work on the kernel side to
> > make such guarantees.
>
> Won't happen. It will require too much of strictness to be added into
> the kernel with likely breakage of the existing code and
> documentation. What is being discussed is a D-Bus (like?) daemon +
> Policy in user space that will allow user / process / cgroup / etc to
> "own" the line and track its state.
>

It's already WiP[1]. I'm trying to keep the footprint minimal with
only GLib and dbus required at run-time.

Bart

[1] https://github.com/brgl/libgpiod-private/tree/topic/dbus
Bartosz Golaszewski May 26, 2023, 12:46 p.m. UTC | #9
On Wed, May 24, 2023 at 7:41 AM Kent Gibson <warthog618@gmail.com> wrote:
>
> On Tue, May 23, 2023 at 09:17:26PM +0000, Chris Packham wrote:
> >
> > On 24/05/23 04:38, andy.shevchenko@gmail.com wrote:
> > > Wed, May 17, 2023 at 09:30:51PM +0000, Chris Packham kirjoitti:
> > >> On 17/05/23 20:54, Andy Shevchenko wrote:
> > >>> On Wed, May 17, 2023 at 2:50 AM Chris Packham
> > >>> <Chris.Packham@alliedtelesis.co.nz> wrote:
> > >>>> On 17/05/23 10:47, Kent Gibson wrote:
> > > ...
> > >
> > >> Again the problem boils down to the fact that we have a userspace switch
> > >> driver (which uses a vendor supplied non-free SDK). So despite the
> > >> kernel having quite good support for SFPs I can't use it without a
> > >> netdev to attach it to.
> > > That user space driver is using what from the kernel? GPIO sysfs?
> >
> > Yes GPIO sysfs and exported links with known names, which allows things
> > to be done per-port but also wildcarded from shell scripts if necessary.
> > I think the key point here is that it doesn't care about the GPIO chips
> > just the individual GPIO lines. Anything involving libgpiod currently
> > has to start caring about GPIO chips (or I'm misreading the docs).
> >
>
> As previously mentioned, the libgpiod tools now support identification of
> lines by name.
> As long as your line names are unique at system scope you should be
> good.  Otherwise you have no option but to identify by (chip,offset).
>
> Wrt the library itself, I was thinking about relocating the line name
> resolution logic from the tools into the library itself, so it would be
> more generally accessible, but haven't gotten there yet.
>
> I'm also of the opinion that libgpiod is too low level for common
> tasks.  That is necessary to access all the features of the uAPI, but
> for basic tasks it would be nice to have a higher level abstraction to
> reduce the barrier to entry.
>
> e.g. in Rust I can do this:
>
>     let led0 = gpiocdev::find_named_line("LED0").unwrap();
>     let req = Request::builder()
>         .with_found_line(&led0)
>         .as_output(Value::Active)
>         .request()?;
>

I would argue that existing high-level bindings for mainline libgpiod
(C++ and Python) allow similar functionality in a comparable number of
LOC. On the other hand - core C library should remain relatively
simple and limited in features.

Chris: are you forced to use C or could you use C++ for line lookup
and management?

I'm also in the process of designing the DBus API and the base for it
will be GLib/GObject bindings for the core C lib. Maybe this is the
place where we could place more advanced features in C as GLib already
makes C coding so much easier.

Bart

>     // change value later
>     req.set_value(led0.offset, Value::Inactive)
>
> which is the equivalent of the sysfs
>
> echo 1 > /some/sysfs/line
> ...
> echo 0 > /some/sysfs/line
>
> That is bad enough. It pains me to see how complex the equivalent is using
> the libgpiod v2 API (or v1), and that is not putting any shade on Bart or
> anyone else who worked on it - there are a lot of constraints on how it
> is designed.  It just doesn't feel complete yet, particularly from a
> casual user's perspective.
>
> One of the things I would like to see added to libgpiod is a set of working
> examples of simple use cases.  Formerly the tools took double duty to
> fill that role, but they've now grown too complex.
> Those examples would highlight where we could provide simplified
> higher level APIs.
> Then rinse and repeat until the simple use cases are simple.
>
> > >>>> I'm sure both of these applications could be re-written around libgpiod
> > >>>> but that would incur a significant amount of regression testing on
> > >>>> existing platforms. And I still consider dealing with GPIO chips an
> > >>>> extra headache that the applications don't need (particularly with the
> > >>>> sheer number of them the SFP case).
> > >>> It seems to me that having no in-kernel driver for your stuff is the
> > >>> main point of all headache here. But I might be mistaken.
> > >> It certainly doesn't help, but I do think that is all orthogonal to the
> > >> fact that gpio_is_visible() changes things rather than just determining
> > >> if an attribute should be exported or not.
> > > Sorry for being unhelpful here. But without understanding the issue we can't
> > > propose better solutions.
> > No problem, this is probably the most engagement I've had out of a Linux
> > patch submission. Hopefully it's not too annoying for those on the Cc list.
>
> Well, now you mention it....
>
> Along the same lines as Andy, it is always useful to get feedback about
> problems people are facing, and how the available solutions aren't
> working for them.
> If we don't know the problem exists then we can't fix it - except by
> accident.
>
> Cheers,
> Kent.
Linus Walleij May 29, 2023, 9:07 a.m. UTC | #10
On Wed, May 17, 2023 at 12:19 AM Chris Packham
<Chris.Packham@alliedtelesis.co.nz> wrote:
> On 17/05/23 01:57, Linus Walleij wrote:
> > On Mon, May 15, 2023 at 12:27 AM Chris Packham
> > <Chris.Packham@alliedtelesis.co.nz> wrote:

> >> The crux of the problem is that the irq_desc is created when it hasn't
> >> been requested.

> > The right solution to me seems to be to not use gpiod_export()
> > and not use sysfs TBH.
>
> That's not really a feasible solution. I'm dealing with application code
> that has been happily using the sysfs interface for many years.

I wonder how many years.

The GPIO sysfs has been deprecated for seven years:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/Documentation/ABI/obsolete/sysfs-gpio?id=fe95046e960b4b76e73dc1486955d93f47276134

My fear is that deprecation is ignored and people still develop stuff
like this ignoring the fact that the ABI is deprecated.

Yours,
Linus Walleij
Chris Packham May 29, 2023, 10 p.m. UTC | #11
On 29/05/23 21:07, Linus Walleij wrote:
> On Wed, May 17, 2023 at 12:19 AM Chris Packham
> <Chris.Packham@alliedtelesis.co.nz> wrote:
>> On 17/05/23 01:57, Linus Walleij wrote:
>>> On Mon, May 15, 2023 at 12:27 AM Chris Packham
>>> <Chris.Packham@alliedtelesis.co.nz> wrote:
>>>> The crux of the problem is that the irq_desc is created when it hasn't
>>>> been requested.
>>> The right solution to me seems to be to not use gpiod_export()
>>> and not use sysfs TBH.
>> That's not really a feasible solution. I'm dealing with application code
>> that has been happily using the sysfs interface for many years.
> I wonder how many years.
>
> The GPIO sysfs has been deprecated for seven years:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/Documentation/ABI/obsolete/sysfs-gpio?id=fe95046e960b4b76e73dc1486955d93f47276134
>
> My fear is that deprecation is ignored and people still develop stuff
> like this ignoring the fact that the ABI is deprecated.

I can't claim that the code is earlier than that deprecation (maybe 
2.6.32 era) but we certainly didn't know about the deprecation when we 
started using it. Unfortunately the internet has a long memory so if you 
search for Linux GPIOs you'll find plenty of examples that point to the 
sysfs ABI as the way that it's done.

Switching to the libgpiod is viable for a few miscellaneous uses (I'll 
try to push that from my end), there probably will be a long tail of 
code that will continue to use the sysfs ABI.

>
> Yours,
> Linus Walleij
diff mbox series

Patch

diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index 530dfd19d7b5..f859dcd1cbf3 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -362,8 +362,6 @@  static umode_t gpio_is_visible(struct kobject *kobj, struct attribute *attr,
 		if (!show_direction)
 			mode = 0;
 	} else if (attr == &dev_attr_edge.attr) {
-		if (gpiod_to_irq(desc) < 0)
-			mode = 0;
 		if (!show_direction && test_bit(FLAG_IS_OUT, &desc->flags))
 			mode = 0;
 	}