Message ID | 20221126162211.93322-3-mailhol.vincent@wanadoo.fr |
---|---|
State | Superseded |
Headers | show |
Series | can: etas_es58x: report firmware, bootloader and hardware version | expand |
On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote: > > Should devlink_free() be after usb_set_inftdata()? > > A look at > $ git grep -W "usb_set_intfdata(.*NULL)" > > shows that the two patterns (freeing before or after > usb_set_intfdata()) coexist. > > You are raising an important question here. usb_set_intfdata() does > not have documentation that freeing before it is risky. And the > documentation of usb_driver::disconnect says that: > "@disconnect: Called when the interface is no longer accessible, > usually because its device has been (or is being) disconnected > or the driver module is being unloaded." > Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130 > > So the interface no longer being accessible makes me assume that the > order does not matter. If it indeed matters, then this is a foot gun > and there is some clean-up work waiting for us on many drivers. > > @Greg, any thoughts on whether or not the order of usb_set_intfdata() > and resource freeing matters or not? In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the USB core does it for them after the ->disconnect() callback returns. But if a driver does make the call, it should be careful to ensure that the call happens _after_ the driver is finished using the interface-data pointer. For example, after all outstanding URBs have completed, if the completion handlers will need to call usb_get_intfdata(). Remember, the interface-data pointer is owned by the driver. Nothing else in the kernel uses it. So the driver merely has to be careful not to clear the pointer while it is still using it. Alan Stern
On Mon. 28 Nov. 2022 at 00:41, Alan Stern <stern@rowland.harvard.edu> wrote: > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote: > > > Should devlink_free() be after usb_set_inftdata()? > > > > A look at > > $ git grep -W "usb_set_intfdata(.*NULL)" > > > > shows that the two patterns (freeing before or after > > usb_set_intfdata()) coexist. > > > > You are raising an important question here. usb_set_intfdata() does > > not have documentation that freeing before it is risky. And the > > documentation of usb_driver::disconnect says that: > > "@disconnect: Called when the interface is no longer accessible, > > usually because its device has been (or is being) disconnected > > or the driver module is being unloaded." > > Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130 > > > > So the interface no longer being accessible makes me assume that the > > order does not matter. If it indeed matters, then this is a foot gun > > and there is some clean-up work waiting for us on many drivers. > > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata() > > and resource freeing matters or not? > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the > USB core does it for them after the ->disconnect() callback returns. Interesting. This fact is widely unknown, cf: $ git grep "usb_set_intfdata(.*NULL)" | wc -l 215 I will do some clean-up later on, at least for the CAN USB drivers. > But if a driver does make the call, it should be careful to ensure that > the call happens _after_ the driver is finished using the interface-data > pointer. For example, after all outstanding URBs have completed, if the > completion handlers will need to call usb_get_intfdata(). ACK. I understand that it should be called *after* the completion of any ongoing task. My question was more on: devlink_free(priv_to_devlink(es58x_dev)); usb_set_intfdata(intf, NULL); VS. usb_set_intfdata(intf, NULL); devlink_free(priv_to_devlink(es58x_dev));
On Mon. 28 Nov. 2022 at 10:34, Vincent MAILHOL <mailhol.vincent@wanadoo.fr> wrote: > On Mon. 28 Nov. 2022 at 00:41, Alan Stern <stern@rowland.harvard.edu> wrote: > > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote: > > > > Should devlink_free() be after usb_set_inftdata()? > > > > > > A look at > > > $ git grep -W "usb_set_intfdata(.*NULL)" > > > > > > shows that the two patterns (freeing before or after > > > usb_set_intfdata()) coexist. > > > > > > You are raising an important question here. usb_set_intfdata() does > > > not have documentation that freeing before it is risky. And the > > > documentation of usb_driver::disconnect says that: > > > "@disconnect: Called when the interface is no longer accessible, > > > usually because its device has been (or is being) disconnected > > > or the driver module is being unloaded." > > > Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130 > > > > > > So the interface no longer being accessible makes me assume that the > > > order does not matter. If it indeed matters, then this is a foot gun > > > and there is some clean-up work waiting for us on many drivers. > > > > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata() > > > and resource freeing matters or not? > > > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the > > USB core does it for them after the ->disconnect() callback returns. > > Interesting. This fact is widely unknown, cf: > $ git grep "usb_set_intfdata(.*NULL)" | wc -l > 215 > > I will do some clean-up later on, at least for the CAN USB drivers. > > > But if a driver does make the call, it should be careful to ensure that > > the call happens _after_ the driver is finished using the interface-data > > pointer. For example, after all outstanding URBs have completed, if the > > completion handlers will need to call usb_get_intfdata(). > > ACK. I understand that it should be called *after* the completion of > any ongoing task. > > My question was more on: > > devlink_free(priv_to_devlink(es58x_dev)); > usb_set_intfdata(intf, NULL); > > VS. > > usb_set_intfdata(intf, NULL); > devlink_free(priv_to_devlink(es58x_dev)); > > From your comments, I understand that both are fine. Do we agree that the usb-skeleton is doing it wrong? https://elixir.bootlin.com/linux/latest/source/drivers/usb/usb-skeleton.c#L567 usb_set_intfdata(interface, NULL) is called before deregistering the interface and terminating the outstanding URBs! > > Remember, the interface-data pointer is owned by the driver. Nothing > > else in the kernel uses it. So the driver merely has to be careful not > > to clear the pointer while it is still using it. > > Thanks for your comments! > > > Yours sincerely, > Vincent Mailhol
> > But if a driver does make the call, it should be careful to ensure that > > the call happens _after_ the driver is finished using the interface-data > > pointer. For example, after all outstanding URBs have completed, if the > > completion handlers will need to call usb_get_intfdata(). > > ACK. I understand that it should be called *after* the completion of > any ongoing task. What sometimes gets people is /sys, /proc. etc. A process can have such a file open when the device is unplugged. If the read needs to make use of your private data structure, you need to guarantee it still exists. Ideally the core needs to wait and not call the disconnect until all such files are closed. Probably the USB core does, it is such an obvious issue, but i have no knowledge of USB. Andrew
On Mon. 28 Nov. 2022 at 22:45, Andrew Lunn <andrew@lunn.ch> wrote: > > > But if a driver does make the call, it should be careful to ensure that > > > the call happens _after_ the driver is finished using the interface-data > > > pointer. For example, after all outstanding URBs have completed, if the > > > completion handlers will need to call usb_get_intfdata(). > > > > ACK. I understand that it should be called *after* the completion of > > any ongoing task. > > What sometimes gets people is /sys, /proc. etc. A process can have > such a file open when the device is unplugged. If the read needs to > make use of your private data structure, you need to guarantee it > still exists. Ideally the core needs to wait and not call the > disconnect until all such files are closed. Probably the USB core > does, it is such an obvious issue, but i have no knowledge of USB. For USB drivers, the parallel of what you are describing are the URBs (USB request Buffers). The URB are sent asynchronously to the device. Each URB has a completion handler: https://elixir.bootlin.com/linux/v6.0/source/include/linux/usb.h#L1443 It is important to wait for all outstanding URB to complete before releasing your resources. But once you are able to guarantee that any ongoing actions were completed, the order in which you kfree() or usb_set_intfdata() to NULL matters less. Of course, the USB drivers could also have some /sys/ or /proc/ files opened, but this is not the case of the etas_es58x. By the way, the handling of outstanding URBs is done by es58x_free_urbs(): https://elixir.bootlin.com/linux/v6.0/source/drivers/net/can/usb/etas_es58x/es58x_core.c#L1745 which is called just before the devlink_free() and the usb_set_intfdata().
On Mon, Nov 28, 2022 at 02:32:23PM +0900, Vincent MAILHOL wrote: > On Mon. 28 Nov. 2022 at 10:34, Vincent MAILHOL > <mailhol.vincent@wanadoo.fr> wrote: > > On Mon. 28 Nov. 2022 at 00:41, Alan Stern <stern@rowland.harvard.edu> wrote: > > > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote: > > > > > Should devlink_free() be after usb_set_inftdata()? > > > > > > > > A look at > > > > $ git grep -W "usb_set_intfdata(.*NULL)" > > > > > > > > shows that the two patterns (freeing before or after > > > > usb_set_intfdata()) coexist. > > > > > > > > You are raising an important question here. usb_set_intfdata() does > > > > not have documentation that freeing before it is risky. And the > > > > documentation of usb_driver::disconnect says that: > > > > "@disconnect: Called when the interface is no longer accessible, > > > > usually because its device has been (or is being) disconnected > > > > or the driver module is being unloaded." > > > > Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130 > > > > > > > > So the interface no longer being accessible makes me assume that the > > > > order does not matter. If it indeed matters, then this is a foot gun > > > > and there is some clean-up work waiting for us on many drivers. > > > > > > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata() > > > > and resource freeing matters or not? > > > > > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the > > > USB core does it for them after the ->disconnect() callback returns. > > > > Interesting. This fact is widely unknown, cf: > > $ git grep "usb_set_intfdata(.*NULL)" | wc -l > > 215 > > > > I will do some clean-up later on, at least for the CAN USB drivers. > > > > > But if a driver does make the call, it should be careful to ensure that > > > the call happens _after_ the driver is finished using the interface-data > > > pointer. For example, after all outstanding URBs have completed, if the > > > completion handlers will need to call usb_get_intfdata(). > > > > ACK. I understand that it should be called *after* the completion of > > any ongoing task. > > > > My question was more on: > > > > devlink_free(priv_to_devlink(es58x_dev)); > > usb_set_intfdata(intf, NULL); > > > > VS. > > > > usb_set_intfdata(intf, NULL); > > devlink_free(priv_to_devlink(es58x_dev)); > > > > From your comments, I understand that both are fine. > > Do we agree that the usb-skeleton is doing it wrong? > https://elixir.bootlin.com/linux/latest/source/drivers/usb/usb-skeleton.c#L567 > usb_set_intfdata(interface, NULL) is called before deregistering the > interface and terminating the outstanding URBs! Going through the usb-skeleton.c source code, you will find that usb_get_intfdata() is called from only a few routines: skel_open() skel_disconnect() skel_suspend() skel_pre_reset() skel_post_reset() Of those, all but the first are called only by the USB core and they are mutually exclusive with disconnect processing (except for skel_disconnect() itself, of course). So they don't matter. The first, skel_open(), can be called as a result of actions by the user, so the driver needs to ensure that this can't happen after it clears the interface-data pointer. The user can open the device file at any time before the minor number is given back, so it is not proper to call usb_set_intfdata(interface, NULL) before usb_deregister_dev() -- but the driver does exactly this! (Well, it's not quite that bad. skel_open() does check whether the interface-data pointer value it gets from usb_get_intfdata() is NULL. But it's still a race.) So yes, the current code is wrong. And in fact, it will still be wrong even after the usb_set_intfdata(interface, NULL) line is removed, because there is no synchronization between skel_open() and skel_disconnect(). It is possible for skel_disconnect() to run to completion and the USB core to clear the interface-data pointer all while skel_open() is running. The driver needs a static private mutex to synchronize opens with unregistrations. (This is a general phenomenon, true of all drivers that have a user interface such as a device file.) The driver _does_ have a per-instance mutex, dev->io_mutex, to synchronize I/O with disconnects. But that's separate from synchronizing opens with unregistrations, because at open time the driver doesn't yet know the address of the private data structure or even if the structure is still allocated. So obviously it can't use a mutex that is embedded within the private data structure for this purpose. Alan Stern
On Tue. 29 Nov. 2022 at 00:50, Alan Stern <stern@rowland.harvard.edu> wrote: > On Mon, Nov 28, 2022 at 02:32:23PM +0900, Vincent MAILHOL wrote: > > On Mon. 28 Nov. 2022 at 10:34, Vincent MAILHOL > > <mailhol.vincent@wanadoo.fr> wrote: > > > On Mon. 28 Nov. 2022 at 00:41, Alan Stern <stern@rowland.harvard.edu> wrote: > > > > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote: > > > > > > Should devlink_free() be after usb_set_inftdata()? > > > > > > > > > > A look at > > > > > $ git grep -W "usb_set_intfdata(.*NULL)" > > > > > > > > > > shows that the two patterns (freeing before or after > > > > > usb_set_intfdata()) coexist. > > > > > > > > > > You are raising an important question here. usb_set_intfdata() does > > > > > not have documentation that freeing before it is risky. And the > > > > > documentation of usb_driver::disconnect says that: > > > > > "@disconnect: Called when the interface is no longer accessible, > > > > > usually because its device has been (or is being) disconnected > > > > > or the driver module is being unloaded." > > > > > Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130 > > > > > > > > > > So the interface no longer being accessible makes me assume that the > > > > > order does not matter. If it indeed matters, then this is a foot gun > > > > > and there is some clean-up work waiting for us on many drivers. > > > > > > > > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata() > > > > > and resource freeing matters or not? > > > > > > > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the > > > > USB core does it for them after the ->disconnect() callback returns. > > > > > > Interesting. This fact is widely unknown, cf: > > > $ git grep "usb_set_intfdata(.*NULL)" | wc -l > > > 215 > > > > > > I will do some clean-up later on, at least for the CAN USB drivers. > > > > > > > But if a driver does make the call, it should be careful to ensure that > > > > the call happens _after_ the driver is finished using the interface-data > > > > pointer. For example, after all outstanding URBs have completed, if the > > > > completion handlers will need to call usb_get_intfdata(). > > > > > > ACK. I understand that it should be called *after* the completion of > > > any ongoing task. > > > > > > My question was more on: > > > > > > devlink_free(priv_to_devlink(es58x_dev)); > > > usb_set_intfdata(intf, NULL); > > > > > > VS. > > > > > > usb_set_intfdata(intf, NULL); > > > devlink_free(priv_to_devlink(es58x_dev)); > > > > > > From your comments, I understand that both are fine. > > > > Do we agree that the usb-skeleton is doing it wrong? > > https://elixir.bootlin.com/linux/latest/source/drivers/usb/usb-skeleton.c#L567 > > usb_set_intfdata(interface, NULL) is called before deregistering the > > interface and terminating the outstanding URBs! > > Going through the usb-skeleton.c source code, you will find that > usb_get_intfdata() is called from only a few routines: > > skel_open() > skel_disconnect() > skel_suspend() > skel_pre_reset() > skel_post_reset() > > Of those, all but the first are called only by the USB core and they are > mutually exclusive with disconnect processing (except for > skel_disconnect() itself, of course). So they don't matter. > > The first, skel_open(), can be called as a result of actions by the > user, so the driver needs to ensure that this can't happen after it > clears the interface-data pointer. The user can open the device file at > any time before the minor number is given back, so it is not proper to > call usb_set_intfdata(interface, NULL) before usb_deregister_dev() -- > but the driver does exactly this! > > (Well, it's not quite that bad. skel_open() does check whether the > interface-data pointer value it gets from usb_get_intfdata() is NULL. > But it's still a race.) > > So yes, the current code is wrong. And in fact, it will still be wrong > even after the usb_set_intfdata(interface, NULL) line is removed, > because there is no synchronization between skel_open() and > skel_disconnect(). ACK. I did not look outside of skel_disconnect(). Regardless, I think that removing the usb_set_intdata(interface, NULL) is still one step in the good direction despite the other synchronisation issues. I sent a patch for that which Greg already pick-up: https://git.kernel.org/gregkh/usb/c/c568f8bb41a4 >It is possible for skel_disconnect() to run to > completion and the USB core to clear the interface-data pointer all > while skel_open() is running. The driver needs a static private mutex > to synchronize opens with unregistrations. (This is a general > phenomenon, true of all drivers that have a user interface such as a > device file.) > > The driver _does_ have a per-instance mutex, dev->io_mutex, to > synchronize I/O with disconnects. But that's separate from > synchronizing opens with unregistrations, because at open time the > driver doesn't yet know the address of the private data structure or > even if the structure is still allocated. So obviously it can't use a > mutex that is embedded within the private data structure for this > purpose. ACK. However, I have other priorities, I will not invest more time to dig in the usb-skeleton.c Thank you for the answer! That was a long but interesting diversion from the initial topic :) Yours sincerely, Vincent Mailhol
diff --git a/drivers/net/can/usb/Kconfig b/drivers/net/can/usb/Kconfig index 8c6fea661530..445504ababce 100644 --- a/drivers/net/can/usb/Kconfig +++ b/drivers/net/can/usb/Kconfig @@ -30,6 +30,7 @@ config CAN_ESD_USB config CAN_ETAS_ES58X tristate "ETAS ES58X CAN/USB interfaces" select CRC16 + select NET_DEVLINK help This driver supports the ES581.4, ES582.1 and ES584.1 interfaces from ETAS GmbH (https://www.etas.com/en/products/es58x.php). diff --git a/drivers/net/can/usb/etas_es58x/Makefile b/drivers/net/can/usb/etas_es58x/Makefile index a129b4aa0215..d6667ebe259f 100644 --- a/drivers/net/can/usb/etas_es58x/Makefile +++ b/drivers/net/can/usb/etas_es58x/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_CAN_ETAS_ES58X) += etas_es58x.o -etas_es58x-y = es58x_core.o es581_4.o es58x_fd.o +etas_es58x-y = es58x_core.o es58x_devlink.o es581_4.o es58x_fd.o diff --git a/drivers/net/can/usb/etas_es58x/es58x_core.c b/drivers/net/can/usb/etas_es58x/es58x_core.c index 4c97102202bf..c6e598e4800c 100644 --- a/drivers/net/can/usb/etas_es58x/es58x_core.c +++ b/drivers/net/can/usb/etas_es58x/es58x_core.c @@ -16,6 +16,7 @@ #include <linux/kernel.h> #include <linux/module.h> #include <linux/usb.h> +#include <net/devlink.h> #include "es58x_core.h" @@ -2174,6 +2175,7 @@ static struct es58x_device *es58x_init_es58x_dev(struct usb_interface *intf, { struct device *dev = &intf->dev; struct es58x_device *es58x_dev; + struct devlink *devlink; const struct es58x_parameters *param; const struct es58x_operators *ops; struct usb_device *udev = interface_to_usbdev(intf); @@ -2196,11 +2198,12 @@ static struct es58x_device *es58x_init_es58x_dev(struct usb_interface *intf, ops = &es581_4_ops; } - es58x_dev = devm_kzalloc(dev, es58x_sizeof_es58x_device(param), - GFP_KERNEL); - if (!es58x_dev) + devlink = devlink_alloc(&es58x_dl_ops, es58x_sizeof_es58x_device(param), + dev); + if (!devlink) return ERR_PTR(-ENOMEM); + es58x_dev = devlink_priv(devlink); es58x_dev->param = param; es58x_dev->ops = ops; es58x_dev->dev = dev; @@ -2247,6 +2250,8 @@ static int es58x_probe(struct usb_interface *intf, if (ret) return ret; + devlink_register(priv_to_devlink(es58x_dev)); + for (ch_idx = 0; ch_idx < es58x_dev->num_can_ch; ch_idx++) { ret = es58x_init_netdev(es58x_dev, ch_idx); if (ret) { @@ -2272,8 +2277,10 @@ static void es58x_disconnect(struct usb_interface *intf) dev_info(&intf->dev, "Disconnecting %s %s\n", es58x_dev->udev->manufacturer, es58x_dev->udev->product); + devlink_unregister(priv_to_devlink(es58x_dev)); es58x_free_netdevs(es58x_dev); es58x_free_urbs(es58x_dev); + devlink_free(priv_to_devlink(es58x_dev)); usb_set_intfdata(intf, NULL); } diff --git a/drivers/net/can/usb/etas_es58x/es58x_core.h b/drivers/net/can/usb/etas_es58x/es58x_core.h index 4a082fd69e6f..bf24375580e5 100644 --- a/drivers/net/can/usb/etas_es58x/es58x_core.h +++ b/drivers/net/can/usb/etas_es58x/es58x_core.h @@ -674,6 +674,7 @@ static inline enum es58x_flag es58x_get_flags(const struct sk_buff *skb) return es58x_flags; } +/* es58x_core.c. */ int es58x_can_get_echo_skb(struct net_device *netdev, u32 packet_idx, u64 *tstamps, unsigned int pkts); int es58x_tx_ack_msg(struct net_device *netdev, u16 tx_free_entries, @@ -691,9 +692,14 @@ int es58x_rx_cmd_ret_u32(struct net_device *netdev, int es58x_send_msg(struct es58x_device *es58x_dev, u8 cmd_type, u8 cmd_id, const void *msg, u16 cmd_len, int channel_idx); +/* es58x_devlink.c. */ +extern const struct devlink_ops es58x_dl_ops; + +/* es581_4.c. */ extern const struct es58x_parameters es581_4_param; extern const struct es58x_operators es581_4_ops; +/* es58x_fd.c. */ extern const struct es58x_parameters es58x_fd_param; extern const struct es58x_operators es58x_fd_ops; diff --git a/drivers/net/can/usb/etas_es58x/es58x_devlink.c b/drivers/net/can/usb/etas_es58x/es58x_devlink.c new file mode 100644 index 000000000000..af6ca7ada23f --- /dev/null +++ b/drivers/net/can/usb/etas_es58x/es58x_devlink.c @@ -0,0 +1,13 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Driver for ETAS GmbH ES58X USB CAN(-FD) Bus Interfaces. + * + * File es58x_devlink.c: report the product information using devlink. + * + * Copyright (c) 2022 Vincent Mailhol <mailhol.vincent@wanadoo.fr> + */ + +#include <net/devlink.h> + +const struct devlink_ops es58x_dl_ops = { +};
Add basic support for devlink. The callbacks of struct devlink_ops will be implemented next. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> --- drivers/net/can/usb/Kconfig | 1 + drivers/net/can/usb/etas_es58x/Makefile | 2 +- drivers/net/can/usb/etas_es58x/es58x_core.c | 13 ++++++++++--- drivers/net/can/usb/etas_es58x/es58x_core.h | 6 ++++++ drivers/net/can/usb/etas_es58x/es58x_devlink.c | 13 +++++++++++++ 5 files changed, 31 insertions(+), 4 deletions(-) create mode 100644 drivers/net/can/usb/etas_es58x/es58x_devlink.c