[0/3] wifi: Un-embed ath10k and ath11k dummy netdev

Message ID 20240405122123.4156104-1-leitao@debian.org

Message

Breno Leitao April 5, 2024, 12:21 p.m. UTC
struct net_device shouldn't be embedded into any structure; instead,
the owner should use the private space to embed their state into
net_device.
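
To illustrate the pattern described above, here is a minimal user-space
sketch, not kernel code. It assumes a simplified stand-in for
net_device; every name prefixed with model_ is hypothetical and exists
only for this example. It shows driver state living in a private area
that is allocated together with the device, rather than the device
being a member of the driver's own structure.

```c
/* User-space model only: the model_* names are hypothetical stand-ins,
 * not the kernel's definitions. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_netdev {
	size_t priv_size;
	/* The owner's private area follows the device in the same allocation. */
	max_align_t priv[];
};

/* Allocate the device and its private area in one block. */
static struct model_netdev *model_alloc_netdev(size_t priv_size)
{
	struct model_netdev *dev = calloc(1, sizeof(*dev) + priv_size);

	if (dev)
		dev->priv_size = priv_size;
	return dev;
}

/* Return a pointer to the private area that trails the device. */
static void *model_netdev_priv(struct model_netdev *dev)
{
	return dev->priv;
}

/* Hypothetical driver state, stored in the private area. */
struct model_driver_state {
	int irq_count;
};

/* Returns 0 when the private area round-trips the driver state. */
int model_demo_private_area(void)
{
	struct model_netdev *dev =
		model_alloc_netdev(sizeof(struct model_driver_state));
	struct model_driver_state *st;

	if (!dev)
		return -1;

	st = model_netdev_priv(dev);
	st->irq_count = 3;
	assert(((struct model_driver_state *)model_netdev_priv(dev))->irq_count == 3);

	/* One free releases both the device and the driver state. */
	free(dev);
	return 0;
}
```

Because the device owns the allocation, its lifetime no longer depends
on the lifetime of whichever driver structure used to contain it.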

This patch set fixes the problem above for ath10k and ath11k. This also
fixes the conversion of qtnfmac driver to the new helper.

This patch set depends on a series that is still under review:
https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t

If it helps, I've pushed the tree to
https://github.com/leitao/linux/tree/wireless-dummy

PS: Due to lack of hardware, unfortunately all these patches are
compile-tested only.

Breno Leitao (3):
  wifi: qtnfmac: Use netdev dummy allocator helper
  wifi: ath10k: allocate dummy net_device dynamically
  wifi: ath11k: allocate dummy net_device dynamically

 drivers/net/wireless/ath/ath10k/core.c        |  9 ++++++--
 drivers/net/wireless/ath/ath10k/core.h        |  2 +-
 drivers/net/wireless/ath/ath10k/pci.c         |  2 +-
 drivers/net/wireless/ath/ath10k/sdio.c        |  2 +-
 drivers/net/wireless/ath/ath10k/snoc.c        |  4 ++--
 drivers/net/wireless/ath/ath10k/usb.c         |  2 +-
 drivers/net/wireless/ath/ath11k/ahb.c         |  9 ++++++--
 drivers/net/wireless/ath/ath11k/core.h        |  2 +-
 drivers/net/wireless/ath/ath11k/pcic.c        | 21 +++++++++++++++----
 .../wireless/quantenna/qtnfmac/pcie/pcie.c    |  3 +--
 10 files changed, 39 insertions(+), 17 deletions(-)

Comments

Breno Leitao April 8, 2024, 1:33 p.m. UTC | #1
Hello Kalle,

On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
> > struct net_device shouldn't be embedded into any structure, instead,
> > the owner should use the private space to embed their state into
> > net_device.
> >
> > This patch set fixes the problem above for ath10k and ath11k. This also
> > fixes the conversion of qtnfmac driver to the new helper.
> >
> > This patch set depends on a series that is still under review:
> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
> >
> > If it helps, I've pushed the tree to
> > https://github.com/leitao/linux/tree/wireless-dummy
> >
> > PS: Due to lack of hardware, unfortunately all these patches are
> > compiled tested only.
> >
> > Breno Leitao (3):
> >   wifi: qtnfmac: Use netdev dummy allocator helper
> >   wifi: ath10k: allocate dummy net_device dynamically
> >   wifi: ath11k: allocate dummy net_device dynamically
> 
> Thanks for setting up the branch, that makes the testing very easy. I
> now tested the branch using the commit below with ath11k WCN6855 hw2.0
> on an x86 box:
> 
> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
> 
> But unfortunately it crashes, the stack trace below. I can easily test
> your branches, just let me know what to test. A direct 'git pull'
> command is the best.

Thanks for the test.

Reading the issue, I am afraid that freeing netdev explicitly
(free_netdev()) might not be the best approach at the exit path.

I would like to try to leverage the ->needs_free_netdev netdev
mechanism to do the clean-up, if that makes sense. I've updated the
ath11k patch, and I am curious if that is what we want.

Would you mind testing a net patch I have, please?

  https://github.com/leitao/linux/tree/wireless-dummy_v2

PS: I didn't update the other drivers (ath10k, qtnfmac, etc).

Thank you!

Kalle Valo April 8, 2024, 4:43 p.m. UTC | #2
Breno Leitao <leitao@debian.org> writes:

> Hello Kalle,
>
> On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
>> Breno Leitao <leitao@debian.org> writes:
>> 
>> > struct net_device shouldn't be embedded into any structure, instead,
>> > the owner should use the private space to embed their state into
>> > net_device.
>> >
>> > This patch set fixes the problem above for ath10k and ath11k. This also
>> > fixes the conversion of qtnfmac driver to the new helper.
>> >
>> > This patch set depends on a series that is still under review:
>> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
>> >
>> > If it helps, I've pushed the tree to
>> > https://github.com/leitao/linux/tree/wireless-dummy
>> >
>> > PS: Due to lack of hardware, unfortunately all these patches are
>> > compiled tested only.
>> >
>> > Breno Leitao (3):
>> >   wifi: qtnfmac: Use netdev dummy allocator helper
>> >   wifi: ath10k: allocate dummy net_device dynamically
>> >   wifi: ath11k: allocate dummy net_device dynamically
>> 
>> Thanks for setting up the branch, that makes the testing very easy. I
>> now tested the branch using the commit below with ath11k WCN6855 hw2.0
>> on an x86 box:
>> 
>> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
>> 
>> But unfortunately it crashes, the stack trace below. I can easily test
>> your branches, just let me know what to test. A direct 'git pull'
>> command is the best.
>
> Thanks for the test.
>
> Reading the issue, I am afraid that freeing netdev explicitly
> (free_netdev()) might not be the best approach at the exit path.
>
> I would like to try to leverage the ->needs_free_netdev netdev
> mechanism to do the clean-up, if that makes sense. I've updated the
> ath11k patch, and I am curious if that is what we want.
>
> Would you mind testing a net patch I have, please?
>
>   https://github.com/leitao/linux/tree/wireless-dummy_v2

I tested this again with my WCN6855 hw2.0 x86 test box on this commit:

a87674ac820e wifi: ath11k: allocate dummy net_device dynamically

It passes my tests and doesn't crash, but I see this kmemleak warning a
lot:

unreferenced object 0xffff888127109400 (size 128):
  comm "insmod", pid 2813, jiffies 4294926528
  hex dump (first 32 bytes):
    d0 93 d5 0a 81 88 ff ff d0 93 d5 0a 81 88 ff ff  ................
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 870e4f12):
    [<ffffffff99bcd375>] kmemleak_alloc+0x45/0x80
    [<ffffffff975707a8>] kmalloc_trace+0x278/0x2c0
    [<ffffffff992904c5>] __hw_addr_create+0x55/0x260
    [<ffffffff992909cb>] __hw_addr_add_ex+0x2fb/0x6d0
    [<ffffffff99294004>] dev_addr_init+0x144/0x230
    [<ffffffff992629ee>] alloc_netdev_mqs+0x12e/0xfe0
    [<ffffffff992638c5>] alloc_netdev_dummy+0x25/0x30
    [<ffffffffc0b6b0cd>] ath11k_pcic_ext_irq_config+0x1ad/0xc10 [ath11k]
    [<ffffffffc0b6c431>] ath11k_pcic_config_irq+0x2f1/0x4b0 [ath11k]
    [<ffffffffc0cb8314>] ath11k_pci_probe+0x874/0x1210 [ath11k_pci]
    [<ffffffff97febf06>] local_pci_probe+0xd6/0x180
    [<ffffffff97feefaa>] pci_call_probe+0x15a/0x400
    [<ffffffff97ff03d6>] pci_device_probe+0xa6/0x100
    [<ffffffff98abe315>] really_probe+0x1d5/0x920
    [<ffffffff98abed48>] __driver_probe_device+0x2e8/0x3f0
    [<ffffffff98abee9a>] driver_probe_device+0x4a/0x140

Breno Leitao April 8, 2024, 7:33 p.m. UTC | #3
On Mon, Apr 08, 2024 at 07:43:42PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
> > On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
> >> Breno Leitao <leitao@debian.org> writes:
> >> 
> >> > struct net_device shouldn't be embedded into any structure, instead,
> >> > the owner should use the private space to embed their state into
> >> > net_device.
> >> >
> >> > This patch set fixes the problem above for ath10k and ath11k. This also
> >> > fixes the conversion of qtnfmac driver to the new helper.
> >> >
> >> > This patch set depends on a series that is still under review:
> >> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
> >> >
> >> > If it helps, I've pushed the tree to
> >> > https://github.com/leitao/linux/tree/wireless-dummy
> >> >
> >> > PS: Due to lack of hardware, unfortunately all these patches are
> >> > compiled tested only.
> >> >
> >> > Breno Leitao (3):
> >> >   wifi: qtnfmac: Use netdev dummy allocator helper
> >> >   wifi: ath10k: allocate dummy net_device dynamically
> >> >   wifi: ath11k: allocate dummy net_device dynamically
> >> 
> >> Thanks for setting up the branch, that makes the testing very easy. I
> >> now tested the branch using the commit below with ath11k WCN6855 hw2.0
> >> on an x86 box:
> >> 
> >> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
> >> 
> >> But unfortunately it crashes, the stack trace below. I can easily test
> >> your branches, just let me know what to test. A direct 'git pull'
> >> command is the best.
> >
> > Thanks for the test.
> >
> > Reading the issue, I am afraid that freeing netdev explicitly
> > (free_netdev()) might not be the best approach at the exit path.
> >
> > I would like to try to leverage the ->needs_free_netdev netdev
> > mechanism to do the clean-up, if that makes sense. I've updated the
> > ath11k patch, and I am curious if that is what we want.
> >
> > Would you mind testing a net patch I have, please?
> >
> >   https://github.com/leitao/linux/tree/wireless-dummy_v2
> 
> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
> 
> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
> 
> It passes my tests and doesn't crash, but I see this kmemleak warning a
> lot:

Thanks Kalle, that was helpful. The device is not being cleaned up
automatically.
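
A rough user-space model of why this can leak, assuming the
needs_free_netdev flag is only acted on as part of the unregister
path, which a dummy device never goes through because it is never
registered. All model_* names are hypothetical and the logic is
heavily simplified.

```c
/* User-space model only: the model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int model_live_netdevs;	/* devices allocated and not yet freed */

struct model_netdev {
	bool registered;
	bool needs_free_netdev;
};

static struct model_netdev *model_alloc_dummy(void)
{
	struct model_netdev *dev = calloc(1, sizeof(*dev));

	if (dev)
		model_live_netdevs++;
	return dev;
}

static void model_free_netdev(struct model_netdev *dev)
{
	free(dev);
	model_live_netdevs--;
}

/* The automatic free only runs while unregistering a registered device. */
static void model_unregister(struct model_netdev *dev)
{
	if (!dev->registered)
		return;
	dev->registered = false;
	if (dev->needs_free_netdev)
		model_free_netdev(dev);
}

/* Returns how many devices are still allocated when teardown relies on
 * the flag alone. */
int model_demo_leak(void)
{
	struct model_netdev *dev = model_alloc_dummy();
	int leaked;

	if (!dev)
		return -1;

	dev->needs_free_netdev = true;
	/* The dummy device was never registered, so this is a no-op. */
	model_unregister(dev);
	leaked = model_live_netdevs;

	/* An explicit free is what actually releases the device. */
	model_free_netdev(dev);
	assert(model_live_netdevs == 0);
	return leaked;
}
```

In this model the flag alone leaves one device allocated, which is the
kind of situation a leak detector would report.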

Chatting with Jakub, he suggested going back to the original approach,
but adding an additional patch to free_netdev().

Would you mind running another test, please?

	https://github.com/leitao/linux/tree/wireless-dummy_v3

The branch above is basically the original branch (as in this patch
series), with this additional patch:

	Author: Breno Leitao <leitao@debian.org>
	Date:   Mon Apr 8 11:37:32 2024 -0700

	    net: free_netdev: exit earlier if dummy

	    For dummy devices, exit earlier at free_netdev() instead of executing
	    the whole function. This is necessary, because dummy devices are
	    special, and shouldn't have the second part of the function executed.

	    Otherwise reg_state, which is NETREG_DUMMY, will be overwritten and
	    there will be no way to identify that this is a dummy device. Also, this
	    device does not need the final put_device(), since dummy devices are not
	    registered (through register_netdevice()), where the device reference is
	    increased (at netdev_register_kobject()/device_add()).

	    Suggested-by: Jakub Kicinski <kuba@kernel.org>
	    Signed-off-by: Breno Leitao <leitao@debian.org>

	diff --git a/net/core/dev.c b/net/core/dev.c
	index 2b82bd1cd2f8..5d2cb97d0ae6 100644
	--- a/net/core/dev.c
	+++ b/net/core/dev.c
	@@ -11058,7 +11058,8 @@ void free_netdev(struct net_device *dev)
		dev->xdp_bulkq = NULL;

		/*  Compatibility with error handling in drivers */
	-       if (dev->reg_state == NETREG_UNINITIALIZED) {
	+       if (dev->reg_state == NETREG_UNINITIALIZED ||
	+           dev->reg_state == NETREG_DUMMY) {
			netdev_freemem(dev);
			return;
		}
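
The effect of the change above can be sketched with a small user-space
model. This is not the kernel function: the model_* names are
hypothetical, and everything except the reg_state check and the two
exit paths described in the commit message is left out.

```c
/* User-space model only: the model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum model_reg_state {
	MODEL_UNINITIALIZED,
	MODEL_UNREGISTERED,
	MODEL_RELEASED,
	MODEL_DUMMY,
};

enum model_free_path {
	MODEL_PATH_FREEMEM,	/* early exit: memory released directly */
	MODEL_PATH_PUT_DEVICE,	/* second half: state rewritten, reference dropped */
};

struct model_netdev {
	enum model_reg_state reg_state;
};

/* dummy_exits_early selects the behaviour with (true) or without (false)
 * the extra NETREG_DUMMY-style check. */
static enum model_free_path model_free_netdev(struct model_netdev *dev,
					      bool dummy_exits_early)
{
	if (dev->reg_state == MODEL_UNINITIALIZED ||
	    (dummy_exits_early && dev->reg_state == MODEL_DUMMY))
		return MODEL_PATH_FREEMEM;

	/* Second half: the state is overwritten before the final put. */
	dev->reg_state = MODEL_RELEASED;
	return MODEL_PATH_PUT_DEVICE;
}

/* Returns 0 after checking both behaviours on a dummy device. */
int model_demo_free_paths(void)
{
	struct model_netdev before = { .reg_state = MODEL_DUMMY };
	struct model_netdev after = { .reg_state = MODEL_DUMMY };

	/* Without the check, the dummy state is lost and the put path runs. */
	assert(model_free_netdev(&before, false) == MODEL_PATH_PUT_DEVICE);
	assert(before.reg_state == MODEL_RELEASED);

	/* With the check, the device exits early and stays a dummy. */
	assert(model_free_netdev(&after, true) == MODEL_PATH_FREEMEM);
	assert(after.reg_state == MODEL_DUMMY);
	return 0;
}
```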

Kalle Valo April 9, 2024, 10:03 a.m. UTC | #4
Breno Leitao <leitao@debian.org> writes:

>> > Reading the issue, I am afraid that freeing netdev explicitly
>> > (free_netdev()) might not be the best approach at the exit path.
>> >
>> > I would like to try to leverage the ->needs_free_netdev netdev
>> > mechanism to do the clean-up, if that makes sense. I've updated the
>> > ath11k patch, and I am curious if that is what we want.
>> >
>> > Would you mind testing a net patch I have, please?
>> >
>> >   https://github.com/leitao/linux/tree/wireless-dummy_v2
>> 
>> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
>> 
>> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
>> 
>> It passes my tests and doesn't crash, but I see this kmemleak warning a
>> lot:
>
> Thanks Kalle, that was helpful. The device is not being clean-up
> automatically.
>
> Chatting with Jakub, he suggested coming back to the original approach,
> but, adding a additional patch, at the free_netdev().
>
> Would you mind running another test, please?
>
> 	https://github.com/leitao/linux/tree/wireless-dummy_v3
>
> The branch above is basically the original branch (as in this patch
> series), with this additional patch:
>
> 	Author: Breno Leitao <leitao@debian.org>
> 	Date:   Mon Apr 8 11:37:32 2024 -0700
>
> 	    net: free_netdev: exit earlier if dummy

I tested with the same ath11k hardware and this one passes all my
(simple) ath11k tests, no issues found. I used this commit:

1c10aebaa8ce net: free_netdev: exit earlier if dummy

Breno Leitao April 9, 2024, 10:51 a.m. UTC | #5
On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
> >> > Reading the issue, I am afraid that freeing netdev explicitly
> >> > (free_netdev()) might not be the best approach at the exit path.
> >> >
> >> > I would like to try to leverage the ->needs_free_netdev netdev
> >> > mechanism to do the clean-up, if that makes sense. I've updated the
> >> > ath11k patch, and I am curious if that is what we want.
> >> >
> >> > Would you mind testing a net patch I have, please?
> >> >
> >> >   https://github.com/leitao/linux/tree/wireless-dummy_v2
> >> 
> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
> >> 
> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
> >> 
> >> It passes my tests and doesn't crash, but I see this kmemleak warning a
> >> lot:
> >
> > Thanks Kalle, that was helpful. The device is not being clean-up
> > automatically.
> >
> > Chatting with Jakub, he suggested coming back to the original approach,
> > but, adding a additional patch, at the free_netdev().
> >
> > Would you mind running another test, please?
> >
> > 	https://github.com/leitao/linux/tree/wireless-dummy_v3
> >
> > The branch above is basically the original branch (as in this patch
> > series), with this additional patch:
> >
> > 	Author: Breno Leitao <leitao@debian.org>
> > 	Date:   Mon Apr 8 11:37:32 2024 -0700
> >
> > 	    net: free_netdev: exit earlier if dummy
> 
> I tested with the same ath11k hardware and this one passes all my
> (simple) ath11k tests, no issues found. I used this commit:
> 
> 1c10aebaa8ce net: free_netdev: exit earlier if dummy

Thank you so much for the test.

I will respin a v2 of this patchset with the additional patch included.

Kalle Valo April 9, 2024, 11:40 a.m. UTC | #6
Breno Leitao <leitao@debian.org> writes:

> On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote:
>
>> Breno Leitao <leitao@debian.org> writes:
>> 
>> >> > Reading the issue, I am afraid that freeing netdev explicitly
>> >> > (free_netdev()) might not be the best approach at the exit path.
>> >> >
>> >> > I would like to try to leverage the ->needs_free_netdev netdev
>> >> > mechanism to do the clean-up, if that makes sense. I've updated the
>> >> > ath11k patch, and I am curious if that is what we want.
>> >> >
>> >> > Would you mind testing a net patch I have, please?
>> >> >
>> >> >   https://github.com/leitao/linux/tree/wireless-dummy_v2
>> >> 
>> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
>> >> 
>> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
>> >> 
>> >> It passes my tests and doesn't crash, but I see this kmemleak warning a
>> >> lot:
>> >
>> > Thanks Kalle, that was helpful. The device is not being clean-up
>> > automatically.
>> >
>> > Chatting with Jakub, he suggested coming back to the original approach,
>> > but, adding a additional patch, at the free_netdev().
>> >
>> > Would you mind running another test, please?
>> >
>> > 	https://github.com/leitao/linux/tree/wireless-dummy_v3
>> >
>> > The branch above is basically the original branch (as in this patch
>> > series), with this additional patch:
>> >
>> > 	Author: Breno Leitao <leitao@debian.org>
>> > 	Date:   Mon Apr 8 11:37:32 2024 -0700
>> >
>> > 	    net: free_netdev: exit earlier if dummy
>> 
>> I tested with the same ath11k hardware and this one passes all my
>> (simple) ath11k tests, no issues found. I used this commit:
>> 
>> 1c10aebaa8ce net: free_netdev: exit earlier if dummy
>
> Thank you so much for the test.
>
> I will respin a v2 of this patchset with the additional patch included.

Sounds good. Feel free to add:

Tested-by: Kalle Valo <kvalo@kernel.org>