Message ID | 20240405122123.4156104-1-leitao@debian.org |
---|---|
Series | wifi: Un-embed ath10k and ath11k dummy netdev |
Hello Kalle,

On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
>
> > struct net_device shouldn't be embedded into any structure, instead,
> > the owner should use the private space to embed their state into
> > net_device.
> >
> > This patch set fixes the problem above for ath10k and ath11k. This also
> > fixes the conversion of qtnfmac driver to the new helper.
> >
> > This patch set depends on a series that is still under review:
> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
> >
> > If it helps, I've pushed the tree to
> > https://github.com/leitao/linux/tree/wireless-dummy
> >
> > PS: Due to lack of hardware, unfortunately all these patches are
> > compiled tested only.
> >
> > Breno Leitao (3):
> >   wifi: qtnfmac: Use netdev dummy allocator helper
> >   wifi: ath10k: allocate dummy net_device dynamically
> >   wifi: ath11k: allocate dummy net_device dynamically
>
> Thanks for setting up the branch, that makes the testing very easy. I
> now tested the branch using the commit below with ath11k WCN6855 hw2.0
> on an x86 box:
>
> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
>
> But unfortunately it crashes, the stack trace below. I can easily test
> your branches, just let me know what to test. A direct 'git pull'
> command is the best.

Thanks for the test.

Reading the issue, I am afraid that freeing netdev explicitly
(free_netdev()) might not be the best approach at the exit path.

I would like to try to leverage the ->needs_free_netdev netdev
mechanism to do the clean-up, if that makes sense. I've updated the
ath11k patch, and I am curious if that is what we want.

Would you mind testing a net patch I have, please?

https://github.com/leitao/linux/tree/wireless-dummy_v2

PS: I didn't update the other drivers (ath10k, qtnfmac, etc).

Thank you!
Breno Leitao <leitao@debian.org> writes:

> Hello Kalle,
>
> On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
>> Breno Leitao <leitao@debian.org> writes:
>>
>> > struct net_device shouldn't be embedded into any structure, instead,
>> > the owner should use the private space to embed their state into
>> > net_device.
>> >
>> > This patch set fixes the problem above for ath10k and ath11k. This also
>> > fixes the conversion of qtnfmac driver to the new helper.
>> >
>> > This patch set depends on a series that is still under review:
>> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
>> >
>> > If it helps, I've pushed the tree to
>> > https://github.com/leitao/linux/tree/wireless-dummy
>> >
>> > PS: Due to lack of hardware, unfortunately all these patches are
>> > compiled tested only.
>> >
>> > Breno Leitao (3):
>> >   wifi: qtnfmac: Use netdev dummy allocator helper
>> >   wifi: ath10k: allocate dummy net_device dynamically
>> >   wifi: ath11k: allocate dummy net_device dynamically
>>
>> Thanks for setting up the branch, that makes the testing very easy. I
>> now tested the branch using the commit below with ath11k WCN6855 hw2.0
>> on an x86 box:
>>
>> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
>>
>> But unfortunately it crashes, the stack trace below. I can easily test
>> your branches, just let me know what to test. A direct 'git pull'
>> command is the best.
>
> Thanks for the test.
>
> Reading the issue, I am afraid that freeing netdev explicitly
> (free_netdev()) might not be the best approach at the exit path.
>
> I would like to try to leverage the ->needs_free_netdev netdev
> mechanism to do the clean-up, if that makes sense. I've updated the
> ath11k patch, and I am curious if that is what we want.
>
> Would you mind testing a net patch I have, please?
>
> https://github.com/leitao/linux/tree/wireless-dummy_v2

I tested this again with my WCN6855 hw2.0 x86 test box on this commit:

a87674ac820e wifi: ath11k: allocate dummy net_device dynamically

It passes my tests and doesn't crash, but I see this kmemleak warning a
lot:

unreferenced object 0xffff888127109400 (size 128):
  comm "insmod", pid 2813, jiffies 4294926528
  hex dump (first 32 bytes):
    d0 93 d5 0a 81 88 ff ff d0 93 d5 0a 81 88 ff ff  ................
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 870e4f12):
    [<ffffffff99bcd375>] kmemleak_alloc+0x45/0x80
    [<ffffffff975707a8>] kmalloc_trace+0x278/0x2c0
    [<ffffffff992904c5>] __hw_addr_create+0x55/0x260
    [<ffffffff992909cb>] __hw_addr_add_ex+0x2fb/0x6d0
    [<ffffffff99294004>] dev_addr_init+0x144/0x230
    [<ffffffff992629ee>] alloc_netdev_mqs+0x12e/0xfe0
    [<ffffffff992638c5>] alloc_netdev_dummy+0x25/0x30
    [<ffffffffc0b6b0cd>] ath11k_pcic_ext_irq_config+0x1ad/0xc10 [ath11k]
    [<ffffffffc0b6c431>] ath11k_pcic_config_irq+0x2f1/0x4b0 [ath11k]
    [<ffffffffc0cb8314>] ath11k_pci_probe+0x874/0x1210 [ath11k_pci]
    [<ffffffff97febf06>] local_pci_probe+0xd6/0x180
    [<ffffffff97feefaa>] pci_call_probe+0x15a/0x400
    [<ffffffff97ff03d6>] pci_device_probe+0xa6/0x100
    [<ffffffff98abe315>] really_probe+0x1d5/0x920
    [<ffffffff98abed48>] __driver_probe_device+0x2e8/0x3f0
    [<ffffffff98abee9a>] driver_probe_device+0x4a/0x140
On Mon, Apr 08, 2024 at 07:43:42PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
> > On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote:
> >> Breno Leitao <leitao@debian.org> writes:
> >>
> >> > struct net_device shouldn't be embedded into any structure, instead,
> >> > the owner should use the private space to embed their state into
> >> > net_device.
> >> >
> >> > This patch set fixes the problem above for ath10k and ath11k. This also
> >> > fixes the conversion of qtnfmac driver to the new helper.
> >> >
> >> > This patch set depends on a series that is still under review:
> >> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t
> >> >
> >> > If it helps, I've pushed the tree to
> >> > https://github.com/leitao/linux/tree/wireless-dummy
> >> >
> >> > PS: Due to lack of hardware, unfortunately all these patches are
> >> > compiled tested only.
> >> >
> >> > Breno Leitao (3):
> >> >   wifi: qtnfmac: Use netdev dummy allocator helper
> >> >   wifi: ath10k: allocate dummy net_device dynamically
> >> >   wifi: ath11k: allocate dummy net_device dynamically
> >>
> >> Thanks for setting up the branch, that makes the testing very easy. I
> >> now tested the branch using the commit below with ath11k WCN6855 hw2.0
> >> on an x86 box:
> >>
> >> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically
> >>
> >> But unfortunately it crashes, the stack trace below. I can easily test
> >> your branches, just let me know what to test. A direct 'git pull'
> >> command is the best.
> >
> > Thanks for the test.
> >
> > Reading the issue, I am afraid that freeing netdev explicitly
> > (free_netdev()) might not be the best approach at the exit path.
> >
> > I would like to try to leverage the ->needs_free_netdev netdev
> > mechanism to do the clean-up, if that makes sense. I've updated the
> > ath11k patch, and I am curious if that is what we want.
> >
> > Would you mind testing a net patch I have, please?
> >
> > https://github.com/leitao/linux/tree/wireless-dummy_v2
>
> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
>
> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
>
> It passes my tests and doesn't crash, but I see this kmemleak warning a
> lot:

Thanks Kalle, that was helpful. The device is not being cleaned up
automatically.

Chatting with Jakub, he suggested going back to the original approach,
but adding an additional patch to free_netdev().

Would you mind running another test, please?

https://github.com/leitao/linux/tree/wireless-dummy_v3

The branch above is basically the original branch (as in this patch
series), with this additional patch:

  Author: Breno Leitao <leitao@debian.org>
  Date:   Mon Apr 8 11:37:32 2024 -0700

    net: free_netdev: exit earlier if dummy

    For dummy devices, exit earlier at free_netdev() instead of executing
    the whole function. This is necessary, because dummy devices are
    special, and shouldn't have the second part of the function executed.

    Otherwise reg_state, which is NETREG_DUMMY, will be overwritten and
    there will be no way to identify that this is a dummy device. Also, this
    device do not need the final put_device(), since dummy devices are not
    registered (through register_netdevice()), where the device reference is
    increased (at netdev_register_kobject()/device_add()).

    Suggested-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Breno Leitao <leitao@debian.org>

diff --git a/net/core/dev.c b/net/core/dev.c
index 2b82bd1cd2f8..5d2cb97d0ae6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11058,7 +11058,8 @@ void free_netdev(struct net_device *dev)
 	dev->xdp_bulkq = NULL;
 
 	/* Compatibility with error handling in drivers */
-	if (dev->reg_state == NETREG_UNINITIALIZED) {
+	if (dev->reg_state == NETREG_UNINITIALIZED ||
+	    dev->reg_state == NETREG_DUMMY) {
 		netdev_freemem(dev);
 		return;
 	}
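For readers following along, the effect of the early exit described in the commit message above can be modeled in plain userspace C. This is only a rough sketch under stated assumptions, not kernel code: struct model_netdev, model_free_netdev(), the device_refs counter and the mem_freed flag are all hypothetical stand-ins, and only the reg_state values relevant to this thread are modeled. It shows that a dummy device takes the early-return path, so its reg_state is not overwritten and no device reference is dropped.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's registration states.
 * Assumption: only the states relevant to this thread are modeled. */
enum model_reg_state {
	NETREG_UNINITIALIZED,
	NETREG_UNREGISTERED,
	NETREG_RELEASED,
	NETREG_DUMMY,
};

/* Hypothetical userspace model of a net_device; not the kernel struct. */
struct model_netdev {
	enum model_reg_state reg_state;
	int device_refs;	/* models the reference taken at registration */
	bool mem_freed;		/* set instead of really freeing, for inspection */
};

/* Models free_netdev() with the proposed check applied.
 * Returns true when the early-exit path was taken. */
static bool model_free_netdev(struct model_netdev *dev)
{
	/* Never-registered and dummy devices: free the memory and stop. */
	if (dev->reg_state == NETREG_UNINITIALIZED ||
	    dev->reg_state == NETREG_DUMMY) {
		dev->mem_freed = true;	/* models netdev_freemem() */
		return true;
	}

	/* Second part of the function: overwrite the state and drop the
	 * reference that registration took (models put_device()). */
	dev->reg_state = NETREG_RELEASED;
	dev->device_refs--;
	if (dev->device_refs == 0)
		dev->mem_freed = true;
	return false;
}
```

In this model, a device with reg_state set to NETREG_DUMMY and device_refs at 0 keeps both values after the call, while a previously registered device (NETREG_UNREGISTERED, one reference) ends up NETREG_RELEASED with the reference dropped.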
Breno Leitao <leitao@debian.org> writes:

>> > Reading the issue, I am afraid that freeing netdev explicitly
>> > (free_netdev()) might not be the best approach at the exit path.
>> >
>> > I would like to try to leverage the ->needs_free_netdev netdev
>> > mechanism to do the clean-up, if that makes sense. I've updated the
>> > ath11k patch, and I am curious if that is what we want.
>> >
>> > Would you mind testing a net patch I have, please?
>> >
>> > https://github.com/leitao/linux/tree/wireless-dummy_v2
>>
>> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
>>
>> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
>>
>> It passes my tests and doesn't crash, but I see this kmemleak warning a
>> lot:
>
> Thanks Kalle, that was helpful. The device is not being cleaned up
> automatically.
>
> Chatting with Jakub, he suggested going back to the original approach,
> but adding an additional patch to free_netdev().
>
> Would you mind running another test, please?
>
> https://github.com/leitao/linux/tree/wireless-dummy_v3
>
> The branch above is basically the original branch (as in this patch
> series), with this additional patch:
>
> Author: Breno Leitao <leitao@debian.org>
> Date: Mon Apr 8 11:37:32 2024 -0700
>
> net: free_netdev: exit earlier if dummy

I tested with the same ath11k hardware and this one passes all my
(simple) ath11k tests, no issues found. I used this commit:

1c10aebaa8ce net: free_netdev: exit earlier if dummy
On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote:
> Breno Leitao <leitao@debian.org> writes:
>
> >> > Reading the issue, I am afraid that freeing netdev explicitly
> >> > (free_netdev()) might not be the best approach at the exit path.
> >> >
> >> > I would like to try to leverage the ->needs_free_netdev netdev
> >> > mechanism to do the clean-up, if that makes sense. I've updated the
> >> > ath11k patch, and I am curious if that is what we want.
> >> >
> >> > Would you mind testing a net patch I have, please?
> >> >
> >> > https://github.com/leitao/linux/tree/wireless-dummy_v2
> >>
> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
> >>
> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
> >>
> >> It passes my tests and doesn't crash, but I see this kmemleak warning a
> >> lot:
> >
> > Thanks Kalle, that was helpful. The device is not being cleaned up
> > automatically.
> >
> > Chatting with Jakub, he suggested going back to the original approach,
> > but adding an additional patch to free_netdev().
> >
> > Would you mind running another test, please?
> >
> > https://github.com/leitao/linux/tree/wireless-dummy_v3
> >
> > The branch above is basically the original branch (as in this patch
> > series), with this additional patch:
> >
> > Author: Breno Leitao <leitao@debian.org>
> > Date: Mon Apr 8 11:37:32 2024 -0700
> >
> > net: free_netdev: exit earlier if dummy
>
> I tested with the same ath11k hardware and this one passes all my
> (simple) ath11k tests, no issues found. I used this commit:
>
> 1c10aebaa8ce net: free_netdev: exit earlier if dummy

Thank you so much for the test.

I will respin a v2 of this patchset with the additional patch included.
Breno Leitao <leitao@debian.org> writes:

> On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote:
>
>> Breno Leitao <leitao@debian.org> writes:
>>
>> >> > Reading the issue, I am afraid that freeing netdev explicitly
>> >> > (free_netdev()) might not be the best approach at the exit path.
>> >> >
>> >> > I would like to try to leverage the ->needs_free_netdev netdev
>> >> > mechanism to do the clean-up, if that makes sense. I've updated the
>> >> > ath11k patch, and I am curious if that is what we want.
>> >> >
>> >> > Would you mind testing a net patch I have, please?
>> >> >
>> >> > https://github.com/leitao/linux/tree/wireless-dummy_v2
>> >>
>> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit:
>> >>
>> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically
>> >>
>> >> It passes my tests and doesn't crash, but I see this kmemleak warning a
>> >> lot:
>> >
>> > Thanks Kalle, that was helpful. The device is not being cleaned up
>> > automatically.
>> >
>> > Chatting with Jakub, he suggested going back to the original approach,
>> > but adding an additional patch to free_netdev().
>> >
>> > Would you mind running another test, please?
>> >
>> > https://github.com/leitao/linux/tree/wireless-dummy_v3
>> >
>> > The branch above is basically the original branch (as in this patch
>> > series), with this additional patch:
>> >
>> > Author: Breno Leitao <leitao@debian.org>
>> > Date: Mon Apr 8 11:37:32 2024 -0700
>> >
>> > net: free_netdev: exit earlier if dummy
>>
>> I tested with the same ath11k hardware and this one passes all my
>> (simple) ath11k tests, no issues found. I used this commit:
>>
>> 1c10aebaa8ce net: free_netdev: exit earlier if dummy
>
> Thank you so much for the test.
>
> I will respin a v2 of this patchset with the additional patch included.

Sounds good. Feel free to add:

Tested-by: Kalle Valo <kvalo@kernel.org>