Message ID | 20211129101309.2931-1-quic_wgong@quicinc.com |
---|---|
State | New |
Series | ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 |
On Monday, 29 November 2021 11:13:09 CET Wen Gong wrote:
> Currently mac80211 will send 3 scan request for each scan of WCN6855,
> they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will
> cache the RNR IE(Reduced Neighbor Report element) which exist in the
> beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz,
> and then use the cache to scan in 6 GHz band scan if the 6 GHz scan
> is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to
> search more AP of 6 GHz. Also it will decrease the time cost of scan
> because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it
> means the 2.4 GHz and 5 GHz scans are doing simultaneously.
>
> Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since
> it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means
> all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

I've tested this on ath-next on commit a93789ae541c ("ath11k: Avoid NULL ptr
access during mgmt tx cleanup") with a WCN6856 card (EmWicon/jjplus WMX7205)
with firmware WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1. ath-next
was required for me because 32 MSI vectors are not available on the
used system.

Without this patch, it works fine. With patch, I just have to connect to an AP
via wpa_supplicant to crash the system. See the attached x86-64 .config, the
stacktrace and the decoded stacktrace.

Kind regards, Sven [ 51.095079] general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI [ 51.105795] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 [ 51.112157] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 [ 51.118339] RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) [ 51.123061] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00 All code ======== 0: 4d 85 ed test %r13,%r13 3: 74 4b je 0x50 5: 41 8b 85 bc 00 00 00 mov 0xbc(%r13),%eax c: 49 03 85 c0 00 00 00 add 0xc0(%r13),%rax 13: 0f b6 10 movzbl (%rax),%edx 16: f6 c2 01 test $0x1,%dl 19: 74 35 je 0x50 1b: 48 8b 70 28 mov 0x28(%rax),%rsi 1f: 48 85 f6 test %rsi,%rsi 22: 74 2c je 0x50 24: 40 f6 c6 01 test $0x1,%sil 28: 75 21 jne 0x4b 2a:* 48 8b 06 mov (%rsi),%rax <-- trapping instruction 2d: ba 01 00 00 00 mov $0x1,%edx 32: 4c 89 ef mov %r13,%rdi 35: 0f ae e8 lfence 38: ff d0 callq *%rax 3a: 41 rex.B 3b: 8b .byte 0x8b 3c: 85 .byte 0x85 3d: bc .byte 0xbc ... Code starting with the faulting instruction =========================================== 0: 48 8b 06 mov (%rsi),%rax 3: ba 01 00 00 00 mov $0x1,%edx 8: 4c 89 ef mov %r13,%rdi b: 0f ae e8 lfence e: ff d0 callq *%rax 10: 41 rex.B 11: 8b .byte 0x8b 12: 85 .byte 0x85 13: bc .byte 0xbc ... 
[ 51.141815] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246 [ 51.147049] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000 [ 51.154189] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900 [ 51.161323] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8 [ 51.168465] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0 [ 51.175605] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005 [ 51.182740] FS: 0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000 [ 51.190832] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.196578] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0 [ 51.203713] Call Trace: [ 51.206170] <IRQ> [ 51.208196] consume_skb (net/core/skbuff.c:757 net/core/skbuff.c:912 net/core/skbuff.c:906) [ 51.211620] ath11k_ce_tx_process_cb+0x157/0x220 ath11k [ 51.217177] ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:437 drivers/net/wireless/ath/ath11k/ce.c:675) ath11k [ 51.223130] ? 
_raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 51.227680] ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:633) ath11k_pci [ 51.233095] tasklet_action_common.constprop.0 (./arch/x86/include/asm/bitops.h:75 ./include/asm-generic/bitops/instrumented-atomic.h:42 kernel/softirq.c:879 kernel/softirq.c:787) [ 51.238425] __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) [ 51.242023] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) [ 51.245780] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) [ 51.249638] </IRQ> [ 51.251743] <TASK> [ 51.253850] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) [ 51.258044] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) [ 51.263026] Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d All code ======== 0: 31 ff xor %edi,%edi 2: e8 d9 c6 9e ff callq 0xffffffffff9ec6e0 7: 45 84 ff test %r15b,%r15b a: 74 17 je 0x23 c: 9c pushfq d: 58 pop %rax e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 13: f6 c4 02 test $0x2,%ah 16: 0f 85 78 02 00 00 jne 0x294 1c: 31 ff xor %edi,%edi 1e: e8 bd 97 a5 ff callq 0xffffffffffa597e0 23: fb sti 24: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2a:* 45 85 f6 test %r14d,%r14d <-- trapping instruction 2d: 0f 88 11 01 00 00 js 0x144 33: 49 63 c6 movslq %r14d,%rax 36: 4c 2b 2c 24 sub (%rsp),%r13 3a: 48 8d 14 40 lea (%rax,%rax,2),%rdx 3e: 48 rex.W 3f: 8d .byte 0x8d Code starting with the faulting instruction =========================================== 0: 45 85 f6 test %r14d,%r14d 3: 0f 88 11 01 00 00 js 0x11a 9: 49 63 c6 movslq %r14d,%rax 
c: 4c 2b 2c 24 sub (%rsp),%r13 10: 48 8d 14 40 lea (%rax,%rax,2),%rdx 14: 48 rex.W 15: 8d .byte 0x8d [ 51.281781] RSP: 0018:ffffffffb4e03e60 EFLAGS: 00000246 [ 51.287017] RAX: ffff9a9d1ac00000 RBX: 0000000000000002 RCX: 000000000000001f [ 51.294157] RDX: 0000000000000000 RSI: ffffffffb494bd50 RDI: ffffffffb4927def [ 51.301290] RBP: ffff9a9d0151b000 R08: 0000000be57e1147 R09: 0000000000000018 [ 51.308424] R10: 0000000000000ed3 R11: 0000000000002406 R12: ffffffffb4fd05c0 [ 51.315565] R13: 0000000be57e1147 R14: 0000000000000002 R15: 0000000000000000 [ 51.322716] cpuidle_enter (drivers/cpuidle/cpuidle.c:353) [ 51.326305] do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) [ 51.329547] cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) [ 51.333479] start_kernel (init/main.c:1137) [ 51.337156] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) [ 51.342228] </TASK> [ 51.344424] Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd ccp cfg80211 jitterentropy_rng rng_core sha512_ssse3 evdev sha512_generic kvm snd_pcm snd_timer ctr leds_apu drbg snd ansi_cprng sg irqbypass ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy [ 51.403181] ---[ end trace 5511b9c3dbb0841e ]--- [ 51.407861] RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) [ 51.412592] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00 All code ======== 0: 4d 85 ed test %r13,%r13 3: 
74 4b je 0x50 5: 41 8b 85 bc 00 00 00 mov 0xbc(%r13),%eax c: 49 03 85 c0 00 00 00 add 0xc0(%r13),%rax 13: 0f b6 10 movzbl (%rax),%edx 16: f6 c2 01 test $0x1,%dl 19: 74 35 je 0x50 1b: 48 8b 70 28 mov 0x28(%rax),%rsi 1f: 48 85 f6 test %rsi,%rsi 22: 74 2c je 0x50 24: 40 f6 c6 01 test $0x1,%sil 28: 75 21 jne 0x4b 2a:* 48 8b 06 mov (%rsi),%rax <-- trapping instruction 2d: ba 01 00 00 00 mov $0x1,%edx 32: 4c 89 ef mov %r13,%rdi 35: 0f ae e8 lfence 38: ff d0 callq *%rax 3a: 41 rex.B 3b: 8b .byte 0x8b 3c: 85 .byte 0x85 3d: bc .byte 0xbc ... Code starting with the faulting instruction =========================================== 0: 48 8b 06 mov (%rsi),%rax 3: ba 01 00 00 00 mov $0x1,%edx 8: 4c 89 ef mov %r13,%rdi b: 0f ae e8 lfence e: ff d0 callq *%rax 10: 41 rex.B 11: 8b .byte 0x8b 12: 85 .byte 0x85 13: bc .byte 0xbc ... [ 51.431366] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246 [ 51.436623] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000 [ 51.443782] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900 [ 51.450939] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8 [ 51.458099] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0 [ 51.465256] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005 [ 51.472416] FS: 0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000 [ 51.480528] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.486299] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0 [ 51.493459] Kernel panic - not syncing: Fatal exception in interrupt [ 51.499831] Kernel Offset: 0x32800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 51.510610] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- [ 51.095079] general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI [ 51.105795] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 [ 51.112157] Hardware name: 
PC Engines APU/APU, BIOS 4.0 09/08/2014 [ 51.118339] RIP: 0010:skb_release_data+0x81/0x170 [ 51.123061] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00 [ 51.141815] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246 [ 51.147049] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000 [ 51.154189] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900 [ 51.161323] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8 [ 51.168465] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0 [ 51.175605] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005 [ 51.182740] FS: 0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000 [ 51.190832] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.196578] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0 [ 51.203713] Call Trace: [ 51.206170] <IRQ> [ 51.208196] consume_skb+0x39/0xb0 [ 51.211620] ath11k_ce_tx_process_cb+0x157/0x220 [ath11k] [ 51.217177] ath11k_ce_per_engine_service+0x3c0/0x3d0 [ath11k] [ 51.223130] ? 
_raw_spin_lock_irqsave+0x26/0x50 [ 51.227680] ath11k_pci_ce_tasklet+0x1c/0x40 [ath11k_pci] [ 51.233095] tasklet_action_common.constprop.0+0xaf/0xe0 [ 51.238425] __do_softirq+0xec/0x2e9 [ 51.242023] __irq_exit_rcu+0xbc/0x110 [ 51.245780] common_interrupt+0xb8/0xd0 [ 51.249638] </IRQ> [ 51.251743] <TASK> [ 51.253850] asm_common_interrupt+0x1e/0x40 [ 51.258044] RIP: 0010:cpuidle_enter_state+0xda/0x370 [ 51.263026] Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d [ 51.281781] RSP: 0018:ffffffffb4e03e60 EFLAGS: 00000246 [ 51.287017] RAX: ffff9a9d1ac00000 RBX: 0000000000000002 RCX: 000000000000001f [ 51.294157] RDX: 0000000000000000 RSI: ffffffffb494bd50 RDI: ffffffffb4927def [ 51.301290] RBP: ffff9a9d0151b000 R08: 0000000be57e1147 R09: 0000000000000018 [ 51.308424] R10: 0000000000000ed3 R11: 0000000000002406 R12: ffffffffb4fd05c0 [ 51.315565] R13: 0000000be57e1147 R14: 0000000000000002 R15: 0000000000000000 [ 51.322716] cpuidle_enter+0x29/0x40 [ 51.326305] do_idle+0x200/0x2b0 [ 51.329547] cpu_startup_entry+0x19/0x20 [ 51.333479] start_kernel+0x6b7/0x6dc [ 51.337156] secondary_startup_64_no_verify+0xb0/0xbb [ 51.342228] </TASK> [ 51.344424] Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd ccp cfg80211 jitterentropy_rng rng_core sha512_ssse3 evdev sha512_generic kvm snd_pcm snd_timer ctr leds_apu drbg snd ansi_cprng sg irqbypass ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy [ 51.403181] ---[ end trace 
5511b9c3dbb0841e ]--- [ 51.407861] RIP: 0010:skb_release_data+0x81/0x170 [ 51.412592] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00 [ 51.431366] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246 [ 51.436623] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000 [ 51.443782] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900 [ 51.450939] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8 [ 51.458099] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0 [ 51.465256] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005 [ 51.472416] FS: 0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000 [ 51.480528] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 51.486299] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0 [ 51.493459] Kernel panic - not syncing: Fatal exception in interrupt [ 51.499831] Kernel Offset: 0x32800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 51.510610] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
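
The oops above reports a "non-canonical address 0x408210000b231a", which is the value held in RSI when the trapping `mov (%rsi),%rax` executes. The following is a minimal sketch, not part of the original mails, of how such a pointer can be classified. It assumes 48-bit virtual addresses (4-level paging), which the thread does not state explicitly; the helper name `is_canonical` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 address.

    Assumption: va_bits bits of virtual address are implemented, so
    bits 63..(va_bits - 1) must be all zeros or all ones.
    """
    addr &= MASK64
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


# Values copied from the register dump in the oops above.
rsi = 0x00408210000B231A  # pointer dereferenced by the trapping instruction
rdi = 0xFFFF9A9D01162900  # ordinary kernel pointer from the same dump
cr2 = 0x000055B14EF3A778  # user-space address from CR2

for name, value in (("RSI", rsi), ("RDI", rdi), ("CR2", cr2)):
    print(f"{name} {value:#018x} canonical={is_canonical(value)}")
```

Under that assumption only RSI fails the check, which is consistent with the general protection fault being raised on the dereference of the `uarg` pointer rather than on the skb itself.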
On 12/3/2021 10:09 PM, Sven Eckelmann wrote:
> On Monday, 29 November 2021 11:13:09 CET Wen Gong wrote:
...
> I've tested this on ath-next on commit a93789ae541c ("ath11k: Avoid NULL ptr
> access during mgmt tx cleanup") with a WCN6856 card (EmWicon/jjplus WMX7205)
> with firmware WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1. ath-next
> was required for me because 32 MSI vectors are not available on the
> used system.
>
> Without this patch, it works fine. With patch, I just have to connect to an AP
> via wpa_supplicant to crash the system. See the attached x86-64 .config, the
> stacktrace and the decoded stacktrace.

I did test in my setup, not see the crash.

I am afraid you also need this patch("ath11k: change to use dynamic
memory for channel list of scan",

https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
)

Could you apply this patch and try again?

> Kind regards,
> Sven

On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
[...]
> I did test in my setup, not see the crash.
>
> I am afraid you also need this patch("ath11k: change to use dynamic
> memory for channel list of scan",
>
> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
> )
>
> Could you apply this patch and try again?

Tried it and I see the same problem.

Kind regards,
Sven

On 12/6/2021 2:56 PM, Sven Eckelmann wrote:
> On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
> [...]
>> I did test in my setup, not see the crash.
>>
>> I am afraid you also need this patch("ath11k: change to use dynamic
>> memory for channel list of scan",
>>
>> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
>> )
>>
>> Could you apply this patch and try again?
> Tried it and I see the same problem.

Could you tell what is your test steps?

>
> Kind regards,
> Sven

On Monday, 6 December 2021 08:10:40 CET Wen Gong wrote:
> > On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
> > [...]
> >> I did test in my setup, not see the crash.
> >>
> >> I am afraid you also need this patch("ath11k: change to use dynamic
> >> memory for channel list of scan",
> >>
> >> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
> >> )
> >>
> >> Could you apply this patch and try again?
> > Tried it and I see the same problem.
> Could you tell what is your test steps?

Start the kernel with commit a93789ae541c ("ath11k: Avoid NULL ptr access during
mgmt tx cleanup") + patches:

* ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855
* ath11k: change to use dynamic memory for channel list of scan

You can find the config in the first mail. But I have now enabled KASAN inline
to hopefully create some better error messages.

The firmware + board data (see mail "ath11k: incorrect board_id retrieval")
was prepared like this:

    git clone https://github.com/kvalo/ath11k-firmware /root/ath11k-firmware
    mkdir -p /lib/firmware/ath11k/WCN6855/hw2.0/
    cp /root/ath11k-firmware/WCN6855/hw2.0/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/
    cp /root/ath11k-firmware/WCN6855/hw2.0/1.1/WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/

    git clone https://github.com/qca/qca-swiss-army-knife /root/qca-swiss-army-knife
    apt install python2
    python2 /root/qca-swiss-army-knife/tools/scripts/ath11k/ath11k-bdencoder -e /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
    rm /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
    cp 'bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=266.bin' /lib/firmware/ath11k/WCN6855/hw2.0/board.bin

Then I just start up the device as usual and start wpa_supplicant (with
defconfig + CONFIG_MESH=y) from commit 14ab4a816c68 ("Reject ap_vendor_elements
if its length is odd"):

    cat << "EOF" > station_test.cfg
    network={
        ssid="MyTestAP"
        key_mgmt=WPA-PSK FT-PSK
        proto=RSN
        psk="testtest"
    }
    EOF

    ip link set up dev wlp6s0
    ~/hostap/wpa_supplicant/wpa_supplicant -D nl80211 -i wlp6s0 -c station_test.cfg

The actual SSID + PSK is valid and multiple access points (4) have this BSS on
2.4GHz + 5GHz.

So you are basically always calling dev_kfree_skb_any in ath11k_ce_tx_process_cb
because wcn6855 hw2.0 has credit_flow set. But it seems like one of the
entries returned by ath11k_ce_completed_send_next is bogus and causes these
problems during ath11k_ce_tx_process_cb. And for some reason, this is
triggered here by this firmware feature.

    ./scripts/faddr2line --list vmlinux consume_skb+0x9f/0x1c0
    consume_skb+0x9f/0x1c0:

    __kfree_skb at net/core/skbuff.c:757
     752     */
     753
     754    void __kfree_skb(struct sk_buff *skb)
     755    {
     756        skb_release_all(skb);
    >757<       kfree_skbmem(skb);
     758    }
     759    EXPORT_SYMBOL(__kfree_skb);
     760
     761    /**
     762     * kfree_skb - free an sk_buff

    (inlined by) consume_skb at net/core/skbuff.c:912
     907    {
     908        if (!skb_unref(skb))
     909            return;
     910
     911        trace_consume_skb(skb);
    >912<       __kfree_skb(skb);
     913    }
     914    EXPORT_SYMBOL(consume_skb);
     915    #endif
     916
     917    /**

    (inlined by) consume_skb at net/core/skbuff.c:906
     901     *
     902     * Drop a ref to the buffer and free it if the usage count has hit zero
     903     * Functions identically to kfree_skb, but kfree_skb assumes that the frame
     904     * is being dropped after a failure and notes that
     905     */
    >906<   void consume_skb(struct sk_buff *skb)
     907    {
     908        if (!skb_unref(skb))
     909            return;
     910
     911        trace_consume_skb(skb);

    ./scripts/faddr2line --list vmlinux skb_release_data+0x1b0/0x5c0
    skb_release_data+0x1b0/0x5c0:

    skb_zcopy_clear at include/linux/skbuff.h:1549
     1544   {
     1545       struct ubuf_info *uarg = skb_zcopy(skb);
     1546
     1547       if (uarg) {
     1548           if (!skb_zcopy_is_nouarg(skb))
    >1549<              uarg->callback(skb, uarg, zerocopy_success);
     1550
     1551           skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     1552       }
     1553   }
     1554

    (inlined by) skb_release_data at net/core/skbuff.c:669
     664        if (skb->cloned &&
     665            atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
     666                              &shinfo->dataref))
     667            goto exit;
     668
    >669<       skb_zcopy_clear(skb, true);
     670
     671        for (i = 0; i < shinfo->nr_frags; i++)
     672            __skb_frag_unref(&shinfo->frags[i], skb->pp_recycle);
     673
     674        if (shinfo->frag_list)

But I didn't like the inlined code. So I've changed the compilation flags
slightly:

    diff --git a/net/core/Makefile b/net/core/Makefile
    index 6bdcb2cafed8..5eda226c5f27 100644
    --- a/net/core/Makefile
    +++ b/net/core/Makefile
    @@ -37,3 +37,4 @@ obj-$(CONFIG_NET_SOCK_MSG) += skmsg.o
     obj-$(CONFIG_BPF_SYSCALL) += sock_map.o
     obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
     obj-$(CONFIG_OF) += of_net.o
    +ccflags-y += -fno-inline -O1 -fno-optimize-sibling-calls

Now the stacktrace is a lot more readable. And the returned crash location
makes a lot more sense:

    ./scripts/faddr2line --list vmlinux 'skb_zcopy_clear+0x34/0x8f'
    skb_zcopy_clear+0x34/0x8f:

    skb_zcopy_clear at include/linux/skbuff.h:1549
     1544   {
     1545       struct ubuf_info *uarg = skb_zcopy(skb);
     1546
     1547       if (uarg) {
     1548           if (!skb_zcopy_is_nouarg(skb))
    >1549<              uarg->callback(skb, uarg, zerocopy_success);
     1550
     1551           skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     1552       }
     1553   }
     1554

Or with the assembler:

    (gdb) disassemble /m *(skb_zcopy_clear+0x34/0x8f)
    Dump of assembler code for function skb_zcopy_clear:
    1544    {
       0x000000000000072a <+0>:     push   %r12
       0x000000000000072c <+2>:     push   %rbp
       0x000000000000072d <+3>:     push   %rbx
       0x000000000000072e <+4>:     mov    %rdi,%rbx
       0x0000000000000731 <+7>:     mov    %esi,%r12d

    1545        struct ubuf_info *uarg = skb_zcopy(skb);
       0x0000000000000734 <+10>:    call   0x5d3 <skb_zcopy>

    1546
    1547        if (uarg) {
       0x0000000000000739 <+15>:    test   %rax,%rax
       0x000000000000073c <+18>:    je     0x7a0 <skb_zcopy_clear+118>
       0x000000000000073e <+20>:    mov    %rax,%rbp

    1548            if (!skb_zcopy_is_nouarg(skb))
       0x0000000000000741 <+23>:    mov    %rbx,%rdi
       0x0000000000000744 <+26>:    call   0x6f6 <skb_zcopy_is_nouarg>
       0x0000000000000749 <+31>:    test   %al,%al
       0x000000000000074b <+33>:    jne    0x777 <skb_zcopy_clear+77>

    1549                uarg->callback(skb, uarg, zerocopy_success);
       0x000000000000074d <+35>:    mov    %rbp,%rdx
       0x0000000000000750 <+38>:    shr    $0x3,%rdx
       0x0000000000000754 <+42>:    movabs $0xdffffc0000000000,%rax
       0x000000000000075e <+52>:    cmpb   $0x0,(%rdx,%rax,1)
       0x0000000000000762 <+56>:    jne    0x7a5 <skb_zcopy_clear+123>
       0x0000000000000764 <+58>:    movzbl %r12b,%edx
       0x0000000000000768 <+62>:    mov    0x0(%rbp),%rax
       0x000000000000076c <+66>:    mov    %rbp,%rsi
       0x000000000000076f <+69>:    mov    %rbx,%rdi
       0x0000000000000772 <+72>:    call   0x777 <skb_zcopy_clear+77>
       0x00000000000007a5 <+123>:   mov    %rbp,%rdi
       0x00000000000007a8 <+126>:   call   0x7ad <skb_zcopy_clear+131>
       0x00000000000007ad <+131>:   jmp    0x764 <skb_zcopy_clear+58>

    1550
    1551            skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
       0x0000000000000777 <+77>:    mov    %rbx,%rdi
       0x000000000000077a <+80>:    call   0x518 <skb_end_pointer>
       0x000000000000077f <+85>:    mov    %rax,%rbx
       0x0000000000000782 <+88>:    mov    %rax,%rdx
       0x0000000000000785 <+91>:    shr    $0x3,%rdx
       0x0000000000000789 <+95>:    movabs $0xdffffc0000000000,%rax
       0x0000000000000793 <+105>:   movzbl (%rdx,%rax,1),%eax
       0x0000000000000797 <+109>:   test   %al,%al
       0x0000000000000799 <+111>:   je     0x79d <skb_zcopy_clear+115>
       0x000000000000079b <+113>:   jle    0x7af <skb_zcopy_clear+133>
       0x000000000000079d <+115>:   andb   $0xf8,(%rbx)
       0x00000000000007af <+133>:   mov    %rbx,%rdi
       0x00000000000007b2 <+136>:   call   0x7b7 <skb_zcopy_clear+141>
       0x00000000000007b7 <+141>:   jmp    0x79d <skb_zcopy_clear+115>

    1552        }
    1553    }
       0x00000000000007a0 <+118>:   pop    %rbx
       0x00000000000007a1 <+119>:   pop    %rbp
       0x00000000000007a2 <+120>:   pop    %r12
       0x00000000000007a4 <+122>:   ret

    End of assembler dump.

To make it even easier to read, just disable the inline KASAN and reduce the
optimization level for this function:

    diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
    index 059b6266dcd7..819cc58ab051 100644
    --- a/include/linux/skbuff.h
    +++ b/include/linux/skbuff.h
    @@ -1540,6 +1540,8 @@ static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
     }

     /* Release a reference on a zerocopy structure */
    +#pragma GCC push_options
    +#pragma GCC optimize ("O0")
     static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
     {
         struct ubuf_info *uarg = skb_zcopy(skb);
    @@ -1551,6 +1553,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
             skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
         }
     }
    +#pragma GCC pop_options

     static inline void skb_mark_not_on_list(struct sk_buff *skb)
     {

This creates this nice, unoptimized function which crashes at +63:

    $ gdb net/core/skbuff.o -q
    Reading symbols from net/core/skbuff.o...
    (gdb) disassemble /m *(skb_zcopy_clear+0x3f/0x70)
    Dump of assembler code for function skb_zcopy_clear:
    1546    {
       0x0000000000000000 <+0>:     push   %rbp
       0x0000000000000001 <+1>:     mov    %rsp,%rbp
       0x0000000000000004 <+4>:     sub    $0x18,%rsp
       0x0000000000000008 <+8>:     mov    %rdi,-0x10(%rbp)
       0x000000000000000c <+12>:    mov    %esi,%eax
       0x000000000000000e <+14>:    mov    %al,-0x14(%rbp)

    1547        struct ubuf_info *uarg = skb_zcopy(skb);
       0x0000000000000011 <+17>:    mov    -0x10(%rbp),%rax
       0x0000000000000015 <+21>:    mov    %rax,%rdi
       0x0000000000000018 <+24>:    call   0x29e <skb_zcopy>
       0x000000000000001d <+29>:    mov    %rax,-0x8(%rbp)

    1548
    1549        if (uarg) {
       0x0000000000000021 <+33>:    cmpq   $0x0,-0x8(%rbp)
       0x0000000000000026 <+38>:    je     0x6d <skb_zcopy_clear+109>

    1550            if (!skb_zcopy_is_nouarg(skb))
       0x0000000000000028 <+40>:    mov    -0x10(%rbp),%rax
       0x000000000000002c <+44>:    mov    %rax,%rdi
       0x000000000000002f <+47>:    call   0x2df <skb_zcopy_is_nouarg>
       0x0000000000000034 <+52>:    xor    $0x1,%eax
       0x0000000000000037 <+55>:    test   %al,%al
       0x0000000000000039 <+57>:    je     0x59 <skb_zcopy_clear+89>

    1551                uarg->callback(skb, uarg, zerocopy_success);
       0x000000000000003b <+59>:    mov    -0x8(%rbp),%rax
       0x000000000000003f <+63>:    mov    (%rax),%r8
       0x0000000000000042 <+66>:    movzbl -0x14(%rbp),%edx
       0x0000000000000046 <+70>:    mov    -0x8(%rbp),%rcx
       0x000000000000004a <+74>:    mov    -0x10(%rbp),%rax
       0x000000000000004e <+78>:    mov    %rcx,%rsi
       0x0000000000000051 <+81>:    mov    %rax,%rdi
       0x0000000000000054 <+84>:    call   0x59 <skb_zcopy_clear+89>

    1552
    1553            skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
       0x0000000000000059 <+89>:    mov    -0x10(%rbp),%rax
       0x000000000000005d <+93>:    mov    %rax,%rdi
       0x0000000000000060 <+96>:    call   0x27f <skb_end_pointer>
       0x0000000000000065 <+101>:   movzbl (%rax),%edx
       0x0000000000000068 <+104>:   and    $0xfffffff8,%edx
       0x000000000000006b <+107>:   mov    %dl,(%rax)

    1554        }
    1555    }
       0x000000000000006d <+109>:   nop
       0x000000000000006e <+110>:   leave
       0x000000000000006f <+111>:   ret

    End of assembler dump.

The question now: what is causing the unclean state of the skb, so that it is
not rejected by skb_zcopy_is_nouarg before the uarg callback is tried?

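
The KASAN-instrumented build checks the shadow byte of `uarg` before the load, which is the `shr $0x3` / `movabs $0xdffffc0000000000` / `cmpb` sequence in the optimized disassembly. That is why the KASAN report attached to this mail names a different fault address than the pointer in RSI. The following is a minimal sketch, not part of the original mails, of that address arithmetic. The two constants are read off the disassembly; the helper names are hypothetical.

```python
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # constant loaded by movabs in the disassembly
KASAN_SHADOW_SCALE_SHIFT = 3              # matches the "shr $0x3" before the cmpb
MASK64 = (1 << 64) - 1


def kasan_shadow_addr(addr: int) -> int:
    """Shadow byte address that the inline check reads for addr."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


def kasan_covered_range(shadow: int):
    """First and last address described by one shadow byte."""
    start = ((shadow - KASAN_SHADOW_OFFSET) & MASK64) << KASAN_SHADOW_SCALE_SHIFT
    start &= MASK64
    return start, start + (1 << KASAN_SHADOW_SCALE_SHIFT) - 1


uarg = 0x00408210000B231A  # RSI in both oops reports
shadow = kasan_shadow_addr(uarg)
print(hex(shadow))                                   # 0xe0080c4200016463
print([hex(a) for a in kasan_covered_range(shadow)])  # ['0x408210000b2318', '0x408210000b231f']
```

The computed shadow address matches the fault address in the KASAN report, and the covered range matches its "maybe wild-memory-access in range" line, so both reports point at the same bogus `uarg` value.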
Kind regards, Sven general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: maybe wild-memory-access in range [0x00408210000b2318-0x00408210000b231f] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #3 Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) Code: 00 00 48 8b 75 28 48 85 f6 0f 84 d2 00 00 00 40 f6 c6 01 0f 85 a3 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 d3 03 00 00 48 8b 06 ba 01 00 00 00 48 89 df 0f All code ======== 0: 00 00 add %al,(%rax) 2: 48 8b 75 28 mov 0x28(%rbp),%rsi 6: 48 85 f6 test %rsi,%rsi 9: 0f 84 d2 00 00 00 je 0xe1 f: 40 f6 c6 01 test $0x1,%sil 13: 0f 85 a3 00 00 00 jne 0xbc 19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 20: fc ff df 23: 48 89 f2 mov %rsi,%rdx 26: 48 c1 ea 03 shr $0x3,%rdx 2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2e: 0f 85 d3 03 00 00 jne 0x407 34: 48 8b 06 mov (%rsi),%rax 37: ba 01 00 00 00 mov $0x1,%edx 3c: 48 89 df mov %rbx,%rdi 3f: 0f .byte 0xf Code starting with the faulting instruction =========================================== 0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 4: 0f 85 d3 03 00 00 jne 0x3dd a: 48 8b 06 mov (%rsi),%rax d: ba 01 00 00 00 mov $0x1,%edx 12: 48 89 df mov %rbx,%rdi 15: 0f .byte 0xf RSP: 0018:ffff8880c7c09c50 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff888004c6bdc0 RCX: 1ffff1100076945d RDX: 0008104200016463 RSI: 00408210000b231a RDI: ffff888003b4a2e8 RBP: ffff888003b4a2c0 R08: 0000000000000000 R09: ffff888004c6be97 R10: ffffed100098d7d2 R11: 0000000000000001 R12: ffff888003b4a2c0 R13: ffff888004c6be7c R14: ffff88800c641e58 R15: ffff888004c6be80 FS: 0000000000000000(0000) GS:ffff8880c7c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d3a95f6778 CR3: 0000000017c20000 CR4: 00000000000006f0 Call Trace: <IRQ> ? 
_raw_write_lock_irq (kernel/locking/spinlock.c:177) consume_skb (net/core/skbuff.c:757 net/core/skbuff.c:912 net/core/skbuff.c:906) ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k ? __local_bh_enable_ip (./arch/x86/include/asm/preempt.h:103 kernel/softirq.c:390) ? ath11k_ce_alloc_pipes (drivers/net/wireless/ath/ath11k/ce.c:500) ath11k ? ath11k_hal_srng_access_end (drivers/net/wireless/ath/ath11k/hal.c:849) ath11k ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) ? __lock_text_start (kernel/locking/spinlock.c:161) ? ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:689) ath11k ? __wake_up_bit (kernel/sched/wait_bit.c:192) ? __irq_put_desc_unlock (kernel/irq/irqdesc.c:819) ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci ? 
tasklet_clear_sched (kernel/softirq.c:752) tasklet_action_common.constprop.0 (kernel/softirq.c:783) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d All code ======== 0: ff (bad) 1: e8 8e 95 db fe callq 0xfffffffffedb9594 6: 80 3c 24 00 cmpb $0x0,(%rsp) a: 74 17 je 0x23 c: 9c pushfq d: 58 pop %rax e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 13: f6 c4 02 test $0x2,%ah 16: 0f 85 8e 06 00 00 jne 0x6aa 1c: 31 ff xor %edi,%edi 1e: e8 a1 b9 ef fe callq 0xfffffffffeefb9c4 23: fb sti 24: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2a:* 45 85 ed test %r13d,%r13d <-- trapping instruction 2d: 0f 88 52 03 00 00 js 0x385 33: 4d 63 e5 movslq %r13d,%r12 36: 4b 8d 04 64 lea (%r12,%r12,2),%rax 3a: 49 8d 04 84 lea (%r12,%rax,4),%rax 3e: 48 rex.W 3f: 8d .byte 0x8d Code starting with the faulting instruction =========================================== 0: 45 85 ed test %r13d,%r13d 3: 0f 88 52 03 00 00 js 0x35b 9: 4d 63 e5 movslq %r13d,%r12 c: 4b 8d 04 64 lea (%r12,%r12,2),%rax 10: 49 8d 04 84 lea (%r12,%rax,4),%rax 14: 48 rex.W 15: 8d .byte 0x8d RSP: 0018:ffffffff89a07de0 EFLAGS: 00000246 RAX: dffffc0000000000 RBX: ffff888003b44000 RCX: 1ffffffff129775c RDX: 1ffff11018f88331 RSI: ffffffff89031b00 RDI: ffff8880c7c41988 RBP: ffffffff89ee0d20 R08: 0000000000000002 R09: ffff8880c7c41c2b R10: ffffed1018f88385 R11: 0000000000000001 R12: 0000000000000002 R13: 0000000000000002 R14: 00000024aa5bda97 R15: ffffffff89ee0e08 ? 
_raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194) ? tick_nohz_idle_stop_tick (./include/linux/hrtimer.h:419 kernel/time/tick-sched.c:920 kernel/time/tick-sched.c:1062 kernel/time/tick-sched.c:1083) cpuidle_enter (drivers/cpuidle/cpuidle.c:353) do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) ? arch_cpu_idle_exit+0x40/0x40 cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) start_kernel (init/main.c:1137) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) </TASK> Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl ccp btbcm rng_core btintel libarc4 evdev leds_apu bluetooth kvm snd_pcm snd_timer jitterentropy_rng cfg80211 snd sha512_ssse3 sha512_generic sg soundcore irqbypass ctr pcspkr drbg ansi_cprng k10temp ecdh_generic rfkill ecc sp5100_tco watchdog acpi_cpufreq button drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy ---[ end trace dc622588d92d6988 ]--- RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) Code: 00 00 48 8b 75 28 48 85 f6 0f 84 d2 00 00 00 40 f6 c6 01 0f 85 a3 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 d3 03 00 00 48 8b 06 ba 01 00 00 00 48 89 df 0f All code ======== 0: 00 00 add %al,(%rax) 2: 48 8b 75 28 mov 0x28(%rbp),%rsi 6: 48 85 f6 test %rsi,%rsi 9: 0f 84 d2 00 00 00 je 0xe1 f: 40 f6 c6 01 test $0x1,%sil 13: 0f 85 a3 00 00 00 jne 0xbc 19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 20: fc ff df 23: 48 89 f2 mov %rsi,%rdx 26: 48 c1 ea 03 shr $0x3,%rdx 2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2e: 0f 85 d3 03 00 00 jne 
0x407 34: 48 8b 06 mov (%rsi),%rax 37: ba 01 00 00 00 mov $0x1,%edx 3c: 48 89 df mov %rbx,%rdi 3f: 0f .byte 0xf Code starting with the faulting instruction =========================================== 0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 4: 0f 85 d3 03 00 00 jne 0x3dd a: 48 8b 06 mov (%rsi),%rax d: ba 01 00 00 00 mov $0x1,%edx 12: 48 89 df mov %rbx,%rdi 15: 0f .byte 0xf RSP: 0018:ffff8880c7c09c50 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff888004c6bdc0 RCX: 1ffff1100076945d RDX: 0008104200016463 RSI: 00408210000b231a RDI: ffff888003b4a2e8 RBP: ffff888003b4a2c0 R08: 0000000000000000 R09: ffff888004c6be97 R10: ffffed100098d7d2 R11: 0000000000000001 R12: ffff888003b4a2c0 R13: ffff888004c6be7c R14: ffff88800c641e58 R15: ffff888004c6be80 FS: 0000000000000000(0000) GS:ffff8880c7c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055d3a95f6778 CR3: 0000000017c20000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x5c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: maybe wild-memory-access in range [0x00408210000b2318-0x00408210000b231f] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1549) Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae All code ======== 0: e8 9a fe ff ff callq 0xfffffffffffffe9f 5: 48 85 c0 test %rax,%rax 8: 74 62 je 0x6c a: 48 89 c5 mov %rax,%rbp d: 48 89 df mov %rbx,%rdi 10: e8 ad ff ff ff callq 0xffffffffffffffc2 15: 84 c0 test %al,%al 17: 75 2a jne 0x43 19: 48 89 ea mov 
%rbp,%rdx 1c: 48 c1 ea 03 shr $0x3,%rdx 20: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 27: fc ff df 2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2e: 75 41 jne 0x71 30: 41 0f b6 d4 movzbl %r12b,%edx 34: 48 8b 45 00 mov 0x0(%rbp),%rax 38: 48 89 ee mov %rbp,%rsi 3b: 48 89 df mov %rbx,%rdi 3e: 0f .byte 0xf 3f: ae scas %es:(%rdi),%al Code starting with the faulting instruction =========================================== 0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 4: 75 41 jne 0x47 6: 41 0f b6 d4 movzbl %r12b,%edx a: 48 8b 45 00 mov 0x0(%rbp),%rax e: 48 89 ee mov %rbp,%rsi 11: 48 89 df mov %rbx,%rdi 14: 0f .byte 0xf 15: ae scas %es:(%rdi),%al RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8 RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17 R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001 R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0 Call Trace: <IRQ> skb_release_data (net/core/skbuff.c:671) skb_release_all (net/core/skbuff.c:743) __kfree_skb (net/core/skbuff.c:757) consume_skb (net/core/skbuff.c:912) __dev_kfree_skb_any (net/core/dev.c:3038) ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k ? __local_bh_enable_ip (./arch/x86/include/asm/preempt.h:103 kernel/softirq.c:390) ? ath11k_ce_alloc_pipes (drivers/net/wireless/ath/ath11k/ce.c:500) ath11k ? ath11k_hal_srng_access_end (drivers/net/wireless/ath/ath11k/hal.c:849) ath11k ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k ? 
_raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) ? __lock_text_start (kernel/locking/spinlock.c:161) ? ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:689) ath11k ? __wake_up_bit (kernel/sched/wait_bit.c:192) ? __irq_put_desc_unlock (kernel/irq/irqdesc.c:819) ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci ? tasklet_clear_sched (kernel/softirq.c:752) tasklet_action_common.constprop.0 (kernel/softirq.c:783) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d All code ======== 0: ff (bad) 1: e8 8e 95 db fe callq 0xfffffffffedb9594 6: 80 3c 24 00 cmpb $0x0,(%rsp) a: 74 17 je 0x23 c: 9c pushfq d: 58 pop %rax e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 13: f6 c4 02 test $0x2,%ah 16: 0f 85 8e 06 00 00 jne 0x6aa 1c: 31 ff xor %edi,%edi 1e: e8 a1 b9 ef fe callq 0xfffffffffeefb9c4 23: fb sti 24: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2a:* 45 85 ed test %r13d,%r13d <-- trapping instruction 2d: 0f 88 52 03 00 00 js 0x385 33: 4d 63 e5 movslq %r13d,%r12 36: 4b 8d 04 64 lea (%r12,%r12,2),%rax 3a: 49 8d 04 84 lea (%r12,%rax,4),%rax 3e: 48 rex.W 3f: 8d .byte 0x8d Code starting with the faulting instruction =========================================== 0: 45 85 ed test %r13d,%r13d 3: 0f 88 52 03 00 00 js 0x35b 
9: 4d 63 e5 movslq %r13d,%r12 c: 4b 8d 04 64 lea (%r12,%r12,2),%rax 10: 49 8d 04 84 lea (%r12,%rax,4),%rax 14: 48 rex.W 15: 8d .byte 0x8d RSP: 0018:ffffffffa1407de0 EFLAGS: 00000246 RAX: dffffc0000000000 RBX: ffff888003b20800 RCX: 1ffffffff41d935c RDX: 1ffff11018748331 RSI: ffffffffa0a31b00 RDI: ffff8880c3a41988 RBP: ffffffffa18e0d20 R08: 0000000000000002 R09: ffff8880c3a41c2b R10: ffffed1018748385 R11: 0000000000000001 R12: 0000000000000002 R13: 0000000000000002 R14: 0000001dfc72dae5 R15: ffffffffa18e0e08 ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194) ? tick_nohz_idle_stop_tick (./include/linux/hrtimer.h:419 kernel/time/tick-sched.c:920 kernel/time/tick-sched.c:1062 kernel/time/tick-sched.c:1083) cpuidle_enter (drivers/cpuidle/cpuidle.c:353) do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) ? arch_cpu_idle_exit+0x40/0x40 cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) start_kernel (init/main.c:1137) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) </TASK> Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl btbcm ccp btintel libarc4 rng_core evdev bluetooth cfg80211 kvm leds_apu jitterentropy_rng sha512_ssse3 sha512_generic snd_pcm ctr sg drbg snd_timer irqbypass ansi_cprng snd ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd r8169 ehci_hcd realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy ---[ end trace bd73d57ff2669c03 ]--- RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1549) Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 
fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae All code ======== 0: e8 9a fe ff ff callq 0xfffffffffffffe9f 5: 48 85 c0 test %rax,%rax 8: 74 62 je 0x6c a: 48 89 c5 mov %rax,%rbp d: 48 89 df mov %rbx,%rdi 10: e8 ad ff ff ff callq 0xffffffffffffffc2 15: 84 c0 test %al,%al 17: 75 2a jne 0x43 19: 48 89 ea mov %rbp,%rdx 1c: 48 c1 ea 03 shr $0x3,%rdx 20: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 27: fc ff df 2a:* 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2e: 75 41 jne 0x71 30: 41 0f b6 d4 movzbl %r12b,%edx 34: 48 8b 45 00 mov 0x0(%rbp),%rax 38: 48 89 ee mov %rbp,%rsi 3b: 48 89 df mov %rbx,%rdi 3e: 0f .byte 0xf 3f: ae scas %es:(%rdi),%al Code starting with the faulting instruction =========================================== 0: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 4: 75 41 jne 0x47 6: 41 0f b6 d4 movzbl %r12b,%edx a: 48 8b 45 00 mov 0x0(%rbp),%rax e: 48 89 ee mov %rbp,%rsi 11: 48 89 df mov %rbx,%rdi 14: 0f .byte 0xf 15: ae scas %es:(%rdi),%al RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8 RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17 R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001 R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x1d800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: maybe wild-memory-access in range 
[0x00408210000b2318-0x00408210000b231f] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 RIP: 0010:skb_zcopy_clear+0x34/0x8f Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8 RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17 R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001 R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0 Call Trace: <IRQ> skb_release_data+0x91/0x1de skb_release_all+0x3e/0x47 __kfree_skb+0xe/0x18 consume_skb+0x24/0x26 __dev_kfree_skb_any+0x2a/0x2b ath11k_ce_tx_process_cb+0x3ef/0x8d0 [ath11k] ? __local_bh_enable_ip+0x37/0x80 ? ath11k_ce_alloc_pipes+0x5c0/0x5c0 [ath11k] ? ath11k_hal_srng_access_end+0x1d7/0x5d0 [ath11k] ath11k_ce_per_engine_service+0x96b/0xc60 [ath11k] ? _raw_spin_lock_irqsave+0x9a/0xf0 ? __lock_text_start+0x8/0x8 ? ath11k_ce_tx_process_cb+0x8d0/0x8d0 [ath11k] ? __wake_up_bit+0x100/0x100 ? __irq_put_desc_unlock+0x18/0x90 ath11k_pci_ce_tasklet+0x64/0x100 [ath11k_pci] ? 
tasklet_clear_sched+0x47/0xe0 tasklet_action_common.constprop.0+0x240/0x2d0 __do_softirq+0x1b0/0x5b9 __irq_exit_rcu+0xc6/0x170 common_interrupt+0xa9/0xc0 </IRQ> <TASK> asm_common_interrupt+0x1e/0x40 RIP: 0010:cpuidle_enter_state+0x196/0xa60 Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d RSP: 0018:ffffffffa1407de0 EFLAGS: 00000246 RAX: dffffc0000000000 RBX: ffff888003b20800 RCX: 1ffffffff41d935c RDX: 1ffff11018748331 RSI: ffffffffa0a31b00 RDI: ffff8880c3a41988 RBP: ffffffffa18e0d20 R08: 0000000000000002 R09: ffff8880c3a41c2b R10: ffffed1018748385 R11: 0000000000000001 R12: 0000000000000002 R13: 0000000000000002 R14: 0000001dfc72dae5 R15: ffffffffa18e0e08 ? _raw_spin_unlock_irqrestore+0x25/0x40 ? tick_nohz_idle_stop_tick+0x599/0xa60 cpuidle_enter+0x4a/0xa0 do_idle+0x3d7/0x530 ? arch_cpu_idle_exit+0x40/0x40 cpu_startup_entry+0x19/0x20 start_kernel+0x38d/0x3ab secondary_startup_64_no_verify+0xb0/0xbb </TASK> Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl btbcm ccp btintel libarc4 rng_core evdev bluetooth cfg80211 kvm leds_apu jitterentropy_rng sha512_ssse3 sha512_generic snd_pcm ctr sg drbg snd_timer irqbypass ansi_cprng snd ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd r8169 ehci_hcd realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy ---[ end trace bd73d57ff2669c03 ]--- RIP: 0010:skb_zcopy_clear+0x34/0x8f Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f 
ae RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216 RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8 RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17 R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001 R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x1d800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 RIP: 0010:skb_zcopy_clear+0x3f/0x70 Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8 RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202 RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000 RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00 RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8 R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0 R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005 FS: 0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0 Call Trace: <IRQ> skb_release_data+0x4b/0xa2 skb_release_all+0x20/0x22 __kfree_skb+0xe/0x18 consume_skb+0x24/0x26 __dev_kfree_skb_any+0x2a/0x2b 
ath11k_ce_tx_process_cb+0x157/0x220 [ath11k] ath11k_ce_per_engine_service+0x3c0/0x3d0 [ath11k] ? _raw_spin_lock_irqsave+0x26/0x50 ath11k_pci_ce_tasklet+0x1c/0x40 [ath11k_pci] tasklet_action_common.constprop.0+0xaf/0xe0 __do_softirq+0xec/0x2e9 __irq_exit_rcu+0xbc/0x110 common_interrupt+0xb8/0xd0 </IRQ> <TASK> asm_common_interrupt+0x1e/0x40 RIP: 0010:cpuidle_enter_state+0xda/0x370 Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d RSP: 0018:ffffffff92203e60 EFLAGS: 00000246 RAX: ffff8aa31ac00000 RBX: 0000000000000002 RCX: 000000000000001f RDX: 0000000000000000 RSI: ffffffff91b70667 RDI: ffffffff91b55729 RBP: ffff8aa300906c00 R08: 0000000955084e02 R09: 0000000000000018 R10: 0000000000000001 R11: 0000000000001015 R12: ffffffff923d05c0 R13: 0000000955084e02 R14: 0000000000000002 R15: 0000000000000000 cpuidle_enter+0x29/0x40 do_idle+0x200/0x2b0 cpu_startup_entry+0x19/0x20 start_kernel+0x6b7/0x6dc secondary_startup_64_no_verify+0xb0/0xbb </TASK> Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd cfg80211 ccp rng_core jitterentropy_rng kvm sha512_ssse3 sha512_generic evdev ctr snd_pcm drbg sg snd_timer ansi_cprng leds_apu irqbypass ecdh_generic snd rfkill ecc soundcore pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy ---[ end trace 23d792ef4816c4de ]--- RIP: 0010:skb_zcopy_clear+0x3f/0x70 Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 
48 8b 45 f0 48 89 ce 48 89 c7 e8 RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202 RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000 RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00 RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8 R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0 R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005 FS: 0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1 Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014 RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1551) Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8 All code ======== 0: 48 89 c7 mov %rax,%rdi 3: e8 81 02 00 00 callq 0x289 8: 48 89 45 f8 mov %rax,-0x8(%rbp) c: 48 83 7d f8 00 cmpq $0x0,-0x8(%rbp) 11: 74 45 je 0x58 13: 48 8b 45 f0 mov -0x10(%rbp),%rax 17: 48 89 c7 mov %rax,%rdi 1a: e8 ab 02 00 00 callq 0x2ca 1f: 83 f0 01 xor $0x1,%eax 22: 84 c0 test %al,%al 24: 74 1e je 0x44 26: 48 8b 45 f8 mov -0x8(%rbp),%rax 2a:* 4c 8b 00 mov (%rax),%r8 <-- trapping instruction 2d: 0f b6 55 ec movzbl -0x14(%rbp),%edx 31: 48 8b 4d f8 mov -0x8(%rbp),%rcx 35: 48 8b 45 f0 mov -0x10(%rbp),%rax 39: 48 89 ce mov %rcx,%rsi 3c: 48 89 c7 mov %rax,%rdi 3f: e8 .byte 0xe8 Code starting with the faulting instruction 
=========================================== 0: 4c 8b 00 mov (%rax),%r8 3: 0f b6 55 ec movzbl -0x14(%rbp),%edx 7: 48 8b 4d f8 mov -0x8(%rbp),%rcx b: 48 8b 45 f0 mov -0x10(%rbp),%rax f: 48 89 ce mov %rcx,%rsi 12: 48 89 c7 mov %rax,%rdi 15: e8 .byte 0xe8 RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202 RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000 RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00 RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8 R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0 R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005 FS: 0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0 Call Trace: <IRQ> skb_release_data (net/core/skbuff.c:671) skb_release_all (net/core/skbuff.c:743) __kfree_skb (net/core/skbuff.c:757) consume_skb (net/core/skbuff.c:912) __dev_kfree_skb_any (net/core/dev.c:3038) ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k ? 
_raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci tasklet_action_common.constprop.0 (./arch/x86/include/asm/bitops.h:75 ./include/asm-generic/bitops/instrumented-atomic.h:42 kernel/softirq.c:879 kernel/softirq.c:787) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) </IRQ> <TASK> asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d All code ======== 0: 31 ff xor %edi,%edi 2: e8 d9 c6 9e ff callq 0xffffffffff9ec6e0 7: 45 84 ff test %r15b,%r15b a: 74 17 je 0x23 c: 9c pushfq d: 58 pop %rax e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 13: f6 c4 02 test $0x2,%ah 16: 0f 85 78 02 00 00 jne 0x294 1c: 31 ff xor %edi,%edi 1e: e8 bd 97 a5 ff callq 0xffffffffffa597e0 23: fb sti 24: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2a:* 45 85 f6 test %r14d,%r14d <-- trapping instruction 2d: 0f 88 11 01 00 00 js 0x144 33: 49 63 c6 movslq %r14d,%rax 36: 4c 2b 2c 24 sub (%rsp),%r13 3a: 48 8d 14 40 lea (%rax,%rax,2),%rdx 3e: 48 rex.W 3f: 8d .byte 0x8d Code starting with the faulting instruction =========================================== 0: 45 85 f6 test %r14d,%r14d 3: 0f 88 11 01 00 00 js 0x11a 9: 49 63 c6 movslq %r14d,%rax c: 4c 2b 2c 24 sub (%rsp),%r13 10: 48 8d 14 40 lea (%rax,%rax,2),%rdx 14: 48 rex.W 15: 8d .byte 0x8d RSP: 0018:ffffffff92203e60 
EFLAGS: 00000246 RAX: ffff8aa31ac00000 RBX: 0000000000000002 RCX: 000000000000001f RDX: 0000000000000000 RSI: ffffffff91b70667 RDI: ffffffff91b55729 RBP: ffff8aa300906c00 R08: 0000000955084e02 R09: 0000000000000018 R10: 0000000000000001 R11: 0000000000001015 R12: ffffffff923d05c0 R13: 0000000955084e02 R14: 0000000000000002 R15: 0000000000000000 cpuidle_enter (drivers/cpuidle/cpuidle.c:353) do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) start_kernel (init/main.c:1137) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) </TASK> Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd cfg80211 ccp rng_core jitterentropy_rng kvm sha512_ssse3 sha512_generic evdev ctr snd_pcm drbg sg snd_timer ansi_cprng leds_apu irqbypass ecdh_generic snd rfkill ecc soundcore pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy ---[ end trace 23d792ef4816c4de ]--- RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1551) Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8 All code ======== 0: 48 89 c7 mov %rax,%rdi 3: e8 81 02 00 00 callq 0x289 8: 48 89 45 f8 mov %rax,-0x8(%rbp) c: 48 83 7d f8 00 cmpq $0x0,-0x8(%rbp) 11: 74 45 je 0x58 13: 48 8b 45 f0 mov -0x10(%rbp),%rax 17: 48 89 c7 mov %rax,%rdi 1a: e8 ab 02 00 00 callq 0x2ca 1f: 83 f0 01 xor $0x1,%eax 22: 84 c0 test %al,%al 24: 74 1e je 0x44 26: 48 8b 45 f8 mov -0x8(%rbp),%rax 2a:* 4c 8b 00 mov (%rax),%r8 <-- trapping 
instruction 2d: 0f b6 55 ec movzbl -0x14(%rbp),%edx 31: 48 8b 4d f8 mov -0x8(%rbp),%rcx 35: 48 8b 45 f0 mov -0x10(%rbp),%rax 39: 48 89 ce mov %rcx,%rsi 3c: 48 89 c7 mov %rax,%rdi 3f: e8 .byte 0xe8 Code starting with the faulting instruction =========================================== 0: 4c 8b 00 mov (%rax),%r8 3: 0f b6 55 ec movzbl -0x14(%rbp),%edx 7: 48 8b 4d f8 mov -0x8(%rbp),%rcx b: 48 8b 45 f0 mov -0x10(%rbp),%rax f: 48 89 ce mov %rcx,%rsi 12: 48 89 c7 mov %rax,%rdi 15: e8 .byte 0xe8 RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202 RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000 RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00 RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8 R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0 R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005 FS: 0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
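The traces above all report a general protection fault "probably for non-canonical address". As a minimal user-space sketch of what that means, assuming 48-bit virtual addresses (4-level paging; this is an assumption, although the CR4 value 0x6f0 in the dumps does not have the LA57 bit set), the hypothetical helper `is_canonical48()` checks whether bits 63..47 of an address are all identical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative helper, not kernel code.
 * Assumes 48-bit virtual addresses: an address is canonical when
 * bits 63..47 are either all zero or all one.
 */
static bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 uppermost bits */

    return top == 0 || top == 0x1ffff;
}
```

Under that assumption, `is_canonical48(0x00408210000b231aULL)` is false, while a direct-map pointer such as `0xffff888004c6bdc0` (RBX in one of the dumps) passes the check. The value that ends up being dereferenced as `uarg` therefore does not look like a kernel pointer at all.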
On 12/7/2021 4:03 AM, Sven Eckelmann wrote: > On Monday, 6 December 2021 08:10:40 CET Wen Gong wrote: >>> On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote: >>> [...] >>>> I did test in my setup, not see the crash. >>>> >>>> I am afraid you also need this patch("ath11k: change to use dynamic >>>> memory for channel list of scan", >>>> >>>> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com >>>> ) >>>> >>>> Could you apply this patch and try again? >>> Tried it and I see the same problem. >> Could you tell what is your test steps? > Start kernel with commit a93789ae541c ("ath11k: Avoid NULL ptr > access during mgmt tx cleanup") + patches: > > * ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 > * ath11k: change to use dynamic memory for channel list of scan > > You can find the config in the first mail. But I have now enabled KASAN inline > to hopefully create some better error messages. > > The firmware + board data (see mail "ath11k: incorrect board_id retrieval") > was prepared like this: > > git clone https://github.com/kvalo/ath11k-firmware /root/ath11k-firmware > mkdir -p /lib/firmware/ath11k/WCN6855/hw2.0/ > cp /root/ath11k-firmware/WCN6855/hw2.0/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/ > cp /root/ath11k-firmware/WCN6855/hw2.0/1.1/WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/ > > git clone https://github.com/qca/qca-swiss-army-knife /root/qca-swiss-army-knife > apt install python2 > python2 /root/qca-swiss-army-knife/tools/scripts/ath11k/ath11k-bdencoder -e /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin > rm /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin > cp 'bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=266.bin' /lib/firmware/ath11k/WCN6855/hw2.0/board.bin > > Then I am just starting up the device as usual, and start wpa_supplicant (with > defconfig + CONFIG_MESH=y) from commit 
14ab4a816c68 ("Reject > ap_vendor_elements if its length is odd") > > cat << "EOF" > station_test.cfg > network={ > ssid="MyTestAP" > key_mgmt=WPA-PSK FT-PSK > proto=RSN > psk="testtest" > } > EOF > ip link set up dev wlp6s0 > ~/hostap/wpa_supplicant/wpa_supplicant -D nl80211 -i wlp6s0 -c station_test.cfg > > The actual SSID + PSK is valid and multiple access points (4) have this BSS on > 2.4GHz + 5GHz. > > So you are basically always calling dev_kfree_skb_any in ath11k_ce_tx_process_cb > because wcn6855 hw2.0 has credit_flow has set. But it seems like one of the > entries returned by ath11k_ce_completed_send_next is bogus and causes this > problems during the ath11k_ce_tx_process_cb. And for some reason, this is > triggered here by this firmware feature. > > ./scripts/faddr2line --list vmlinux consume_skb+0x9f/0x1c0 > consume_skb+0x9f/0x1c0: > > __kfree_skb at net/core/skbuff.c:757 > 752 */ > 753 > 754 void __kfree_skb(struct sk_buff *skb) > 755 { > 756 skb_release_all(skb); > >757< kfree_skbmem(skb); > 758 } > 759 EXPORT_SYMBOL(__kfree_skb); > 760 > 761 /** > 762 * kfree_skb - free an sk_buff > > (inlined by) consume_skb at net/core/skbuff.c:912 > 907 { > 908 if (!skb_unref(skb)) > 909 return; > 910 > 911 trace_consume_skb(skb); > >912< __kfree_skb(skb); > 913 } > 914 EXPORT_SYMBOL(consume_skb); > 915 #endif > 916 > 917 /** > > (inlined by) consume_skb at net/core/skbuff.c:906 > 901 * > 902 * Drop a ref to the buffer and free it if the usage count has hit zero > 903 * Functions identically to kfree_skb, but kfree_skb assumes that the frame > 904 * is being dropped after a failure and notes that > 905 */ > >906< void consume_skb(struct sk_buff *skb) > 907 { > 908 if (!skb_unref(skb)) > 909 return; > 910 > 911 trace_consume_skb(skb); > > > ./scripts/faddr2line --list vmlinux skb_release_data+0x1b0/0x5c0 > skb_release_data+0x1b0/0x5c0: > > skb_zcopy_clear at include/linux/skbuff.h:1549 > 1544 { > 1545 struct ubuf_info *uarg = skb_zcopy(skb); > 1546 > 1547 if (uarg) 
{ > 1548 if (!skb_zcopy_is_nouarg(skb)) > >1549< uarg->callback(skb, uarg, zerocopy_success); > 1550 > 1551 skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY; > 1552 } > 1553 } > 1554 > > (inlined by) skb_release_data at net/core/skbuff.c:669 > 664 if (skb->cloned && > 665 atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1, > 666 &shinfo->dataref)) > 667 goto exit; > 668 > >669< skb_zcopy_clear(skb, true); > 670 > 671 for (i = 0; i < shinfo->nr_frags; i++) > 672 __skb_frag_unref(&shinfo->frags[i], skb->pp_recycle); > 673 > 674 if (shinfo->frag_list) > > But I didn't like the inlined code. So I've changed the compilation flags > slightly: > > diff --git a/net/core/Makefile b/net/core/Makefile > index 6bdcb2cafed8..5eda226c5f27 100644 > --- a/net/core/Makefile > +++ b/net/core/Makefile > @@ -37,3 +37,4 @@ obj-$(CONFIG_NET_SOCK_MSG) += skmsg.o > obj-$(CONFIG_BPF_SYSCALL) += sock_map.o > obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o > obj-$(CONFIG_OF) += of_net.o > +ccflags-y += -fno-inline -O1 -fno-optimize-sibling-calls > > Now the stacktrace is a lot more readable. 
And the returned > crash location makes a lot more sense: > > ./scripts/faddr2line --list vmlinux 'skb_zcopy_clear+0x34/0x8f' > skb_zcopy_clear+0x34/0x8f: > > skb_zcopy_clear at include/linux/skbuff.h:1549 > 1544 { > 1545 struct ubuf_info *uarg = skb_zcopy(skb); > 1546 > 1547 if (uarg) { > 1548 if (!skb_zcopy_is_nouarg(skb)) > >1549< uarg->callback(skb, uarg, zerocopy_success); > 1550 > 1551 skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY; > 1552 } > 1553 } > 1554 > > Or with the assembler: > > (gdb) disassemble /m *(skb_zcopy_clear+0x34/0x8f) > Dump of assembler code for function skb_zcopy_clear: > 1544 { > 0x000000000000072a <+0>: push %r12 > 0x000000000000072c <+2>: push %rbp > 0x000000000000072d <+3>: push %rbx > 0x000000000000072e <+4>: mov %rdi,%rbx > 0x0000000000000731 <+7>: mov %esi,%r12d > > 1545 struct ubuf_info *uarg = skb_zcopy(skb); > 0x0000000000000734 <+10>: call 0x5d3 <skb_zcopy> > > 1546 > 1547 if (uarg) { > 0x0000000000000739 <+15>: test %rax,%rax > 0x000000000000073c <+18>: je 0x7a0 <skb_zcopy_clear+118> > 0x000000000000073e <+20>: mov %rax,%rbp > > 1548 if (!skb_zcopy_is_nouarg(skb)) > 0x0000000000000741 <+23>: mov %rbx,%rdi > 0x0000000000000744 <+26>: call 0x6f6 <skb_zcopy_is_nouarg> > 0x0000000000000749 <+31>: test %al,%al > 0x000000000000074b <+33>: jne 0x777 <skb_zcopy_clear+77> > > 1549 uarg->callback(skb, uarg, zerocopy_success); > 0x000000000000074d <+35>: mov %rbp,%rdx > 0x0000000000000750 <+38>: shr $0x3,%rdx > 0x0000000000000754 <+42>: movabs $0xdffffc0000000000,%rax > 0x000000000000075e <+52>: cmpb $0x0,(%rdx,%rax,1) > 0x0000000000000762 <+56>: jne 0x7a5 <skb_zcopy_clear+123> > 0x0000000000000764 <+58>: movzbl %r12b,%edx > 0x0000000000000768 <+62>: mov 0x0(%rbp),%rax > 0x000000000000076c <+66>: mov %rbp,%rsi > 0x000000000000076f <+69>: mov %rbx,%rdi > 0x0000000000000772 <+72>: call 0x777 <skb_zcopy_clear+77> > 0x00000000000007a5 <+123>: mov %rbp,%rdi > 0x00000000000007a8 <+126>: call 0x7ad <skb_zcopy_clear+131> > 0x00000000000007ad 
<+131>: jmp 0x764 <skb_zcopy_clear+58> > > 1550 > 1551 skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY; > 0x0000000000000777 <+77>: mov %rbx,%rdi > 0x000000000000077a <+80>: call 0x518 <skb_end_pointer> > 0x000000000000077f <+85>: mov %rax,%rbx > 0x0000000000000782 <+88>: mov %rax,%rdx > 0x0000000000000785 <+91>: shr $0x3,%rdx > 0x0000000000000789 <+95>: movabs $0xdffffc0000000000,%rax > 0x0000000000000793 <+105>: movzbl (%rdx,%rax,1),%eax > 0x0000000000000797 <+109>: test %al,%al > 0x0000000000000799 <+111>: je 0x79d <skb_zcopy_clear+115> > 0x000000000000079b <+113>: jle 0x7af <skb_zcopy_clear+133> > 0x000000000000079d <+115>: andb $0xf8,(%rbx) > 0x00000000000007af <+133>: mov %rbx,%rdi > 0x00000000000007b2 <+136>: call 0x7b7 <skb_zcopy_clear+141> > 0x00000000000007b7 <+141>: jmp 0x79d <skb_zcopy_clear+115> > > 1552 } > 1553 } > 0x00000000000007a0 <+118>: pop %rbx > 0x00000000000007a1 <+119>: pop %rbp > 0x00000000000007a2 <+120>: pop %r12 > 0x00000000000007a4 <+122>: ret > > End of assembler dump. 
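The repeated `shr $0x3` / `movabs $0xdffffc0000000000` / `cmpb` sequences in the dump above are inline KASAN checks, which is why the disassembly is hard to follow. The following is a minimal sketch of what such a check computes, assuming the usual x86-64 generic KASAN layout (shadow scale shift 3, shadow offset 0xdffffc0000000000). The helper names are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants; both appear in the dump above. */
#define SKETCH_KASAN_SHADOW_SCALE_SHIFT 3
#define SKETCH_KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Address of the shadow byte describing the 8-byte granule that holds addr. */
static inline uint64_t sketch_kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SKETCH_KASAN_SHADOW_SCALE_SHIFT) +
	       SKETCH_KASAN_SHADOW_OFFSET;
}

/* An aligned 8-byte access is fine only if the whole granule is valid (0). */
static inline bool sketch_kasan_access8_ok(int8_t shadow)
{
	return shadow == 0;
}

/*
 * A 1-byte access is fine if the granule is fully valid, or if the byte
 * lies within the first `shadow` valid bytes of a partially valid granule.
 * Negative shadow values mark poisoned memory.
 */
static inline bool sketch_kasan_access1_ok(uint64_t addr, int8_t shadow)
{
	if (shadow == 0)
		return true;
	return (int8_t)(addr & 7) < shadow;
}
```

With these checks inlined, every memory access in the function is preceded by a shadow lookup and a conditional call into the KASAN report path, which matches the extra `call` and `jmp` pairs in the listing.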
>
> To make it even easier to read, just disable the inline KASAN and reduce the
> optimization level for this function:
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 059b6266dcd7..819cc58ab051 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1540,6 +1540,8 @@ static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
>  }
>  
>  /* Release a reference on a zerocopy structure */
> +#pragma GCC push_options
> +#pragma GCC optimize ("O0")
>  static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
>  {
>  	struct ubuf_info *uarg = skb_zcopy(skb);
> @@ -1551,6 +1553,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
>  		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>  	}
>  }
> +#pragma GCC pop_options
>  
>  static inline void skb_mark_not_on_list(struct sk_buff *skb)
>  {
>
> This creates this nice, unoptimized function which crashes at +63:
>
> $ gdb net/core/skbuff.o -q
> Reading symbols from net/core/skbuff.o...
> (gdb) disassemble /m *(skb_zcopy_clear+0x3f/0x70) > Dump of assembler code for function skb_zcopy_clear: > 1546 { > 0x0000000000000000 <+0>: push %rbp > 0x0000000000000001 <+1>: mov %rsp,%rbp > 0x0000000000000004 <+4>: sub $0x18,%rsp > 0x0000000000000008 <+8>: mov %rdi,-0x10(%rbp) > 0x000000000000000c <+12>: mov %esi,%eax > 0x000000000000000e <+14>: mov %al,-0x14(%rbp) > > 1547 struct ubuf_info *uarg = skb_zcopy(skb); > 0x0000000000000011 <+17>: mov -0x10(%rbp),%rax > 0x0000000000000015 <+21>: mov %rax,%rdi > 0x0000000000000018 <+24>: call 0x29e <skb_zcopy> > 0x000000000000001d <+29>: mov %rax,-0x8(%rbp) > > 1548 > 1549 if (uarg) { > 0x0000000000000021 <+33>: cmpq $0x0,-0x8(%rbp) > 0x0000000000000026 <+38>: je 0x6d <skb_zcopy_clear+109> > > 1550 if (!skb_zcopy_is_nouarg(skb)) > 0x0000000000000028 <+40>: mov -0x10(%rbp),%rax > 0x000000000000002c <+44>: mov %rax,%rdi > 0x000000000000002f <+47>: call 0x2df <skb_zcopy_is_nouarg> > 0x0000000000000034 <+52>: xor $0x1,%eax > 0x0000000000000037 <+55>: test %al,%al > 0x0000000000000039 <+57>: je 0x59 <skb_zcopy_clear+89> > > 1551 uarg->callback(skb, uarg, zerocopy_success); > 0x000000000000003b <+59>: mov -0x8(%rbp),%rax > 0x000000000000003f <+63>: mov (%rax),%r8 > 0x0000000000000042 <+66>: movzbl -0x14(%rbp),%edx > 0x0000000000000046 <+70>: mov -0x8(%rbp),%rcx > 0x000000000000004a <+74>: mov -0x10(%rbp),%rax > 0x000000000000004e <+78>: mov %rcx,%rsi > 0x0000000000000051 <+81>: mov %rax,%rdi > 0x0000000000000054 <+84>: call 0x59 <skb_zcopy_clear+89> > > 1552 > 1553 skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY; > 0x0000000000000059 <+89>: mov -0x10(%rbp),%rax > 0x000000000000005d <+93>: mov %rax,%rdi > 0x0000000000000060 <+96>: call 0x27f <skb_end_pointer> > 0x0000000000000065 <+101>: movzbl (%rax),%edx > 0x0000000000000068 <+104>: and $0xfffffff8,%edx > 0x000000000000006b <+107>: mov %dl,(%rax) > > 1554 } > 1555 } > 0x000000000000006d <+109>: nop > 0x000000000000006e <+110>: leave > 0x000000000000006f <+111>: ret > > End 
of assembler dump.
>
> The question now: What is causing the unclean state of the skb and thus
> doesn't let it get rejected by skb_zcopy_is_nouarg before the uarg
> callback is tried.
>
> Kind regards,
> Sven

Thanks a lot for your analysis, Sven.

I still cannot reproduce it.

I think it is caused by a write past skb->tail during the scan, because the
invalid address is the same for each crash (0x408210000b231a/0xe0080c4200016463),
and the fault is triggered by the instruction
"0x000000000000003f <+63>: mov (%rax),%r8", which loads the value of
uarg->callback into %r8.

Could you add the change below? It prints log messages that should help us
find the bug.

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 26181f237e23..2147f74f5ebf 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -3421,12 +3421,15 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
 		memcpy(arg.extraie.ptr, req->ie, req->ie_len);
 	}
 
+	ath11k_info(ar->ab, "n_ssids %d\n", req->n_ssids);
+
 	if (req->n_ssids) {
 		arg.num_ssids = req->n_ssids;
 		for (i = 0; i < arg.num_ssids; i++) {
 			arg.ssid[i].length = req->ssids[i].ssid_len;
 			memcpy(&arg.ssid[i].ssid, req->ssids[i].ssid,
 			       req->ssids[i].ssid_len);
+			ath11k_info(ar->ab, "ssid[%d] len %d\n", i, arg.ssid[i].length);
 		}
 	} else {
 		arg.scan_flags |= WMI_SCAN_FLAG_PASSIVE;
diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
index 7d7f76d4bf1f..e42a64251799 100644
--- a/drivers/net/wireless/ath/ath11k/wmi.c
+++ b/drivers/net/wireless/ath/ath11k/wmi.c
@@ -2271,6 +2271,7 @@ int ath11k_wmi_send_scan_start_cmd(struct ath11k *ar,
 		}
 	}
 
+	ath11k_info(ar->ab, "%s ptr %px skb data %px len %d over %d", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr)-skb->data-skb->len);
 	ret = ath11k_wmi_cmd_send(wmi, skb,
 				  WMI_START_SCAN_CMDID);
 	if (ret) {
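The "over" value in the debug print above is the distance between the write cursor and the end of the used data area. A minimal userspace sketch of that calculation follows. The function name and the plain buffer are illustrative assumptions, not ath11k code.

```c
#include <stddef.h>

/*
 * Signed distance between a write cursor and the end of the used data
 * area (data + len). A positive result means the cursor has moved past
 * the end; zero means it stopped exactly at the end.
 */
static inline ptrdiff_t sketch_write_overrun(const unsigned char *cursor,
					     const unsigned char *data,
					     size_t len)
{
	return cursor - (data + len);
}
```

Two caveats apply to this kind of check. The result of a pointer subtraction has type `ptrdiff_t`, which is wider than `int` on 64-bit systems, so a plain `%d` format does not match it. The check also only sees where the cursor ends up: a copy that writes more bytes than the cursor is later advanced by would still report 0.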
Thanks for the analysis and debugging, Sven. I see your patch "ath11k: Fix
buffer overflow when scanning with extraie".

On 12/7/2021 10:30 PM, Sven Eckelmann wrote:
> On Tuesday, 7 December 2021 05:35:04 CET Wen Gong wrote:
>> Thanks a lot for your analysis, Sven.
>>
>> I still cannot reproduce it.
>>
>> I think it is caused by a write past skb->tail during the scan, because the
>> invalid address
> Yes, I thought that I wanted to write about it but it might have gone into
> another draft of the mail. So what I wanted to write was something like:
>
> The information which is used in skb_zcopy_clear/skb_zcopy/skb_zcopy_is_nouarg
> is coming from skb_shinfo. And skb_end_pointer is just a pointer to a region
> at the end of the skb buffer (skb->end). And this got corrupted by something.
> Unfortunately this is correctly allocated memory and thus kasan cannot help
> us with it.
>
> [...]
>> --- a/drivers/net/wireless/ath/ath11k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath11k/wmi.c
>> @@ -2271,6 +2271,7 @@ int ath11k_wmi_send_scan_start_cmd(struct ath11k *ar,
>>           }
>>       }
>>
>> +    ath11k_info(ar->ab, "%s ptr %px skb data %px len %d over %d",
>> __func__, ptr, skb->data, skb->len, ((unsigned char
>> *)ptr)-skb->data-skb->len);
>>       ret = ath11k_wmi_cmd_send(wmi, skb,
>>                     WMI_START_SCAN_CMDID);
>>       if (ret) {
> Changed the last part to:
>
>     ath11k_err(ar->ab, "%s ptr %px skb data %px len %d over %ld\n", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr) - skb->data - skb->len);
>
>
> The output is:
>
>     ath11k_pci 0000:01:00.0: n_ssids 1
>     ath11k_pci 0000:01:00.0: ssid[0] len 0
>     ath11k_pci 0000:01:00.0: ath11k_wmi_send_scan_start_cmd ptr ffff9217101e82b4 skb data ffff9217101e804c len 616 over 0
>
> But we are looking at the ath11k_ce_tx_process_cb function. So I would have
> expected that it is related to something which was sent out. So the first thing
> I did was to add some skb_dumps in the send path (ath11k_htc_send) and in the
> cleanup path (skb_zcopy_clear).
Something like this (just the cleanup path > because otherwise I have to post a rather large diff): > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 819cc58ab051..c15512e2f30c 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -1547,8 +1547,10 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success) > struct ubuf_info *uarg = skb_zcopy(skb); > > if (uarg) { > - if (!skb_zcopy_is_nouarg(skb)) > + if (!skb_zcopy_is_nouarg(skb)) { > + skb_dump(KERN_ERR, skb, true); > uarg->callback(skb, uarg, zerocopy_success); > + } > > skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY; > } > > > But interestingly, it already crashes to parse the fraglist in > ath11k_htc_send. So I've added some more dump to figure out where is breaks. > And I've noticed that it breaks after following section in > ath11k_wmi_send_scan_start_cmd > > if (params->extraie.len) > memcpy(ptr, params->extraie.ptr, > params->extraie.len); > > Here is the full output: > > [ 30.641297] ath11k_wmi_send_scan_start_cmd:2357 > [ 30.645873] skb len=616 headroom=76 headlen=616 tailroom=12 > [ 30.645873] mac=(-1,-1) net=(0,-1) trans=-1 > [ 30.645873] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) > [ 30.645873] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > [ 30.645873] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0 > [ 30.673381] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.681073] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.688758] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.696465] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.704197] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.710852] skb linear: 00000000: 9c 00 4d 00 00 a0 00 00 01 00 00 00 00 00 00 00 > [ 30.718538] skb linear: 00000010: 01 00 00 00 1f 00 00 00 32 00 00 00 96 00 00 00 > [ 30.726271] skb linear: 00000020: 
32 00 00 00 f4 01 00 00 00 00 00 00 00 00 00 00 > [ 30.733954] skb linear: 00000030: 00 00 00 00 20 4e 00 00 05 00 00 00 10 00 00 00 > [ 30.741636] skb linear: 00000040: 00 00 00 00 61 00 00 00 01 00 00 00 01 00 00 00 > [ 30.749346] skb linear: 00000050: 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.757092] skb linear: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.764795] skb linear: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.772483] skb linear: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.780170] skb linear: 00000090: 00 00 00 00 28 00 00 00 1e 00 00 00 00 00 00 00 > [ 30.787854] skb linear: 000000a0: 84 01 10 00 6c 09 00 00 71 09 00 00 76 09 00 00 > [ 30.795541] skb linear: 000000b0: 7b 09 00 00 80 09 00 00 85 09 00 00 8a 09 00 00 > [ 30.803236] skb linear: 000000c0: 8f 09 00 00 94 09 00 00 99 09 00 00 9e 09 00 00 > [ 30.810933] skb linear: 000000d0: a3 09 00 00 a8 09 00 00 3c 14 00 00 50 14 00 00 > [ 30.818620] skb linear: 000000e0: 64 14 00 00 78 14 00 00 8c 14 00 00 a0 14 00 00 > [ 30.826322] skb linear: 000000f0: b4 14 00 00 c8 14 00 00 7c 15 00 00 90 15 00 00 > [ 30.834018] skb linear: 00000100: a4 15 00 00 b8 15 00 00 cc 15 00 00 e0 15 00 00 > [ 30.841712] skb linear: 00000110: f4 15 00 00 08 16 00 00 1c 16 00 00 30 16 00 00 > [ 30.849402] skb linear: 00000120: 44 16 00 00 58 16 00 00 71 16 00 00 85 16 00 00 > [ 30.857094] skb linear: 00000130: 99 16 00 00 ad 16 00 00 c1 16 00 00 43 17 00 00 > [ 30.864776] skb linear: 00000140: 57 17 00 00 6b 17 00 00 7f 17 00 00 93 17 00 00 > [ 30.872490] skb linear: 00000150: a7 17 00 00 bb 17 00 00 cf 17 00 00 e3 17 00 00 > [ 30.880182] skb linear: 00000160: f7 17 00 00 0b 18 00 00 1f 18 00 00 33 18 00 00 > [ 30.887882] skb linear: 00000170: 47 18 00 00 5b 18 00 00 6f 18 00 00 83 18 00 00 > [ 30.895581] skb linear: 00000180: 97 18 00 00 ab 18 00 00 bf 18 00 00 d3 18 00 00 > [ 30.903265] skb linear: 00000190: e7 18 00 00 fb 18 00 00 0f 19 00 00 23 19 00 
00 > [ 30.910974] skb linear: 000001a0: 37 19 00 00 4b 19 00 00 5f 19 00 00 73 19 00 00 > [ 30.918675] skb linear: 000001b0: 87 19 00 00 9b 19 00 00 af 19 00 00 c3 19 00 00 > [ 30.926418] skb linear: 000001c0: d7 19 00 00 eb 19 00 00 ff 19 00 00 13 1a 00 00 > [ 30.934118] skb linear: 000001d0: 27 1a 00 00 3b 1a 00 00 4f 1a 00 00 63 1a 00 00 > [ 30.941842] skb linear: 000001e0: 77 1a 00 00 8b 1a 00 00 9f 1a 00 00 b3 1a 00 00 > [ 30.949537] skb linear: 000001f0: c7 1a 00 00 db 1a 00 00 ef 1a 00 00 03 1b 00 00 > [ 30.957221] skb linear: 00000200: 17 1b 00 00 2b 1b 00 00 3f 1b 00 00 53 1b 00 00 > [ 30.964912] skb linear: 00000210: 67 1b 00 00 7b 1b 00 00 8f 1b 00 00 a3 1b 00 00 > [ 30.972614] skb linear: 00000220: b7 1b 00 00 cb 1b 00 00 24 00 13 00 00 00 00 00 > [ 30.980315] skb linear: 00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.988010] skb linear: 00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 30.995696] skb linear: 00000250: 08 00 13 00 ff ff ff ff ff ff 00 00 08 00 11 00 > [ 31.003394] skb linear: 00000260: 00 00 00 00 00 00 00 00 > [ 31.009002] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.015646] ath11k_wmi_send_scan_start_cmd:2362 > [ 31.020217] skb len=616 headroom=76 headlen=616 tailroom=12 > [ 31.020217] mac=(-1,-1) net=(0,-1) trans=-1 > [ 31.020217] shinfo(txflags=0 nr_frags=255 gso(size=0 type=265087 segs=0)) > [ 31.020217] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) > [ 31.020217] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0 > [ 31.048289] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.056015] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.063714] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.071425] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.079141] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.085787] skb linear: 00000000: 9c 00 4d 00 
00 a0 00 00 01 00 00 00 00 00 00 00 > [ 31.093518] skb linear: 00000010: 01 00 00 00 1f 00 00 00 32 00 00 00 96 00 00 00 > [ 31.101239] skb linear: 00000020: 32 00 00 00 f4 01 00 00 00 00 00 00 00 00 00 00 > [ 31.108947] skb linear: 00000030: 00 00 00 00 20 4e 00 00 05 00 00 00 10 00 00 00 > [ 31.116630] skb linear: 00000040: 00 00 00 00 61 00 00 00 01 00 00 00 01 00 00 00 > [ 31.124326] skb linear: 00000050: 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.132007] skb linear: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.139708] skb linear: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.147420] skb linear: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.155118] skb linear: 00000090: 00 00 00 00 28 00 00 00 1e 00 00 00 00 00 00 00 > [ 31.162798] skb linear: 000000a0: 84 01 10 00 6c 09 00 00 71 09 00 00 76 09 00 00 > [ 31.170486] skb linear: 000000b0: 7b 09 00 00 80 09 00 00 85 09 00 00 8a 09 00 00 > [ 31.178175] skb linear: 000000c0: 8f 09 00 00 94 09 00 00 99 09 00 00 9e 09 00 00 > [ 31.185876] skb linear: 000000d0: a3 09 00 00 a8 09 00 00 3c 14 00 00 50 14 00 00 > [ 31.193593] skb linear: 000000e0: 64 14 00 00 78 14 00 00 8c 14 00 00 a0 14 00 00 > [ 31.201278] skb linear: 000000f0: b4 14 00 00 c8 14 00 00 7c 15 00 00 90 15 00 00 > [ 31.208969] skb linear: 00000100: a4 15 00 00 b8 15 00 00 cc 15 00 00 e0 15 00 00 > [ 31.216655] skb linear: 00000110: f4 15 00 00 08 16 00 00 1c 16 00 00 30 16 00 00 > [ 31.224346] skb linear: 00000120: 44 16 00 00 58 16 00 00 71 16 00 00 85 16 00 00 > [ 31.232030] skb linear: 00000130: 99 16 00 00 ad 16 00 00 c1 16 00 00 43 17 00 00 > [ 31.239739] skb linear: 00000140: 57 17 00 00 6b 17 00 00 7f 17 00 00 93 17 00 00 > [ 31.247428] skb linear: 00000150: a7 17 00 00 bb 17 00 00 cf 17 00 00 e3 17 00 00 > [ 31.255141] skb linear: 00000160: f7 17 00 00 0b 18 00 00 1f 18 00 00 33 18 00 00 > [ 31.262840] skb linear: 00000170: 47 18 00 00 5b 18 00 00 6f 18 00 00 83 18 00 00 > [ 
31.270591] skb linear: 00000180: 97 18 00 00 ab 18 00 00 bf 18 00 00 d3 18 00 00 > [ 31.278282] skb linear: 00000190: e7 18 00 00 fb 18 00 00 0f 19 00 00 23 19 00 00 > [ 31.285965] skb linear: 000001a0: 37 19 00 00 4b 19 00 00 5f 19 00 00 73 19 00 00 > [ 31.293675] skb linear: 000001b0: 87 19 00 00 9b 19 00 00 af 19 00 00 c3 19 00 00 > [ 31.301361] skb linear: 000001c0: d7 19 00 00 eb 19 00 00 ff 19 00 00 13 1a 00 00 > [ 31.309056] skb linear: 000001d0: 27 1a 00 00 3b 1a 00 00 4f 1a 00 00 63 1a 00 00 > [ 31.316753] skb linear: 000001e0: 77 1a 00 00 8b 1a 00 00 9f 1a 00 00 b3 1a 00 00 > [ 31.324441] skb linear: 000001f0: c7 1a 00 00 db 1a 00 00 ef 1a 00 00 03 1b 00 00 > [ 31.332138] skb linear: 00000200: 17 1b 00 00 2b 1b 00 00 3f 1b 00 00 53 1b 00 00 > [ 31.339840] skb linear: 00000210: 67 1b 00 00 7b 1b 00 00 8f 1b 00 00 a3 1b 00 00 > [ 31.347520] skb linear: 00000220: b7 1b 00 00 cb 1b 00 00 24 00 13 00 00 00 00 00 > [ 31.355232] skb linear: 00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.362920] skb linear: 00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 31.370607] skb linear: 00000250: 08 00 13 00 ff ff ff ff ff ff 00 00 08 00 11 00 > [ 31.378331] skb linear: 00000260: 01 08 02 04 0b 16 0c 12 > [ 31.383972] skb tailroom: 00000000: 18 24 32 04 30 48 60 6c 2d 1a e3 19 > [ 31.390651] skb fraglist: > [ 31.393348] BUG: unable to handle page fault for address: 00000100000000bc > [ 31.400317] #PF: supervisor read access in kernel mode > [ 31.405624] #PF: error_code(0x0000) - not-present page > [ 31.410832] PGD 0 P4D 0 > [ 31.413422] Oops: 0000 [#1] PREEMPT SMP NOPTI > [ 31.417881] CPU: 0 PID: 520 Comm: wpa_supplicant Not tainted 5.16.0-rc1+ #5 > [ 31.424862] Hardware name: PC Engines apu2/apu2, BIOS v4.15.0.1 11/23/2021 > [ 31.431750] RIP: 0010:skb_end_pointer+0x0/0xe > [ 31.436129] Code: dc ff ff ff c3 48 8b 07 48 83 e0 fe 48 83 c8 02 48 89 07 c3 8b 47 08 c3 89 77 08 c3 01 77 08 c3 29 77 08 c3 b8 00 00 00 00 c3 <8b> 87 bc 00 00 00 48 03 
87 c0 00 00 00 c3 8b 87 bc 00 00 00 c3 e8 > [ 31.454883] RSP: 0018:ffffb0edc0427490 EFLAGS: 00010282 > [ 31.460116] RAX: ffff8ff9818b1ec0 RBX: 0000010000000000 RCX: 0000000000000000 > [ 31.467267] RDX: 0000000000000001 RSI: 0000010000000000 RDI: 0000010000000000 > [ 31.474408] RBP: ffff8ff9891a07c8 R08: 0000000000000000 R09: ffffb0edc0427370 > [ 31.481549] R10: ffffb0edc0427368 R11: ffffffffa5cd22e8 R12: 0000000000000268 > [ 31.488689] R13: 00000000ffffffff R14: 0000010000000000 R15: ffffffffc0f2198b > [ 31.495823] FS: 00007f2725a55c00(0000) GS:ffff8ff9aac00000(0000) knlGS:0000000000000000 > [ 31.503936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 31.509706] CR2: 00000100000000bc CR3: 00000001090bc000 CR4: 00000000000406f0 > [ 31.516868] Call Trace: > [ 31.519325] <TASK> > [ 31.521433] skb_dump+0x24/0x53a > [ 31.524688] ? _printk+0x58/0x6f > [ 31.527938] skb_dump+0x532/0x53a > [ 31.531267] ath11k_wmi_send_scan_start_cmd.cold+0x5f2/0x793 [ath11k] > [ 31.537785] ath11k_mac_op_hw_scan+0x173/0x3f0 [ath11k] > [ 31.543086] drv_hw_scan+0x43/0x130 [mac80211] > [ 31.547690] __ieee80211_start_scan+0x152/0x6d0 [mac80211] > [ 31.553306] ieee80211_request_scan+0x2c/0x50 [mac80211] > [ 31.558738] rdev_scan+0x28/0xd0 [cfg80211] > [ 31.563117] nl80211_trigger_scan+0x3fe/0x680 [cfg80211] > [ 31.568584] genl_family_rcv_msg_doit+0xea/0x150 > [ 31.573223] genl_rcv_msg+0xde/0x1d0 > [ 31.576816] ? nl80211_send_scan_start+0x90/0x90 [cfg80211] > [ 31.582520] ? genl_get_cmd+0xd0/0xd0 > [ 31.586191] netlink_rcv_skb+0x50/0xf0 > [ 31.589958] genl_rcv+0x24/0x40 > [ 31.593109] netlink_unicast+0x239/0x340 > [ 31.597045] netlink_sendmsg+0x245/0x480 > [ 31.600981] sock_sendmsg+0x5e/0x60 > [ 31.604487] ____sys_sendmsg+0x22e/0x270 > [ 31.608440] ? import_iovec+0x2d/0x30 > [ 31.612123] ? sendmsg_copy_msghdr+0x7c/0xa0 > [ 31.616406] ___sys_sendmsg+0x75/0xb0 > [ 31.620081] ? __mod_lruvec_page_state+0x7d/0xc0 > [ 31.624714] ? folio_add_lru+0x5c/0xa0 > [ 31.628476] ? 
_raw_spin_unlock+0x16/0x30 > [ 31.632506] ? __handle_mm_fault+0x1261/0x1520 > [ 31.636965] __sys_sendmsg+0x59/0xa0 > [ 31.640552] do_syscall_64+0x3b/0xc0 > [ 31.644148] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 31.649208] RIP: 0033:0x7f2725ef6f33 > [ 31.652797] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 89 54 24 1c 48 > [ 31.671547] RSP: 002b:00007fff1b5f1668 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > [ 31.679122] RAX: ffffffffffffffda RBX: 0000564919260760 RCX: 00007f2725ef6f33 > [ 31.686264] RDX: 0000000000000000 RSI: 00007fff1b5f16a0 RDI: 0000000000000005 > [ 31.693406] RBP: 000056491928f6c0 R08: 0000000000000004 R09: 00007f2725fb6c00 > [ 31.700547] R10: 00007fff1b5f178c R11: 0000000000000246 R12: 0000564919260670 > [ 31.707689] R13: 00007fff1b5f16a0 R14: 0000000000000000 R15: 00007fff1b5f178c > [ 31.714834] </TASK> > [ 31.717031] Modules linked in: qrtr_mhi btusb btrtl btbcm btintel bluetooth jitterentropy_rng sha512_ssse3 sha512_generic drbg ansi_cprng amd64_edac ecdh_generic edac_mce_amd ecc kvm_amd kvm irqbypass qrtr crc32_pclmul ghash_clmulni_intel ath11k_pci mhi ath11k evdev pcengines_apuv2 qmi_helpers gpio_keys_polled gpio_amd_fch aesni_intel snd_pcm crypto_simd snd_timer sdhci_pci xhci_pci snd cqhci mac80211 soundcore ehci_pci sp5100_tco cryptd libarc4 xhci_hcd sdhci ehci_hcd pcspkr igb watchdog ptp cfg80211 mmc_core k10temp i2c_piix4 fam15h_power usbcore ccp pps_core sg dca rng_core i2c_algo_bit usb_common rfkill leds_gpio gpio_keys acpi_cpufreq button fuse drm configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul scsi_mod crct10dif_common crc32c_intel scsi_common > [ 31.793074] CR2: 00000100000000bc > [ 31.796498] ---[ end trace 07252723010a83e6 ]--- > [ 31.801261] RIP: 0010:skb_end_pointer+0x0/0xe > [ 
31.805824] Code: dc ff ff ff c3 48 8b 07 48 83 e0 fe 48 83 c8 02 48 89 07 c3 8b 47 08 c3 89 77 08 c3 01 77 08 c3 29 77 08 c3 b8 00 00 00 00 c3 <8b> 87 bc 00 00 00 48 03 87 c0 00 00 00 c3 8b 87 bc 00 00 00 c3 e8 > [ 31.824842] RSP: 0018:ffffb0edc0427490 EFLAGS: 00010282 > [ 31.830105] RAX: ffff8ff9818b1ec0 RBX: 0000010000000000 RCX: 0000000000000000 > [ 31.837270] RDX: 0000000000000001 RSI: 0000010000000000 RDI: 0000010000000000 > [ 31.844441] RBP: ffff8ff9891a07c8 R08: 0000000000000000 R09: ffffb0edc0427370 > [ 31.851614] R10: ffffb0edc0427368 R11: ffffffffa5cd22e8 R12: 0000000000000268 > [ 31.858781] R13: 00000000ffffffff R14: 0000010000000000 R15: ffffffffc0f2198b > [ 31.866020] FS: 00007f2725a55c00(0000) GS:ffff8ff9aac00000(0000) knlGS:0000000000000000 > [ 31.874141] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 31.879920] CR2: 00000100000000bc CR3: 00000001090bc000 CR4: 00000000000406f0 > > > So the length calculated for the ath11k_wmi_alloc_skb is just wrong. Reason > for this is the extraie_len_with_pad which is only an u8. But the > params->extraie.len with the IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS is for me > already 264. So the length will end up as 8 - but the length it occupies > is still 264. > > But the problem is the length of the WMI_TLV_LEN. The params->extraie.len can > be up to 32 bit and WMI_TLV_LEN only has 16 bit. So the params->extraie.len > must also be size limited or we might run into a different problem. > > Kind regards, > Sven
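The truncation described at the end of the mail above can be reproduced in isolation. This is a minimal sketch, assuming the padded length is the element length rounded up to a multiple of 4 and then stored in an 8-bit variable, as the mail describes for extraie_len_with_pad. All function names here are hypothetical helpers and not part of ath11k.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round len up to the next multiple of 4 (assumed TLV payload padding). */
static inline uint32_t sketch_roundup4(uint32_t len)
{
	return (len + 3u) & ~3u;
}

/* Buggy variant: the padded length is kept in an 8-bit variable. */
static inline uint32_t sketch_padded_len_u8(uint32_t extraie_len)
{
	uint8_t len_with_pad = (uint8_t)sketch_roundup4(extraie_len);

	return len_with_pad;
}

/* Wider variant: the padded length keeps its full value. */
static inline uint32_t sketch_padded_len_u32(uint32_t extraie_len)
{
	return sketch_roundup4(extraie_len);
}

/* Bytes written past the reserved space when copy_len bytes are copied. */
static inline uint32_t sketch_overrun(uint32_t copy_len, uint32_t reserved)
{
	return copy_len > reserved ? copy_len - reserved : 0;
}

/* A 16-bit TLV length field cannot describe more than 0xffff bytes. */
static inline bool sketch_fits_tlv_len(uint32_t len)
{
	return len <= 0xffffu;
}
```

For the 264-byte element from the mail, the 8-bit variant yields 8, so only 8 bytes are reserved while 264 bytes are copied, an overrun of 256 bytes. With the 12 bytes of tailroom shown in the dump, that would place roughly 244 bytes beyond skb->end, which is consistent with the corrupted shinfo fields (nr_frags=255) in the second dump. The last helper reflects the second concern in the mail: even with a wider variable, the element length still has to fit the 16-bit TLV length field.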
Wen Gong <quic_wgong@quicinc.com> wrote: > Currently mac80211 will send 3 scan request for each scan of WCN6855, > they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will > cache the RNR IE(Reduced Neighbor Report element) which exist in the > beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz, > and then use the cache to scan in 6 GHz band scan if the 6 GHz scan > is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to > search more AP of 6 GHz. Also it will decrease the time cost of scan > because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it > means the 2.4 GHz and 5 GHz scans are doing simultaneously. > > Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since > it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means > all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw. > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 > > Signed-off-by: Wen Gong <quic_wgong@quicinc.com> > Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Sven, after your memory corruption fix is this good to take?
On 12/8/2021 4:16 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> wrote:
...
> Sven, after your memory corruption fix is this good to take?

After Sven's fix "ath11k: Fix buffer overflow when scanning with extraie",
the kernel crash no longer happens.

But it needs Sven's confirmation.
On Wednesday, 8 December 2021 09:19:28 CET Wen Gong wrote:
> On 12/8/2021 4:16 PM, Kalle Valo wrote:
> > Wen Gong <quic_wgong@quicinc.com> wrote:
> ...
> > Sven, after your memory corruption fix is this good to take?
>
> After Sven's fix "ath11k: Fix buffer overflow when scanning with
> extraie", the kernel crash no longer happens.
>
> But it needs Sven's confirmation.

Correct, it no longer causes any problems as long as the other fix is applied
before this change.

Tested-by: Sven Eckelmann <sven@narfation.org>

Kind regards,
Sven
Sven Eckelmann <sven@narfation.org> writes:

> On Wednesday, 8 December 2021 09:19:28 CET Wen Gong wrote:
>> On 12/8/2021 4:16 PM, Kalle Valo wrote:
>> > Wen Gong <quic_wgong@quicinc.com> wrote:
>> ...
>> > Sven, after your memory corruption fix is this good to take?
>>
>> After Sven's fix "ath11k: Fix buffer overflow when scanning with
>> extraie", the kernel crash no longer happens.
>>
>> But it needs Sven's confirmation.
>
> Correct, it no longer causes any problems as long as the other fix is applied
> before this change.
>
> Tested-by: Sven Eckelmann <sven@narfation.org>

Very good, thanks. I included your Tested-by.
Wen Gong <quic_wgong@quicinc.com> wrote: > Currently mac80211 will send 3 scan request for each scan of WCN6855, > they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will > cache the RNR IE(Reduced Neighbor Report element) which exist in the > beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz, > and then use the cache to scan in 6 GHz band scan if the 6 GHz scan > is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to > search more AP of 6 GHz. Also it will decrease the time cost of scan > because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it > means the 2.4 GHz and 5 GHz scans are doing simultaneously. > > Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since > it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means > all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw. > > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 > > Tested-by: Sven Eckelmann <sven@narfation.org> > Signed-off-by: Wen Gong <quic_wgong@quicinc.com> > Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Patch applied to ath-next branch of ath.git, thanks. 9f6da09a5f6a ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855
diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 06d20658586a..8218ea52f285 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -7915,6 +7915,9 @@ static int __ath11k_mac_register(struct ath11k *ar)
 
 	ar->hw->wiphy->interface_modes = ab->hw_params.interface_modes;
 
+	if (ab->hw_params.single_pdev_only && ar->supports_6ghz)
+		ieee80211_hw_set(ar->hw, SINGLE_SCAN_ON_ALL_BANDS);
+
 	ieee80211_hw_set(ar->hw, SIGNAL_DBM);
 	ieee80211_hw_set(ar->hw, SUPPORTS_PS);
 	ieee80211_hw_set(ar->hw, SUPPORTS_DYNAMIC_PS);
Currently mac80211 sends 3 scan requests for each scan on WCN6855,
one each for the 2.4 GHz, 5 GHz and 6 GHz bands. The WCN6855 firmware
caches the RNR IE (Reduced Neighbor Report element) found in the
2.4 GHz/5 GHz beacons of an AP that is co-located with a 6 GHz AP,
and uses that cache for the 6 GHz scan if the 6 GHz scan is part of
the same scan as the 2.4 GHz/5 GHz bands. This helps to find more
6 GHz APs. It also reduces the scan time, because the firmware uses
a dual-band scan for 2.4 GHz/5 GHz, which means the 2.4 GHz and
5 GHz scans run simultaneously.

Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855, since
it supports 2.4 GHz/5 GHz/6 GHz and is a single pdev, which means
that 2.4 GHz/5 GHz/6 GHz all exist in the same wiphy/ieee80211_hw.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/mac.c | 3 +++
 1 file changed, 3 insertions(+)