diff mbox series

[05/10] wifi: ath12k: fix SLUB BUG - Object already free in ath12k_reg_free()

Message ID 20250109-fix_reboot_issues_with_hw_grouping-v1-5-fb39ec03451e@quicinc.com
State New
Headers show
Series wifi: ath12k: fixes for rmmod and recovery issues with hardware grouping | expand

Commit Message

Aditya Kumar Singh Jan. 9, 2025, 4:25 a.m. UTC
During rmmod of ath12k module with SLUB debug enabled, following print is
seen -

=============================================================================
BUG kmalloc-1k (Not tainted): Object already free
-----------------------------------------------------------------------------

Allocated in ath12k_reg_build_regd+0x94/0xa20 [ath12k] age=10470 cpu=0 pid=0
 __kmalloc_noprof+0xf4/0x368
 ath12k_reg_build_regd+0x94/0xa20 [ath12k]
 ath12k_wmi_op_rx+0x199c/0x2c14 [ath12k]
 ath12k_htc_rx_completion_handler+0x398/0x554 [ath12k]
 ath12k_ce_per_engine_service+0x248/0x368 [ath12k]
 ath12k_pci_ce_workqueue+0x28/0x50 [ath12k]
 process_one_work+0x14c/0x28c
 bh_worker+0x22c/0x27c
 workqueue_softirq_action+0x80/0x90
 tasklet_action+0x14/0x3c
 handle_softirqs+0x108/0x240
 __do_softirq+0x14/0x20
Freed in ath12k_reg_free+0x40/0x74 [ath12k] age=136 cpu=2 pid=166
 kfree+0x148/0x248
 ath12k_reg_free+0x40/0x74 [ath12k]
 ath12k_core_hw_group_destroy+0x68/0xac [ath12k]
 ath12k_core_deinit+0xd8/0x124 [ath12k]
 ath12k_pci_remove+0x6c/0x130 [ath12k]
 pci_device_remove+0x44/0xe8
 device_remove+0x4c/0x80
 device_release_driver_internal+0x1d0/0x22c
 driver_detach+0x50/0x98
 bus_remove_driver+0x70/0xf4
 driver_unregister+0x30/0x60
 pci_unregister_driver+0x24/0x9c
 ath12k_pci_exit+0x18/0x24 [ath12k]
 __arm64_sys_delete_module+0x1a0/0x2a8
 invoke_syscall+0x48/0x110
 el0_svc_common.constprop.0+0x40/0xe0
Slab 0xfffffdffc0033600 objects=10 used=6 fp=0xffff000000cdcc00 flags=0x3fffe0000000240(workingset|head|node=0|zone=0|lastcpupid=0x1ffff)
Object 0xffff000000cdcc00 @offset=19456 fp=0xffff000000cde400
[...]

This issue arises because in ath12k_core_hw_group_destroy(), each device
calls ath12k_core_soc_destroy() for itself and all its partners within the
same group. Since ath12k_core_hw_group_destroy() is invoked for each
device, this results in a double free condition, eventually causing the
SLUB bug.

To resolve this, a new member regd_freed is introduced in the ath12k_base
object. Once regd is freed, regd_freed is set to true. This ensures that
in the removal context of other devices, regd is not freed again if
regd_freed is already true. And since there could be a race condition to
read this member, guard ath12k_core_soc_destroy() with the mutext lock.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1

Fixes: 6f245ea0ec6c ("wifi: ath12k: introduce device group abstraction")
Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
---
 drivers/net/wireless/ath/ath12k/core.c | 2 ++
 drivers/net/wireless/ath/ath12k/core.h | 1 +
 drivers/net/wireless/ath/ath12k/reg.c  | 8 +++++++-
 drivers/net/wireless/ath/ath12k/wmi.c  | 4 +++-
 4 files changed, 13 insertions(+), 2 deletions(-)

Comments

Jeff Johnson Jan. 13, 2025, 7:21 p.m. UTC | #1
On 1/8/2025 8:25 PM, Aditya Kumar Singh wrote:
> During rmmod of ath12k module with SLUB debug enabled, following print is
> seen -
> 
> =============================================================================
> BUG kmalloc-1k (Not tainted): Object already free
> -----------------------------------------------------------------------------
> 
> Allocated in ath12k_reg_build_regd+0x94/0xa20 [ath12k] age=10470 cpu=0 pid=0
>  __kmalloc_noprof+0xf4/0x368
>  ath12k_reg_build_regd+0x94/0xa20 [ath12k]
>  ath12k_wmi_op_rx+0x199c/0x2c14 [ath12k]
>  ath12k_htc_rx_completion_handler+0x398/0x554 [ath12k]
>  ath12k_ce_per_engine_service+0x248/0x368 [ath12k]
>  ath12k_pci_ce_workqueue+0x28/0x50 [ath12k]
>  process_one_work+0x14c/0x28c
>  bh_worker+0x22c/0x27c
>  workqueue_softirq_action+0x80/0x90
>  tasklet_action+0x14/0x3c
>  handle_softirqs+0x108/0x240
>  __do_softirq+0x14/0x20
> Freed in ath12k_reg_free+0x40/0x74 [ath12k] age=136 cpu=2 pid=166
>  kfree+0x148/0x248
>  ath12k_reg_free+0x40/0x74 [ath12k]
>  ath12k_core_hw_group_destroy+0x68/0xac [ath12k]
>  ath12k_core_deinit+0xd8/0x124 [ath12k]
>  ath12k_pci_remove+0x6c/0x130 [ath12k]
>  pci_device_remove+0x44/0xe8
>  device_remove+0x4c/0x80
>  device_release_driver_internal+0x1d0/0x22c
>  driver_detach+0x50/0x98
>  bus_remove_driver+0x70/0xf4
>  driver_unregister+0x30/0x60
>  pci_unregister_driver+0x24/0x9c
>  ath12k_pci_exit+0x18/0x24 [ath12k]
>  __arm64_sys_delete_module+0x1a0/0x2a8
>  invoke_syscall+0x48/0x110
>  el0_svc_common.constprop.0+0x40/0xe0
> Slab 0xfffffdffc0033600 objects=10 used=6 fp=0xffff000000cdcc00 flags=0x3fffe0000000240(workingset|head|node=0|zone=0|lastcpupid=0x1ffff)
> Object 0xffff000000cdcc00 @offset=19456 fp=0xffff000000cde400
> [...]
> 
> This issue arises because in ath12k_core_hw_group_destroy(), each device
> calls ath12k_core_soc_destroy() for itself and all its partners within the
> same group. Since ath12k_core_hw_group_destroy() is invoked for each
> device, this results in a double free condition, eventually causing the
> SLUB bug.
> 
> To resolve this, a new member regd_freed is introduced in the ath12k_base
> object. Once regd is freed, regd_freed is set to true. This ensures that
> in the removal context of other devices, regd is not freed again if
> regd_freed is already true. And since there could be a race condition to
> read this member, guard ath12k_core_soc_destroy() with the mutext lock.
> 
> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
> 
> Fixes: 6f245ea0ec6c ("wifi: ath12k: introduce device group abstraction")
> Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com>
> ---
>  drivers/net/wireless/ath/ath12k/core.c | 2 ++
>  drivers/net/wireless/ath/ath12k/core.h | 1 +
>  drivers/net/wireless/ath/ath12k/reg.c  | 8 +++++++-
>  drivers/net/wireless/ath/ath12k/wmi.c  | 4 +++-
>  4 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
> index 299d7686616b78752164d9cb064c1805af9a1155..72e6e3a0cf7be03b20b7421866c479dfcb8038ff 100644
> --- a/drivers/net/wireless/ath/ath12k/core.c
> +++ b/drivers/net/wireless/ath/ath12k/core.c
> @@ -1757,7 +1757,9 @@ static void ath12k_core_hw_group_destroy(struct ath12k_hw_group *ag)
>  		if (!ab)
>  			continue;
>  
> +		mutex_lock(&ab->core_lock);
>  		ath12k_core_soc_destroy(ab);
> +		mutex_unlock(&ab->core_lock);
>  	}
>  	mutex_unlock(&ag->mutex);
>  }
> diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
> index 58ebc56991af99de08e8ed783e98f742a687eddf..cc1bfcc1e65c87e30d86dad4c0bcd1905e6a2f51 100644
> --- a/drivers/net/wireless/ath/ath12k/core.h
> +++ b/drivers/net/wireless/ath/ath12k/core.h
> @@ -961,6 +961,7 @@ struct ath12k_base {
>  	 * This may or may not be used during the runtime
>  	 */
>  	struct ieee80211_regdomain *new_regd[MAX_RADIOS];
> +	bool regd_freed;
>  
>  	/* Current DFS Regulatory */
>  	enum ath12k_dfs_region dfs_region;
> diff --git a/drivers/net/wireless/ath/ath12k/reg.c b/drivers/net/wireless/ath/ath12k/reg.c
> index 439d61f284d89222e79c05d6cff8e85d0d315aad..b4d7fa1a04ca0e72728e8989c29b82d089171fc2 100644
> --- a/drivers/net/wireless/ath/ath12k/reg.c
> +++ b/drivers/net/wireless/ath/ath12k/reg.c
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: BSD-3-Clause-Clear
>  /*
>   * Copyright (c) 2018-2021 The Linux Foundation. All rights reserved.
> - * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2021-2025 Qualcomm Innovation Center, Inc. All rights reserved.
>   */
>  #include <linux/rtnetlink.h>
>  #include "core.h"
> @@ -777,8 +777,14 @@ void ath12k_reg_free(struct ath12k_base *ab)
>  {
>  	int i;
>  
> +	if (ab->regd_freed)
> +		return;
> +
>  	for (i = 0; i < ab->hw_params->max_radios; i++) {
>  		kfree(ab->default_regd[i]);
>  		kfree(ab->new_regd[i]);
> +		ab->default_regd[i] = NULL;
> +		ab->new_regd[i] = NULL;
> +		ab->regd_freed = true;

since it is loop invariant, should this last assignment be outside the loop,
either before or after the loop?

but then again, why is a flag needed since setting the pointers to NULL should
already show they are freed, and any race conditions with those pointers would
also exist with the new flag (which you have addressed with the locking change).

>  	}
>  }
> diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c
> index 4dd6cdf84571d3652cd03281ffa6486e3d340c42..1de6ed6cceaee3a22de63a2369358fe53fb0d638 100644
> --- a/drivers/net/wireless/ath/ath12k/wmi.c
> +++ b/drivers/net/wireless/ath/ath12k/wmi.c
> @@ -1,7 +1,7 @@
>  // SPDX-License-Identifier: BSD-3-Clause-Clear
>  /*
>   * Copyright (c) 2018-2021 The Linux Foundation. All rights reserved.
> - * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2021-2025 Qualcomm Innovation Center, Inc. All rights reserved.
>   */
>  #include <linux/skbuff.h>
>  #include <linux/ctype.h>
> @@ -5950,6 +5950,8 @@ static int ath12k_reg_chan_list_event(struct ath12k_base *ab, struct sk_buff *sk
>  		/* This regd would be applied during mac registration */
>  		ab->default_regd[pdev_idx] = regd;
>  	}
> +
> +	ab->regd_freed = false;
>  	ab->dfs_region = reg_info->dfs_region;
>  	spin_unlock(&ab->base_lock);
>  
>
Aditya Kumar Singh Jan. 20, 2025, 8:38 a.m. UTC | #2
On 1/14/25 00:51, Jeff Johnson wrote:
>> diff --git a/drivers/net/wireless/ath/ath12k/reg.c b/drivers/net/wireless/ath/ath12k/reg.c
>> index 439d61f284d89222e79c05d6cff8e85d0d315aad..b4d7fa1a04ca0e72728e8989c29b82d089171fc2 100644
>> --- a/drivers/net/wireless/ath/ath12k/reg.c
>> +++ b/drivers/net/wireless/ath/ath12k/reg.c
>> @@ -1,7 +1,7 @@
>>   // SPDX-License-Identifier: BSD-3-Clause-Clear
>>   /*
>>    * Copyright (c) 2018-2021 The Linux Foundation. All rights reserved.
>> - * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2021-2025 Qualcomm Innovation Center, Inc. All rights reserved.
>>    */
>>   #include <linux/rtnetlink.h>
>>   #include "core.h"
>> @@ -777,8 +777,14 @@ void ath12k_reg_free(struct ath12k_base *ab)
>>   {
>>   	int i;
>>   
>> +	if (ab->regd_freed)
>> +		return;
>> +
>>   	for (i = 0; i < ab->hw_params->max_radios; i++) {
>>   		kfree(ab->default_regd[i]);
>>   		kfree(ab->new_regd[i]);
>> +		ab->default_regd[i] = NULL;
>> +		ab->new_regd[i] = NULL;
>> +		ab->regd_freed = true;
> since it is loop invariant, should this last assignment be outside the loop,
> either before or after the loop?
> 
> but then again, why is a flag needed since setting the pointers to NULL should
> already show they are freed, and any race conditions with those pointers would
> also exist with the new flag (which you have addressed with the locking change).

Well, looks like, this flag is not needed. I will remove this in next 
version. Thanks for pointing it out!
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
index 299d7686616b78752164d9cb064c1805af9a1155..72e6e3a0cf7be03b20b7421866c479dfcb8038ff 100644
--- a/drivers/net/wireless/ath/ath12k/core.c
+++ b/drivers/net/wireless/ath/ath12k/core.c
@@ -1757,7 +1757,9 @@  static void ath12k_core_hw_group_destroy(struct ath12k_hw_group *ag)
 		if (!ab)
 			continue;
 
+		mutex_lock(&ab->core_lock);
 		ath12k_core_soc_destroy(ab);
+		mutex_unlock(&ab->core_lock);
 	}
 	mutex_unlock(&ag->mutex);
 }
diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
index 58ebc56991af99de08e8ed783e98f742a687eddf..cc1bfcc1e65c87e30d86dad4c0bcd1905e6a2f51 100644
--- a/drivers/net/wireless/ath/ath12k/core.h
+++ b/drivers/net/wireless/ath/ath12k/core.h
@@ -961,6 +961,7 @@  struct ath12k_base {
 	 * This may or may not be used during the runtime
 	 */
 	struct ieee80211_regdomain *new_regd[MAX_RADIOS];
+	bool regd_freed;
 
 	/* Current DFS Regulatory */
 	enum ath12k_dfs_region dfs_region;
diff --git a/drivers/net/wireless/ath/ath12k/reg.c b/drivers/net/wireless/ath/ath12k/reg.c
index 439d61f284d89222e79c05d6cff8e85d0d315aad..b4d7fa1a04ca0e72728e8989c29b82d089171fc2 100644
--- a/drivers/net/wireless/ath/ath12k/reg.c
+++ b/drivers/net/wireless/ath/ath12k/reg.c
@@ -1,7 +1,7 @@ 
 // SPDX-License-Identifier: BSD-3-Clause-Clear
 /*
  * Copyright (c) 2018-2021 The Linux Foundation. All rights reserved.
- * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2021-2025 Qualcomm Innovation Center, Inc. All rights reserved.
  */
 #include <linux/rtnetlink.h>
 #include "core.h"
@@ -777,8 +777,14 @@  void ath12k_reg_free(struct ath12k_base *ab)
 {
 	int i;
 
+	if (ab->regd_freed)
+		return;
+
 	for (i = 0; i < ab->hw_params->max_radios; i++) {
 		kfree(ab->default_regd[i]);
 		kfree(ab->new_regd[i]);
+		ab->default_regd[i] = NULL;
+		ab->new_regd[i] = NULL;
+		ab->regd_freed = true;
 	}
 }
diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c
index 4dd6cdf84571d3652cd03281ffa6486e3d340c42..1de6ed6cceaee3a22de63a2369358fe53fb0d638 100644
--- a/drivers/net/wireless/ath/ath12k/wmi.c
+++ b/drivers/net/wireless/ath/ath12k/wmi.c
@@ -1,7 +1,7 @@ 
 // SPDX-License-Identifier: BSD-3-Clause-Clear
 /*
  * Copyright (c) 2018-2021 The Linux Foundation. All rights reserved.
- * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2021-2025 Qualcomm Innovation Center, Inc. All rights reserved.
  */
 #include <linux/skbuff.h>
 #include <linux/ctype.h>
@@ -5950,6 +5950,8 @@  static int ath12k_reg_chan_list_event(struct ath12k_base *ab, struct sk_buff *sk
 		/* This regd would be applied during mac registration */
 		ab->default_regd[pdev_idx] = regd;
 	}
+
+	ab->regd_freed = false;
 	ab->dfs_region = reg_info->dfs_region;
 	spin_unlock(&ab->base_lock);