diff mbox series

[v2,3/3] wifi: ath12k: Add lock to protect the hardware state

Message ID 20240424065646.1666166-4-quic_periyasa@quicinc.com
State New
Headers show
Series wifi: ath12k: Refactor the hardware recovery procedures | expand

Commit Message

Karthikeyan Periyasamy April 24, 2024, 6:56 a.m. UTC
Currently, hardware state is not protected across the reconfigure
operations. However, in single wiphy models, multiple radio/links is
exposed as a MAC hardware (ieee80211_hw) through the driver hardware
abstraction (ath12k_hw) layer. In such scenario, we need to protect
hardware state across the multiple radio/link at the driver hardware
abstraction (ath12k_hw) layer. Therefore, introduce a new mutex in
the ath12k_hw layer.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
---
 drivers/net/wireless/ath/ath12k/core.c |  4 ++++
 drivers/net/wireless/ath/ath12k/core.h |  6 ++++++
 drivers/net/wireless/ath/ath12k/mac.c  | 29 ++++++++++++++++++++++++--
 3 files changed, 37 insertions(+), 2 deletions(-)

Comments

Johannes Berg April 24, 2024, 7:28 a.m. UTC | #1
On Wed, 2024-04-24 at 12:26 +0530, Karthikeyan Periyasamy wrote:
> Currently, hardware state is not protected across the reconfigure
> operations. However, in single wiphy models, multiple radio/links is
> exposed as a MAC hardware (ieee80211_hw) through the driver hardware
> abstraction (ath12k_hw) layer. In such scenario, we need to protect
> hardware state across the multiple radio/link at the driver hardware
> abstraction (ath12k_hw) layer. Therefore, introduce a new mutex in
> the ath12k_hw layer.
> 

It's your driver, but ... it would seem _simpler_ to do locking across
the hw with a single wiphy model, because everything (except currently
for ath12k_core_reset and ath12k_core_restart) already holds the wiphy
mutex. You can probably move those to wiphy work.

I'd avoid doing lock explosion like we had in mac80211, it's going to
come back and bite you eventually :)

johannes
Kalle Valo April 24, 2024, 9:39 a.m. UTC | #2
Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2024-04-24 at 12:26 +0530, Karthikeyan Periyasamy wrote:
>> Currently, hardware state is not protected across the reconfigure
>> operations. However, in single wiphy models, multiple radio/links is
>> exposed as a MAC hardware (ieee80211_hw) through the driver hardware
>> abstraction (ath12k_hw) layer. In such scenario, we need to protect
>> hardware state across the multiple radio/link at the driver hardware
>> abstraction (ath12k_hw) layer. Therefore, introduce a new mutex in
>> the ath12k_hw layer.
>> 
>
> It's your driver, but ... it would seem _simpler_ to do locking across
> the hw with a single wiphy model, because everything (except currently
> for ath12k_core_reset and ath12k_core_restart) already holds the wiphy
> mutex. You can probably move those to wiphy work.
>
> I'd avoid doing lock explosion like we had in mac80211, it's going to
> come back and bite you eventually :)

Exactly. Swithing to use wiphy_lock() definitely one of my longterm
goals in ath12k. I already started working on switching ath12k to use
wiphy_lock() but IIRC I got blocked by not being able to call
wiphy_delayed_work_cancel() without taking wiphy_lock. One way to solve
that might be to convert WMI event handling to use wiphy work but didn't
have the time to work on that so I put it on the backborner. I should
resurrect that branch and submit it as RFC.

But for the time being I think we need to add another mutex to ath12k, I
don't want to block Karthikeyan for too long.
Kalle Valo April 24, 2024, 9:50 a.m. UTC | #3
Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2024-04-24 at 12:39 +0300, Kalle Valo wrote:
>> 
>> Exactly. Swithing to use wiphy_lock() definitely one of my longterm
>> goals in ath12k. I already started working on switching ath12k to use
>> wiphy_lock() but IIRC I got blocked by not being able to call
>> wiphy_delayed_work_cancel() without taking wiphy_lock.
>
> That's because I didn't want to have an async version, but theoretically
> we could have a version of it that just cancels the timer. If you don't
> hold the wiphy mutex already you could even wait for it to finish. It
> just adds more complexity there, and I was trying to make it all a lot
> more obvious :)

Yeah, understandable. I think changing ath12k WMI event handling to use
wiphy_work makes sense, running them in tasklet context feels overkill.
I just need to write the patch :) But after that I hope we can switch to
using wiphy mutex.
Kalle Valo April 24, 2024, 12:16 p.m. UTC | #4
Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2024-04-24 at 12:50 +0300, Kalle Valo wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > On Wed, 2024-04-24 at 12:39 +0300, Kalle Valo wrote:
>> > > 
>> > > Exactly. Swithing to use wiphy_lock() definitely one of my longterm
>> > > goals in ath12k. I already started working on switching ath12k to use
>> > > wiphy_lock() but IIRC I got blocked by not being able to call
>> > > wiphy_delayed_work_cancel() without taking wiphy_lock.
>> > 
>> > That's because I didn't want to have an async version, but theoretically
>> > we could have a version of it that just cancels the timer. If you don't
>> > hold the wiphy mutex already you could even wait for it to finish. It
>> > just adds more complexity there, and I was trying to make it all a lot
>> > more obvious :)
>> 
>> Yeah, understandable. I think changing ath12k WMI event handling to use
>> wiphy_work makes sense, running them in tasklet context feels overkill.
>
> Maybe. Note that the wiphy lock _can_ create delays etc. here, if you're
> doing other configuration work at the same time, or similar. Though I
> guess eventually you need locking there anyway, to access the HW. So it
> can be a bit of a trade-off.
>
> I expect this will evolve over time even in mac80211, though so far we
> haven't seen any bad effect from it.

Good point. And now that I think of this more I don't think we can use
wiphy_work with WMI events as we are waiting some events while holding
wiphy_lock, that would end up as deadlock. So most likely we need to use
a normal workqueue with WMI events and take wiphy_lock manually in
certain cases. But I'll need to investigate more and do some
experiments.
Karthikeyan Periyasamy April 25, 2024, 6:58 a.m. UTC | #5
On 4/25/2024 2:11 AM, Jeff Johnson wrote:
> On 4/23/2024 11:56 PM, Karthikeyan Periyasamy wrote:
>> Currently, hardware state is not protected across the reconfigure
>> operations. However, in single wiphy models, multiple radio/links is
>> exposed as a MAC hardware (ieee80211_hw) through the driver hardware
>> abstraction (ath12k_hw) layer. In such scenario, we need to protect
>> hardware state across the multiple radio/link at the driver hardware
>> abstraction (ath12k_hw) layer. Therefore, introduce a new mutex in
>> the ath12k_hw layer.
>>
>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
>>
>> Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
>> ---
>>   drivers/net/wireless/ath/ath12k/core.c |  4 ++++
>>   drivers/net/wireless/ath/ath12k/core.h |  6 ++++++
>>   drivers/net/wireless/ath/ath12k/mac.c  | 29 ++++++++++++++++++++++++--
>>   3 files changed, 37 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
>> index a685cfd6fd92..e9aabdb9341c 100644
>> --- a/drivers/net/wireless/ath/ath12k/core.c
>> +++ b/drivers/net/wireless/ath/ath12k/core.c
>> @@ -1048,6 +1048,8 @@ static void ath12k_core_post_reconfigure_recovery(struct ath12k_base *ab)
>>   		if (!ah || ah->state == ATH12K_HW_STATE_OFF)
>>   			continue;
>>   
>> +		mutex_lock(&ah->hw_mutex);
>> +
>>   		switch (ah->state) {
>>   		case ATH12K_HW_STATE_ON:
>>   			ah->state = ATH12K_HW_STATE_RESTARTING;
>> @@ -1078,6 +1080,8 @@ static void ath12k_core_post_reconfigure_recovery(struct ath12k_base *ab)
>>   				    "device is wedged, will not restart hw %d\n", i);
>>   			break;
>>   		}
>> +
>> +		mutex_unlock(&ah->hw_mutex);
>>   	}
>>   
>>   	complete(&ab->driver_recovery);
>> diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
>> index eff1093fbb6e..3e3e157b6c56 100644
>> --- a/drivers/net/wireless/ath/ath12k/core.h
>> +++ b/drivers/net/wireless/ath/ath12k/core.h
>> @@ -634,11 +634,17 @@ struct ath12k {
>>   struct ath12k_hw {
>>   	struct ieee80211_hw *hw;
>>   	struct ath12k_base *ab;
>> +
>> +	/* Protect the write operation of the hardware state ath12k_hw::state
>> +	 * between hardware start<=>reconfigure<=>stop transitions.
>> +	 */
>> +	struct mutex hw_mutex;
>>   	enum ath12k_hw_state state;
>>   	bool regd_updated;
>>   	bool use_6ghz_regd;
>>   	u8 num_radio;
>>   
>> +	/* Keep last */
>>   	struct ath12k radio[] __aligned(sizeof(void *));
>>   };
>>   
>> diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
>> index 8b003c18796d..3dc95d36b6a2 100644
>> --- a/drivers/net/wireless/ath/ath12k/mac.c
>> +++ b/drivers/net/wireless/ath/ath12k/mac.c
>> @@ -5546,10 +5546,13 @@ static void ath12k_mac_wait_reconfigure(struct ath12k_base *ab)
>>   
>>   static int ath12k_mac_start(struct ath12k *ar)
>>   {
>> +	struct ath12k_hw *ah = ar->ah;
>>   	struct ath12k_base *ab = ar->ab;
>>   	struct ath12k_pdev *pdev = ar->pdev;
>>   	int ret;
>>   
>> +	lockdep_assert_held(&ah->hw_mutex);
>> +
>>   	mutex_lock(&ar->conf_mutex);
>>   
>>   	ret = ath12k_wmi_pdev_set_param(ar, WMI_PDEV_PARAM_PMF_QOS,
>> @@ -5664,6 +5667,8 @@ static int ath12k_mac_op_start(struct ieee80211_hw *hw)
>>   
>>   	ath12k_drain_tx(ah);
>>   
>> +	mutex_lock(&ah->hw_mutex);
> 
> since this is always unlocked just before we return, it is a good candidate
> for the cleanup.h functionality. just change this to:
> 	guard(mutex)(&ah->hw_mutex);
> 
> and remove all of the other edits to this function. the mutex will always be
> unlocked when the function returns
> 

sure, will address this comment in the next version of the patch.


>> +
>>   	switch (ah->state) {
>>   	case ATH12K_HW_STATE_OFF:
>>   		ah->state = ATH12K_HW_STATE_ON;
>> @@ -5678,7 +5683,8 @@ static int ath12k_mac_op_start(struct ieee80211_hw *hw)
>>   		ah->state = ATH12K_HW_STATE_OFF;
>>   
>>   		WARN_ON(1);
>> -		return -EINVAL;
>> +		ret = -EINVAL;
>> +		goto err;
>>   	}
>>   
>>   	for_each_ar(ah, ar, i) {
>> @@ -5692,6 +5698,8 @@ static int ath12k_mac_op_start(struct ieee80211_hw *hw)
>>   		}
>>   	}
>>   
>> +	mutex_unlock(&ah->hw_mutex);
>> +
>>   	return 0;
>>   
>>   fail_start:
>> @@ -5700,6 +5708,8 @@ static int ath12k_mac_op_start(struct ieee80211_hw *hw)
>>   		ath12k_mac_stop(ar);
>>   	}
>>   
>> +err:
>> +	mutex_unlock(&ah->hw_mutex);
>>   	return ret;
>>   }
>>   
>> @@ -5762,9 +5772,12 @@ int ath12k_mac_rfkill_enable_radio(struct ath12k *ar, bool enable)
>>   
>>   static void ath12k_mac_stop(struct ath12k *ar)
>>   {
>> +	struct ath12k_hw *ah = ar->ah;
>>   	struct htt_ppdu_stats_info *ppdu_stats, *tmp;
>>   	int ret;
>>   
>> +	lockdep_assert_held(&ah->hw_mutex);
>> +
>>   	mutex_lock(&ar->conf_mutex);
>>   	ret = ath12k_mac_config_mon_status_default(ar, false);
>>   	if (ret && (ret != -EOPNOTSUPP))
>> @@ -5800,10 +5813,14 @@ static void ath12k_mac_op_stop(struct ieee80211_hw *hw)
>>   
>>   	ath12k_drain_tx(ah);
>>   
>> +	mutex_lock(&ah->hw_mutex);
>> +
>>   	ah->state = ATH12K_HW_STATE_OFF;
>>   
>>   	for_each_ar(ah, ar, i)
>>   		ath12k_mac_stop(ar);
>> +
>> +	mutex_unlock(&ah->hw_mutex);
>>   }
>>   
>>   static u8
>> @@ -7848,8 +7865,12 @@ ath12k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
>>   	if (reconfig_type != IEEE80211_RECONFIG_TYPE_RESTART)
>>   		return;
>>   
>> -	if (ah->state != ATH12K_HW_STATE_RESTARTED)
>> +	mutex_lock(&ah->hw_mutex);
> 
> another good candidate for
> 	guard(mutex)(&ah->hw_mutex);

sure, will address this comment in the next version of the patch.

> 
>> +
>> +	if (ah->state != ATH12K_HW_STATE_RESTARTED) {
>> +		mutex_unlock(&ah->hw_mutex);
>>   		return;
>> +	}
>>   
>>   	ah->state = ATH12K_HW_STATE_ON;
>>   	ieee80211_wake_queues(hw);
>> @@ -7904,6 +7925,8 @@ ath12k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
>>   
>>   		mutex_unlock(&ar->conf_mutex);
>>   	}
>> +
>> +	mutex_unlock(&ah->hw_mutex);
>>   }
>>   
>>   static void
>> @@ -8850,6 +8873,8 @@ static struct ath12k_hw *ath12k_mac_hw_allocate(struct ath12k_base *ab,
>>   	ah->ab = ab;
>>   	ah->num_radio = num_pdev_map;
>>   
>> +	mutex_init(&ah->hw_mutex);
>> +
>>   	for (i = 0; i < num_pdev_map; i++) {
>>   		ab = pdev_map[i].ab;
>>   		pdev_idx = pdev_map[i].pdev_idx;
>
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
index a685cfd6fd92..e9aabdb9341c 100644
--- a/drivers/net/wireless/ath/ath12k/core.c
+++ b/drivers/net/wireless/ath/ath12k/core.c
@@ -1048,6 +1048,8 @@  static void ath12k_core_post_reconfigure_recovery(struct ath12k_base *ab)
 		if (!ah || ah->state == ATH12K_HW_STATE_OFF)
 			continue;
 
+		mutex_lock(&ah->hw_mutex);
+
 		switch (ah->state) {
 		case ATH12K_HW_STATE_ON:
 			ah->state = ATH12K_HW_STATE_RESTARTING;
@@ -1078,6 +1080,8 @@  static void ath12k_core_post_reconfigure_recovery(struct ath12k_base *ab)
 				    "device is wedged, will not restart hw %d\n", i);
 			break;
 		}
+
+		mutex_unlock(&ah->hw_mutex);
 	}
 
 	complete(&ab->driver_recovery);
diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
index eff1093fbb6e..3e3e157b6c56 100644
--- a/drivers/net/wireless/ath/ath12k/core.h
+++ b/drivers/net/wireless/ath/ath12k/core.h
@@ -634,11 +634,17 @@  struct ath12k {
 struct ath12k_hw {
 	struct ieee80211_hw *hw;
 	struct ath12k_base *ab;
+
+	/* Protect the write operation of the hardware state ath12k_hw::state
+	 * between hardware start<=>reconfigure<=>stop transitions.
+	 */
+	struct mutex hw_mutex;
 	enum ath12k_hw_state state;
 	bool regd_updated;
 	bool use_6ghz_regd;
 	u8 num_radio;
 
+	/* Keep last */
 	struct ath12k radio[] __aligned(sizeof(void *));
 };
 
diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
index 8b003c18796d..3dc95d36b6a2 100644
--- a/drivers/net/wireless/ath/ath12k/mac.c
+++ b/drivers/net/wireless/ath/ath12k/mac.c
@@ -5546,10 +5546,13 @@  static void ath12k_mac_wait_reconfigure(struct ath12k_base *ab)
 
 static int ath12k_mac_start(struct ath12k *ar)
 {
+	struct ath12k_hw *ah = ar->ah;
 	struct ath12k_base *ab = ar->ab;
 	struct ath12k_pdev *pdev = ar->pdev;
 	int ret;
 
+	lockdep_assert_held(&ah->hw_mutex);
+
 	mutex_lock(&ar->conf_mutex);
 
 	ret = ath12k_wmi_pdev_set_param(ar, WMI_PDEV_PARAM_PMF_QOS,
@@ -5664,6 +5667,8 @@  static int ath12k_mac_op_start(struct ieee80211_hw *hw)
 
 	ath12k_drain_tx(ah);
 
+	mutex_lock(&ah->hw_mutex);
+
 	switch (ah->state) {
 	case ATH12K_HW_STATE_OFF:
 		ah->state = ATH12K_HW_STATE_ON;
@@ -5678,7 +5683,8 @@  static int ath12k_mac_op_start(struct ieee80211_hw *hw)
 		ah->state = ATH12K_HW_STATE_OFF;
 
 		WARN_ON(1);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err;
 	}
 
 	for_each_ar(ah, ar, i) {
@@ -5692,6 +5698,8 @@  static int ath12k_mac_op_start(struct ieee80211_hw *hw)
 		}
 	}
 
+	mutex_unlock(&ah->hw_mutex);
+
 	return 0;
 
 fail_start:
@@ -5700,6 +5708,8 @@  static int ath12k_mac_op_start(struct ieee80211_hw *hw)
 		ath12k_mac_stop(ar);
 	}
 
+err:
+	mutex_unlock(&ah->hw_mutex);
 	return ret;
 }
 
@@ -5762,9 +5772,12 @@  int ath12k_mac_rfkill_enable_radio(struct ath12k *ar, bool enable)
 
 static void ath12k_mac_stop(struct ath12k *ar)
 {
+	struct ath12k_hw *ah = ar->ah;
 	struct htt_ppdu_stats_info *ppdu_stats, *tmp;
 	int ret;
 
+	lockdep_assert_held(&ah->hw_mutex);
+
 	mutex_lock(&ar->conf_mutex);
 	ret = ath12k_mac_config_mon_status_default(ar, false);
 	if (ret && (ret != -EOPNOTSUPP))
@@ -5800,10 +5813,14 @@  static void ath12k_mac_op_stop(struct ieee80211_hw *hw)
 
 	ath12k_drain_tx(ah);
 
+	mutex_lock(&ah->hw_mutex);
+
 	ah->state = ATH12K_HW_STATE_OFF;
 
 	for_each_ar(ah, ar, i)
 		ath12k_mac_stop(ar);
+
+	mutex_unlock(&ah->hw_mutex);
 }
 
 static u8
@@ -7848,8 +7865,12 @@  ath12k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
 	if (reconfig_type != IEEE80211_RECONFIG_TYPE_RESTART)
 		return;
 
-	if (ah->state != ATH12K_HW_STATE_RESTARTED)
+	mutex_lock(&ah->hw_mutex);
+
+	if (ah->state != ATH12K_HW_STATE_RESTARTED) {
+		mutex_unlock(&ah->hw_mutex);
 		return;
+	}
 
 	ah->state = ATH12K_HW_STATE_ON;
 	ieee80211_wake_queues(hw);
@@ -7904,6 +7925,8 @@  ath12k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
 
 		mutex_unlock(&ar->conf_mutex);
 	}
+
+	mutex_unlock(&ah->hw_mutex);
 }
 
 static void
@@ -8850,6 +8873,8 @@  static struct ath12k_hw *ath12k_mac_hw_allocate(struct ath12k_base *ab,
 	ah->ab = ab;
 	ah->num_radio = num_pdev_map;
 
+	mutex_init(&ah->hw_mutex);
+
 	for (i = 0; i < num_pdev_map; i++) {
 		ab = pdev_map[i].ab;
 		pdev_idx = pdev_map[i].pdev_idx;