[0/5] ath10k fixes for warns

Message ID cover.1612915444.git.skhan@linuxfoundation.org

Message

Shuah Khan Feb. 10, 2021, 12:42 a.m. UTC
I have been seeing lockdep asserts for a couple of months and finally
found time to debug and fix the problems. The dmesg looks clean with
these fixes.

Enabling LOCKDEP and ATH10K_DEBUGFS triggers the lockdep assert and
RCU warns.

The first two patches in this series are fixes to lockdep assert and
RCU usage bugs.

The last patch (5/5) is a fix to reduce invalid ht params rate message
noise. Patch 3/5 changes a message from debug to warn. Patch 4/5 adds
a check to detect ath10k_drain_tx() calls made while holding conf_mutex.

Shuah Khan (5):
  ath10k: fix conf_mutex lock assert in ath10k_debug_fw_stats_request()
  ath10k: fix WARNING: suspicious RCU usage
  ath10k: change ath10k_offchan_tx_work() peer present msg to a warn
  ath10k: detect conf_mutex held ath10k_drain_tx() calls
  ath10k: reduce invalid ht params rate message noise

 drivers/net/wireless/ath/ath10k/mac.c     | 13 ++++++++-----
 drivers/net/wireless/ath/ath10k/wmi-tlv.c | 15 +++++++++++----
 2 files changed, 19 insertions(+), 9 deletions(-)

Comments

Shuah Khan Feb. 10, 2021, 4:13 p.m. UTC | #1
On 2/10/21 1:28 AM, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:
>
>> On 2021-02-10 08:42, Shuah Khan wrote:
>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>> messages,
>>> when it fails to find a match for mcs=7 and rate=1440.
>>>
>>> supported_ht_mcs_rate_nss2:
>>> {7,  {1300, 2700, 1444, 3000} }
>>>
>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>> mcs 7
>>>
>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>> instead.
>>>
>>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
>>> ---
>>>   drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>> index 3545ce7dce0a..276321f0cfdd 100644
>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>>> ath10k *ar, u32 rate, u8 nss, u8
>>>   		*bw |= RATE_INFO_BW_40;
>>>   		*flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>   	} else {
>>> -		ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>>> -			    rate, nss, mcs);
>>> +		dev_warn_once(ar->dev,
>>> +			      "invalid ht params rate %d 100kbps nss %d mcs %d",
>>> +			      rate, nss, mcs);
>>>   	}
>>>   }
>>
>> The {7,  {1300, 2700, 1444, 3000} } entry is a correct value.
>> The 1440 is reported by firmware; it is a wrong value, and it has been
>> fixed in firmware.
>
> In what version?
>


Here is the info:

ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id 
0x00340aff sub 17aa:0827

ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1 
api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1

ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b

ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp max-sta 
32 raw 0 hwcrypto 1

>> If it is changed to dev_warn_once, there will be no chance to find the
>> other wrong values reported by firmware, and it indicates
>> a wrong value to mac80211/cfg80211 and leads "iw wlan0 station dump"
>> to get a wrong bitrate.
>

Agreed.

> I agree, we should keep this warning. If the firmware still keeps
> sending invalid rates we should add a specific check to ignore the known
> invalid values, but not all of them.
>

Would it be helpful to adjust the default rate limits and set them to
a higher value instead? It might be difficult to account for all possible
invalid values.

Something like an ath10k_warn_ratelimited() to adjust the
DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
DEFINE_RATELIMIT_STATE.

Let me know if you like this idea. I can send in a patch to do this.
I will hang on to this firmware version for a little bit longer, so
we have a test case. :)
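
To make the interval/burst idea concrete, here is a minimal userspace sketch of the behaviour such a limiter would have. It is only loosely modeled on the kernel's DEFINE_RATELIMIT_STATE()/__ratelimit() pair and does not call them; the names warn_ratelimit, warn_ratelimit_ok() and count_emitted(), and the interval and burst values, are assumptions made up for illustration, not part of ath10k.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for a per-message rate limit state. */
struct warn_ratelimit {
	long interval;	/* window length, in caller-supplied time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
	bool started;	/* false until the first call opens a window */
};

/* Returns true when the caller may print a message at time 'now'. */
static bool warn_ratelimit_ok(struct warn_ratelimit *rs, long now)
{
	/* Open a new window on first use or once the interval has passed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/* Counts how many of the events at times start..end-1 would be printed. */
static int count_emitted(struct warn_ratelimit *rs, long start, long end)
{
	int emitted = 0;

	for (long t = start; t < end; t++)
		if (warn_ratelimit_ok(rs, t))
			emitted++;
	return emitted;
}
```

With an assumed interval of 60 and burst of 1, one event per time unit over 180 units produces three messages instead of 180, and an invalid value that first shows up later is still reported in its own window, which a warn-once approach would not do.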

thanks,
-- Shuah
Kalle Valo Feb. 11, 2021, 11:24 a.m. UTC | #2
Shuah Khan <skhan@linuxfoundation.org> writes:

> On 2/10/21 1:28 AM, Kalle Valo wrote:
>> Wen Gong <wgong@codeaurora.org> writes:
>>
>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>> messages,
>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>
>>>> supported_ht_mcs_rate_nss2:
>>>> {7,  {1300, 2700, 1444, 3000} }
>>>>
>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>> mcs 7
>>>>
>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>> instead.
>>>>
>>>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
>>>> ---
>>>>   drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>>   		*bw |= RATE_INFO_BW_40;
>>>>   		*flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>>   	} else {
>>>> -		ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>> -			    rate, nss, mcs);
>>>> +		dev_warn_once(ar->dev,
>>>> +			      "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>> +			      rate, nss, mcs);
>>>>   	}
>>>>   }
>>>
>>> The {7,  {1300, 2700, 1444, 3000} } entry is a correct value.
>>> The 1440 is reported by firmware; it is a wrong value, and it has been
>>> fixed in firmware.
>>
>> In what version?
>>
>
> Here is the info:
>
> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
> 0x00340aff sub 17aa:0827
>
> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1 
> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>
> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>
> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
> max-sta 32 raw 0 hwcrypto 1
>
>>> If it is changed to dev_warn_once, there will be no chance to find the
>>> other wrong values reported by firmware, and it indicates
>>> a wrong value to mac80211/cfg80211 and leads "iw wlan0 station dump"
>>> to get a wrong bitrate.
>>
>
> Agreed.
>
>> I agree, we should keep this warning. If the firmware still keeps
>> sending invalid rates we should add a specific check to ignore the known
>> invalid values, but not all of them.
>>
>
> Would it be helpful to adjust the default rate limits and set them to
> a higher value instead? It might be difficult to account for all possible
> invalid values.
>
> Something like an ath10k_warn_ratelimited() to adjust the
> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
> DEFINE_RATELIMIT_STATE.
>
> Let me know if you like this idea. I can send in a patch to do this.
> I will hang on to this firmware version for a little bit longer, so
> we have a test case. :)

I would rather first try to fix the root cause, which is the firmware
sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
you know which firmware version (and branch) has the fix?
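
For reference, the fallback discussed earlier in this thread (ignore only the known invalid values, but keep warning about any others) could look roughly like the following userspace sketch. The struct, table and function names are hypothetical and are not taken from ath10k; the single table entry is the nss 2, mcs 7, rate 1440 case reported above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry describing one rate the firmware is known to misreport. */
struct known_bad_ht_rate {
	unsigned char nss;
	unsigned char mcs;
	unsigned int rate;	/* in 100 kbps units, as in the warning text */
};

/* Only the value reported in this thread; other entries would be added
 * as they are confirmed. */
static const struct known_bad_ht_rate known_bad_ht_rates[] = {
	{ .nss = 2, .mcs = 7, .rate = 1440 },
};

/* Returns true when (rate, nss, mcs) matches a known bad firmware report. */
static bool ht_rate_is_known_bad(unsigned int rate, unsigned char nss,
				 unsigned char mcs)
{
	size_t n = sizeof(known_bad_ht_rates) / sizeof(known_bad_ht_rates[0]);

	for (size_t i = 0; i < n; i++) {
		const struct known_bad_ht_rate *e = &known_bad_ht_rates[i];

		if (e->nss == nss && e->mcs == mcs && e->rate == rate)
			return true;
	}
	return false;
}

/* Returns true when an unmatched rate should still produce a warning. */
static bool should_warn_invalid_ht_rate(unsigned int rate, unsigned char nss,
					unsigned char mcs)
{
	return !ht_rate_is_known_bad(rate, nss, mcs);
}
```

In this sketch only the exact known combination is silenced, so a different invalid rate, or the same rate with a different nss or mcs, would still be warned about.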
Shuah Khan Feb. 26, 2021, 6:01 p.m. UTC | #3
On 2/11/21 4:24 AM, Kalle Valo wrote:
> Shuah Khan <skhan@linuxfoundation.org> writes:
> 
>> On 2/10/21 1:28 AM, Kalle Valo wrote:
>>> Wen Gong <wgong@codeaurora.org> writes:
>>>
>>>> On 2021-02-10 08:42, Shuah Khan wrote:
>>>>> ath10k_mac_get_rate_flags_ht() floods dmesg with the following
>>>>> messages,
>>>>> when it fails to find a match for mcs=7 and rate=1440.
>>>>>
>>>>> supported_ht_mcs_rate_nss2:
>>>>> {7,  {1300, 2700, 1444, 3000} }
>>>>>
>>>>> ath10k_pci 0000:02:00.0: invalid ht params rate 1440 100kbps nss 2
>>>>> mcs 7
>>>>>
>>>>> dev_warn_ratelimited() isn't helping the noise. Use dev_warn_once()
>>>>> instead.
>>>>>
>>>>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
>>>>> ---
>>>>>    drivers/net/wireless/ath/ath10k/mac.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c
>>>>> b/drivers/net/wireless/ath/ath10k/mac.c
>>>>> index 3545ce7dce0a..276321f0cfdd 100644
>>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>>>>> @@ -8970,8 +8970,9 @@ static void ath10k_mac_get_rate_flags_ht(struct
>>>>> ath10k *ar, u32 rate, u8 nss, u8
>>>>>    		*bw |= RATE_INFO_BW_40;
>>>>>    		*flags |= RATE_INFO_FLAGS_SHORT_GI;
>>>>>    	} else {
>>>>> -		ath10k_warn(ar, "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>>> -			    rate, nss, mcs);
>>>>> +		dev_warn_once(ar->dev,
>>>>> +			      "invalid ht params rate %d 100kbps nss %d mcs %d",
>>>>> +			      rate, nss, mcs);
>>>>>    	}
>>>>>    }
>>>>
>>>> The {7,  {1300, 2700, 1444, 3000} } entry is a correct value.
>>>> The 1440 is reported by firmware; it is a wrong value, and it has been
>>>> fixed in firmware.
>>>
>>> In what version?
>>>
>>
>> Here is the info:
>>
>> ath10k_pci 0000:02:00.0: qca6174 hw3.2 target 0x05030000 chip_id
>> 0x00340aff sub 17aa:0827
>>
>> ath10k_pci 0000:02:00.0: firmware ver WLAN.RM.4.4.1-00140-QCARMSWPZ-1
>> api 6 features wowlan,ignore-otp,mfp crc32 29eb8ca1
>>
>> ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 4ac0889b
>>
>> ath10k_pci 0000:02:00.0: htt-ver 3.60 wmi-op 4 htt-op 3 cal otp
>> max-sta 32 raw 0 hwcrypto 1
>>
>>>> If it is changed to dev_warn_once, there will be no chance to find the
>>>> other wrong values reported by firmware, and it indicates
>>>> a wrong value to mac80211/cfg80211 and leads "iw wlan0 station dump"
>>>> to get a wrong bitrate.
>>>
>>
>> Agreed.
>>
>>> I agree, we should keep this warning. If the firmware still keeps
>>> sending invalid rates we should add a specific check to ignore the known
>>> invalid values, but not all of them.
>>>
>>
>> Would it be helpful to adjust the default rate limits and set them to
>> a higher value instead? It might be difficult to account for all possible
>> invalid values.
>>
>> Something like an ath10k_warn_ratelimited() to adjust the
>> DEFAULT_RATELIMIT_INTERVAL and DEFAULT_RATELIMIT_BURST using
>> DEFINE_RATELIMIT_STATE.
>>
>> Let me know if you like this idea. I can send in a patch to do this.
>> I will hang on to this firmware version for a little bit longer, so
>> we have a test case. :)
> 
> I would rather first try to fix the root cause, which is the firmware
> sending invalid rates. Wen, you mentioned there's a fix in firmware. Do
> you know which firmware version (and branch) has the fix?
> 

Picking this back up. Wen, which firmware version has this fix? I can
test this on my system and get rid of the noisy messages. :)

thanks,
-- Shuah