diff mbox series

[4/5] ath10k: detect conf_mutex held ath10k_drain_tx() calls

Message ID a980abfb143f5240375f3f1046f0f26971c695e6.1612915444.git.skhan@linuxfoundation.org
State Superseded
Headers show
Series [1/5] ath10k: fix conf_mutex lock assert in ath10k_debug_fw_stats_request() | expand

Commit Message

Shuah Khan Feb. 10, 2021, 12:42 a.m. UTC
ath10k_drain_tx() must not be called with conf_mutex held as workers can
use that also. Add check to detect conf_mutex held calls.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
---
 drivers/net/wireless/ath/ath10k/mac.c | 1 +
 1 file changed, 1 insertion(+)

Comments

kernel test robot Feb. 10, 2021, 8:03 a.m. UTC | #1
Hi Shuah,

I love your patch! Yet something to improve:

[auto build test ERROR on wireless-drivers-next/master]
[also build test ERROR on wireless-drivers/master v5.11-rc7 next-20210125]
[cannot apply to ath6kl/ath-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
        git checkout 5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "lockdep_is_held" [drivers/net/wireless/ath/ath10k/ath10k_core.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Kalle Valo Feb. 10, 2021, 8:25 a.m. UTC | #2
Shuah Khan <skhan@linuxfoundation.org> writes:

> ath10k_drain_tx() must not be called with conf_mutex held as workers can
> use that also. Add check to detect conf_mutex held calls.
>
> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

The commit log does not answer to "Why?". How did you find this? What
actual problem are you trying to solve?

> ---
>  drivers/net/wireless/ath/ath10k/mac.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
> index 53f92945006f..3545ce7dce0a 100644
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -4566,6 +4566,7 @@ static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
>  /* Must not be called with conf_mutex held as workers can use that also. */
>  void ath10k_drain_tx(struct ath10k *ar)
>  {
> +	WARN_ON(lockdep_is_held(&ar->conf_mutex));

Empty line after WARN_ON().

Shouldn't this check debug_locks similarly lockdep_assert_held() does?

#define lockdep_assert_held(l)	do {				\
		WARN_ON(debug_locks && !lockdep_is_held(l));	\
	} while (0)

And I suspect you need #ifdef CONFIG_LOCKDEP which should fix the kbuild
bot error.

But honestly I would prefer to have lockdep_assert_not_held() in
include/linux/lockdep.h, much cleaner that way. Also
i915_gem_object_lookup_rcu() could then use the same macro.
kernel test robot Feb. 11, 2021, 12:13 a.m. UTC | #3
Hi Shuah,

I love your patch! Yet something to improve:

[auto build test ERROR on wireless-drivers-next/master]
[also build test ERROR on wireless-drivers/master v5.11-rc7 next-20210125]
[cannot apply to ath6kl/ath-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: m68k-allyesconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Shuah-Khan/ath10k-fixes-for-warns/20210210-085259
        git checkout 5a96c0c9dfec2bd59e680b7ec8ade9378b654c4c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   m68k-linux-ld: drivers/net/wireless/ath/ath10k/mac.o: in function `ath10k_drain_tx':
>> (.text+0xc8b2): undefined reference to `lockdep_is_held'


---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Kalle Valo Feb. 11, 2021, 11:20 a.m. UTC | #4
Shuah Khan <skhan@linuxfoundation.org> writes:

> On 2/10/21 1:25 AM, Kalle Valo wrote:

>> Shuah Khan <skhan@linuxfoundation.org> writes:

>>

>>> ath10k_drain_tx() must not be called with conf_mutex held as workers can

>>> use that also. Add check to detect conf_mutex held calls.

>>>

>>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

>>

>> The commit log does not answer to "Why?". How did you find this? What

>> actual problem are you trying to solve?

>>

>

> I came across the comment block above the ath10k_drain_tx() as I was

> reviewing at conf_mutex holds while I was debugging the conf_mutex

> lock assert in ath10k_debug_fw_stats_request().

>

> My reasoning is that having this will help detect incorrect usages

> of ath10k_drain_tx() while holding conf_mutex which could lead to

> locking problems when async worker routines try to call this routine.


Ok, makes sense. I prefer having this background info in the commit log,
for example "found by code review" or something like that. Or just copy
what you wrote above :)

>>> --- a/drivers/net/wireless/ath/ath10k/mac.c

>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c

>>> @@ -4566,6 +4566,7 @@ static void

>>> ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,

>>>   /* Must not be called with conf_mutex held as workers can use that also. */

>>>   void ath10k_drain_tx(struct ath10k *ar)

>>>   {

>>> +	WARN_ON(lockdep_is_held(&ar->conf_mutex));

>>

>> Empty line after WARN_ON().

>>

>

> Will do.

>

>> Shouldn't this check debug_locks similarly lockdep_assert_held() does?

>>

>> #define lockdep_assert_held(l)	do {				\

>> 		WARN_ON(debug_locks && !lockdep_is_held(l));	\

>> 	} while (0)

>>

>> And I suspect you need #ifdef CONFIG_LOCKDEP which should fix the kbuild

>> bot error.

>>

>

> Yes.

>

>> But honestly I would prefer to have lockdep_assert_not_held() in

>> include/linux/lockdep.h, much cleaner that way. Also

>> i915_gem_object_lookup_rcu() could then use the same macro.

>>

>

> Right. This is the right way to go. That was first instinct and

> decided to have the discussion evolve in that direction. Now that

> it has, I will combine this change with

> include/linux/lockdep.h and add lockdep_assert_not_held()

>

> I think we might have other places in the kernel that could use

> lockdep_assert_not_held() in addition to i915_gem_object_lookup_rcu()


Great, thank you. The only problem is that lockdep.h changes have to go
via some other tree, I just don't know which :) I think it would be
easiest if also the ath10k patch goes via that other tree, I can ack the
ath10k changes.

Another option is that I'll apply the ath10k patch after the lockdep.h
change has trickled down to my tree, but that usually happens only after
the merge window and means weeks of waiting. Either is fine for me.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Shuah Khan Feb. 11, 2021, 8:53 p.m. UTC | #5
On 2/11/21 4:20 AM, Kalle Valo wrote:
> Shuah Khan <skhan@linuxfoundation.org> writes:

> 

>> On 2/10/21 1:25 AM, Kalle Valo wrote:

>>> Shuah Khan <skhan@linuxfoundation.org> writes:

>>>

>>>> ath10k_drain_tx() must not be called with conf_mutex held as workers can

>>>> use that also. Add check to detect conf_mutex held calls.

>>>>

>>>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

>>>

>>> The commit log does not answer to "Why?". How did you find this? What

>>> actual problem are you trying to solve?

>>>

>>

>> I came across the comment block above the ath10k_drain_tx() as I was

>> reviewing at conf_mutex holds while I was debugging the conf_mutex

>> lock assert in ath10k_debug_fw_stats_request().

>>

>> My reasoning is that having this will help detect incorrect usages

>> of ath10k_drain_tx() while holding conf_mutex which could lead to

>> locking problems when async worker routines try to call this routine.

> 

> Ok, makes sense. I prefer having this background info in the commit log,

> for example "found by code review" or something like that. Or just copy

> what you wrote above :)

> 


Thanks. I will do that.

>>>> --- a/drivers/net/wireless/ath/ath10k/mac.c

>>>> +++ b/drivers/net/wireless/ath/ath10k/mac.c

>>>> @@ -4566,6 +4566,7 @@ static void

>>>> ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,

>>>>    /* Must not be called with conf_mutex held as workers can use that also. */

>>>>    void ath10k_drain_tx(struct ath10k *ar)

>>>>    {

>>>> +	WARN_ON(lockdep_is_held(&ar->conf_mutex));

>>>

>>> Empty line after WARN_ON().

>>>

>>

>> Will do.

>>

>>> Shouldn't this check debug_locks similarly lockdep_assert_held() does?

>>>

>>> #define lockdep_assert_held(l)	do {				\

>>> 		WARN_ON(debug_locks && !lockdep_is_held(l));	\

>>> 	} while (0)

>>>

>>> And I suspect you need #ifdef CONFIG_LOCKDEP which should fix the kbuild

>>> bot error.

>>>

>>

>> Yes.

>>

>>> But honestly I would prefer to have lockdep_assert_not_held() in

>>> include/linux/lockdep.h, much cleaner that way. Also

>>> i915_gem_object_lookup_rcu() could then use the same macro.

>>>

>>

>> Right. This is the right way to go. That was first instinct and

>> decided to have the discussion evolve in that direction. Now that

>> it has, I will combine this change with

>> include/linux/lockdep.h and add lockdep_assert_not_held()

>>

>> I think we might have other places in the kernel that could use

>> lockdep_assert_not_held() in addition to i915_gem_object_lookup_rcu()

> 


I looked at i915_gem_object_lookup_rcu(). The following can be replaced
by lockdep_assert_held().

#ifdef CONFIG_LOCKDEP
         WARN_ON(debug_locks && !lock_is_held(&rcu_lock_map));
#endif

> Great, thank you. The only problem is that lockdep.h changes have to go

> via some other tree, I just don't know which :) I think it would be

> easiest if also the ath10k patch goes via that other tree, I can ack the

> ath10k changes.

> 

> Another option is that I'll apply the ath10k patch after the lockdep.h

> change has trickled down to my tree, but that usually happens only after

> the merge window and means weeks of waiting. Either is fine for me.

> 


I will send the include/linux/lockdep.h and ath10k patch together and
we will take it from there.

thanks,
-- Shuah
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 53f92945006f..3545ce7dce0a 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4566,6 +4566,7 @@  static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
 /* Must not be called with conf_mutex held as workers can use that also. */
 void ath10k_drain_tx(struct ath10k *ar)
 {
+	WARN_ON(lockdep_is_held(&ar->conf_mutex));
 	/* make sure rcu-protected mac80211 tx path itself is drained */
 	synchronize_net();