diff mbox series

[v2] wifi: mac80211: fix cfg80211_bss always hold when assoc response fail for MLO connection

Message ID 20230822100409.1242-1-quic_wgong@quicinc.com
State New
Headers show
Series [v2] wifi: mac80211: fix cfg80211_bss always hold when assoc response fail for MLO connection | expand

Commit Message

Wen Gong Aug. 22, 2023, 10:04 a.m. UTC
When connect to MLO AP with more than one link, and the assoc response of
AP is not success, then cfg80211_unhold_bss() is not called for all the
links' cfg80211_bss except the primary link which means the link used by
the latest successful association request. Thus the hold value of the
cfg80211_bss is not reset to 0 after the assoc fail, and then the
__cfg80211_unlink_bss() will not be called for the cfg80211_bss by
__cfg80211_bss_expire().

Then the AP always looks exist even the AP is shutdown or reconfigured
to another type, then it will lead error while connecting it again.

The detail info are as below.

When connect with muti-links AP, cfg80211_hold_bss() is called by
cfg80211_mlme_assoc() for each cfg80211_bss of all the links. When
assoc response from AP is not success(such as status_code==1), the
ieee80211_link_data of non-primary link(sdata->link[link_id]) is NULL
because ieee80211_assoc_success()->ieee80211_vif_update_links() is
not called for the links.

Then struct cfg80211_rx_assoc_resp resp in cfg80211_rx_assoc_resp() and
struct cfg80211_connect_resp_params cr in __cfg80211_connect_result()
will only have the data of the primary link, and finally function
cfg80211_connect_result_release_bsses() only call cfg80211_unhold_bss()
for the primary link. Then cfg80211_bss of the other links will never free
because its hold is always > 0 now.

Hence assign value for the bss and status from assoc_data since it is
valid for this case. Also assign value of addr from assoc_data when the
link is NULL because the addrs of assoc_data and link both represent the
local link addr and they are same value for success connection.

It is 100% ratio to reproduce the issue with this change.
Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp().

Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link")
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
---
v2: change logic per johannes

 include/net/cfg80211.h |  2 +-
 net/mac80211/mlme.c    | 12 +++++++-----
 2 files changed, 8 insertions(+), 6 deletions(-)


base-commit: 3f257461ab0ab19806bae2bfde4c3cd88dbf050e

Comments

Johannes Berg Aug. 22, 2023, 11:16 a.m. UTC | #1
On Tue, 2023-08-22 at 06:04 -0400, Wen Gong wrote:
> 
> It is 100% ratio to reproduce the issue with this change.
> Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp().

Don't think that adds any value, but I guess I can always remove it.

> +		/* use the addr of assoc_data link which is set in ieee80211_mgd_assoc() */

not sure that's so useful? but anyway

The reason I'm even writing this message is that you didn't think this
change through:

> -		const u8 *addr;
> +		u8 addr[ETH_ALEN] __aligned(2);

That has some other consequences you need to adapt to, in particular

                /* need to have local link addresses for MLO connections */

                WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);

makes no sense anymore. Not sure if that's the only one.


johannes
Wen Gong Aug. 23, 2023, 2:28 a.m. UTC | #2
On 8/22/2023 7:16 PM, Johannes Berg wrote:
> On Tue, 2023-08-22 at 06:04 -0400, Wen Gong wrote:
>> It is 100% ratio to reproduce the issue with this change.
>> Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp().
> Don't think that adds any value, but I guess I can always remove it.

change the "mgmt->u.assoc_resp.status_code = 1" is only a way to 
reproduce the bss hold

issue. you can remove it when upstream this patch.😁

>
>> +		/* use the addr of assoc_data link which is set in ieee80211_mgd_assoc() */
> not sure that's so useful? but anyway
>
> The reason I'm even writing this message is that you didn't think this
> change through:
Yes, you can remove the comment when upstream this patch.
>> -		const u8 *addr;
>> +		u8 addr[ETH_ALEN] __aligned(2);
> That has some other consequences you need to adapt to, in particular
>
>                  /* need to have local link addresses for MLO connections */
>
>                  WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
>
> makes no sense anymore. Not sure if that's the only one.

After this patch, the cr.links[link_id].addr will be a valid local link 
address from

struct cfg80211_rx_assoc_resp, so I think it is not needed remove now.


Another is here in __cfg80211_connect_result():

         for_each_valid_link(cr, link) {
             if (WARN_ON(!cr->links[link].addr))
                 goto out;

>
>
> johannes
Johannes Berg Aug. 23, 2023, 6:34 a.m. UTC | #3
On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote:
> > 
> >                  /* need to have local link addresses for MLO connections */
> > 
> >                  WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
> > 
> > makes no sense anymore. Not sure if that's the only one.
> 
> After this patch, the cr.links[link_id].addr will be a valid local link 
> address from
> 
> struct cfg80211_rx_assoc_resp, so I think it is not needed remove now.

You don't understand.

The issue is that it's set the line above.

>                 cr.links[link_id].addr = data->links[link_id].addr;
>                 /* need to have local link addresses for MLO connections */
>                 WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);

But look at that! What values can cr.links[link_id].addr get? Note how
it's a pointer - assigned from an array.


johannes
Wen Gong Aug. 24, 2023, 2:26 a.m. UTC | #4
On 8/23/2023 2:34 PM, Johannes Berg wrote:
> On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote:
>>>                   /* need to have local link addresses for MLO connections */
>>>
>>>                   WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
>>>
>>> makes no sense anymore. Not sure if that's the only one.
>> After this patch, the cr.links[link_id].addr will be a valid local link
>> address from
>>
>> struct cfg80211_rx_assoc_resp, so I think it is not needed remove now.
> You don't understand.
>
> The issue is that it's set the line above.
>
>>                  cr.links[link_id].addr = data->links[link_id].addr;
>>                  /* need to have local link addresses for MLO connections */
>>                  WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
> But look at that! What values can cr.links[link_id].addr get? Note how
> it's a pointer - assigned from an array.

Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because 
it is pointer

to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp.

So maybe we can choose one of the 2 things to do:

1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);"

2. change like this:

WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr));

What about your advise?

>
> johannes
Johannes Berg Aug. 24, 2023, 7:03 a.m. UTC | #5
On Thu, 2023-08-24 at 10:26 +0800, Wen Gong wrote:
> On 8/23/2023 2:34 PM, Johannes Berg wrote:
> > On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote:
> > > >                   /* need to have local link addresses for MLO connections */
> > > > 
> > > >                   WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
> > > > 
> > > > makes no sense anymore. Not sure if that's the only one.
> > > After this patch, the cr.links[link_id].addr will be a valid local link
> > > address from
> > > 
> > > struct cfg80211_rx_assoc_resp, so I think it is not needed remove now.
> > You don't understand.
> > 
> > The issue is that it's set the line above.
> > 
> > >                  cr.links[link_id].addr = data->links[link_id].addr;
> > >                  /* need to have local link addresses for MLO connections */
> > >                  WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
> > But look at that! What values can cr.links[link_id].addr get? Note how
> > it's a pointer - assigned from an array.
> 
> Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because 
> it is pointer
> 
> to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp.

Yep.

> So maybe we can choose one of the 2 things to do:
> 
> 1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);"
> 
> 2. change like this:
> 
> WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr));
> 

I think we should check that it's valid, because if you don't fill it,
it'll (hopefully) point to zeroed data somewhere.

johannes
Wen Gong Aug. 24, 2023, 7:05 a.m. UTC | #6
On 8/24/2023 3:03 PM, Johannes Berg wrote:
> On Thu, 2023-08-24 at 10:26 +0800, Wen Gong wrote:
>> On 8/23/2023 2:34 PM, Johannes Berg wrote:
>>> On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote:
>>>>>                    /* need to have local link addresses for MLO connections */
>>>>>
>>>>>                    WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
>>>>>
>>>>> makes no sense anymore. Not sure if that's the only one.
>>>> After this patch, the cr.links[link_id].addr will be a valid local link
>>>> address from
>>>>
>>>> struct cfg80211_rx_assoc_resp, so I think it is not needed remove now.
>>> You don't understand.
>>>
>>> The issue is that it's set the line above.
>>>
>>>>                   cr.links[link_id].addr = data->links[link_id].addr;
>>>>                   /* need to have local link addresses for MLO connections */
>>>>                   WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);
>>> But look at that! What values can cr.links[link_id].addr get? Note how
>>> it's a pointer - assigned from an array.
>> Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because
>> it is pointer
>>
>> to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp.
> Yep.
>
>> So maybe we can choose one of the 2 things to do:
>>
>> 1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);"
>>
>> 2. change like this:
>>
>> WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr));
>>
> I think we should check that it's valid, because if you don't fill it,
> it'll (hopefully) point to zeroed data somewhere.
>
> johannes
OK. So I will change it in next version.
diff mbox series

Patch

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 7c7d03aa9d06..5a53cc0edb30 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -7229,7 +7229,7 @@  struct cfg80211_rx_assoc_resp {
 	int uapsd_queues;
 	const u8 *ap_mld_addr;
 	struct {
-		const u8 *addr;
+		u8 addr[ETH_ALEN] __aligned(2);
 		struct cfg80211_bss *bss;
 		u16 status;
 	} links[IEEE80211_MLD_MAX_NUM_LINKS];
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index f93eb38ae0b8..f36359269377 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -5429,17 +5429,19 @@  static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata,
 	for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) {
 		struct ieee80211_link_data *link;
 
-		link = sdata_dereference(sdata->link[link_id], sdata);
-		if (!link)
-			continue;
-
 		if (!assoc_data->link[link_id].bss)
 			continue;
 
 		resp.links[link_id].bss = assoc_data->link[link_id].bss;
-		resp.links[link_id].addr = link->conf->addr;
+		/* use the addr of assoc_data link which is set in ieee80211_mgd_assoc() */
+		ether_addr_copy(resp.links[link_id].addr,
+				assoc_data->link[link_id].addr);
 		resp.links[link_id].status = assoc_data->link[link_id].status;
 
+		link = sdata_dereference(sdata->link[link_id], sdata);
+		if (!link)
+			continue;
+
 		/* get uapsd queues configuration - same for all links */
 		resp.uapsd_queues = 0;
 		for (ac = 0; ac < IEEE80211_NUM_ACS; ac++)