wifi: ath12k: Fix Tx Completion Ring(WBM2SW) Setup Failure

Message ID 20240409190922.4180631-1-quic_nithp@quicinc.com
State Superseded

Commit Message

Nithyanantham Paramasivam April 9, 2024, 7:09 p.m. UTC
We observe intermittent ping failures from the access point (AP) to
the station (STA) in any configured mode (AP-STA or Mesh). Specifically,
the transmission completion status is not received at Tx completion
ring id-4 (WBM2SW ring4) for packets transmitted via TCL DATA
ring id-3. This prevents Tx descriptors from being freed and leads
to buffer exhaustion.

Currently, during initialization of the WBM2SW ring, we directly
map the ring number to the ring mask to obtain the ring mask
group index. This approach causes setup failures for WBM2SW
ring-4. Similarly, at runtime, when a transmission completion
status arrives, the ring number is validated by mapping it against
the interrupted ring mask. This validation fails, which prevents
entry into the completion handler
(ath12k_dp_tx_completion_handler()).
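
The failure described above can be reproduced in a small userspace
sketch. The mask values, NUM_GROUPS and the helper names here are
assumptions chosen for illustration (the lookup is loosely modeled on
ath12k_dp_srng_find_ring_in_mask()); they are not the driver's actual
tables.

```c
#include <assert.h>

#define BIT(n) (1u << (n))
#define NUM_GROUPS 4

/* Hypothetical Tx ring mask: interrupt group g services bit g only. */
static const unsigned char tx_mask[NUM_GROUPS] = { 0x1, 0x2, 0x4, 0x8 };

/* Return the group whose mask has bit ring_num set, or -1 if none. */
static int find_ring_in_mask(int ring_num, const unsigned char *grp_mask)
{
	for (int grp = 0; grp < NUM_GROUPS; grp++) {
		if (grp_mask[grp] & BIT(ring_num))
			return grp;
	}
	return -1;
}

/* Direct use of the WBM2SW ring number as the bit position. */
static void check_direct_mapping(void)
{
	/* WBM2SW ring 2: bit 2 is present in group 2, lookup succeeds. */
	assert(find_ring_in_mask(2, tx_mask) == 2);
	/* WBM2SW ring 4: no group carries bit 4, so setup fails. */
	assert(find_ring_in_mask(4, tx_mask) == -1);
	/* The runtime check has the same problem for group 3. */
	assert((tx_mask[3] & BIT(4)) == 0);
}
```

Under these assumed masks, any WBM2SW ring whose number is not also a
bit position in the Tx mask is never matched, at setup or at runtime.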

The existing design assumed that the ring numbers would always be
sequential and could be directly mapped to the ring mask. However,
this assumption does not hold for WBM2SW ring-4. Therefore,
modify the design so that the ring ID, rather than the ring number,
is mapped to the ring mask.

According to this design:
1. During initialization of the WBM2SW ring, mapping the ring ID
to the ring mask yields the correct ring mask group ID.
2. During runtime, validating the interrupted ring mask group ID
within the transmission completion group is sufficient. This
approach allows the ring ID to be derived from the interrupted ring
mask and enables entry into the completion handler.
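
The two steps above can be sketched in userspace C as follows. This is
only an illustration under stated assumptions: the table contents
(apart from ring ID 3 using WBM2SW ring 4, which the description
states), the mask values, and all helper names are hypothetical, and
fls_u8() is a stand-in for the kernel's fls().

```c
#include <assert.h>

#define BIT(n) (1u << (n))
#define MAX_TX_RING 4
#define NUM_GROUPS 4

/* Hypothetical stand-in for the TCL-to-WBM map used in the patch. */
struct tcl_to_wbm_map {
	int tcl_ring_num;
	int wbm_ring_num;
};

/* Index = ring ID. Only the last entry is taken from the description. */
static const struct tcl_to_wbm_map ring_map[MAX_TX_RING] = {
	{ 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 4 },
};

/* Assumed Tx ring mask: group g services ring ID g. */
static const unsigned char tx_mask[NUM_GROUPS] = { 0x1, 0x2, 0x4, 0x8 };

/* Step 1: translate the WBM2SW ring number to a ring ID, then look up
 * the group. Unlike the patch, this sketch returns -1 when no map
 * entry matches instead of keeping the original ring number. */
static int tx_group_for_wbm_ring(int wbm_ring_num)
{
	int ring_id = -1;

	for (int i = 0; i < MAX_TX_RING; i++) {
		if (ring_map[i].wbm_ring_num == wbm_ring_num) {
			ring_id = i;
			break;
		}
	}
	if (ring_id < 0)
		return -1;

	for (int grp = 0; grp < NUM_GROUPS; grp++) {
		if (tx_mask[grp] & BIT(ring_id))
			return grp;
	}
	return -1;
}

/* 1-based position of the highest set bit, 0 if no bit is set. */
static int fls_u8(unsigned char v)
{
	int pos = 0;

	while (v) {
		pos++;
		v >>= 1;
	}
	return pos;
}

/* Step 2: derive the ring ID from the interrupted group's mask. */
static int ring_id_for_group(int grp_id)
{
	unsigned char mask = tx_mask[grp_id];

	return mask ? fls_u8(mask) - 1 : -1;
}

static void check_ring_id_mapping(void)
{
	assert(tx_group_for_wbm_ring(4) == 3);	/* ring ID 3, group 3 */
	assert(ring_id_for_group(3) == 3);
}
```

Note that deriving the ring ID with fls() selects only the highest set
bit, so this sketch assumes each Tx group mask carries a single ring.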

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Nithyanantham Paramasivam <quic_nithp@quicinc.com>
---
 drivers/net/wireless/ath/ath12k/dp.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)


base-commit: dc410c4accd2fe64479a1f4ebc47ec9cd3928f4a

Comments

Jeff Johnson April 18, 2024, 5:52 p.m. UTC | #1
On 4/10/2024 1:25 PM, Jeff Johnson wrote:
> On 4/9/2024 12:09 PM, Nithyanantham Paramasivam wrote:
>> [...]
> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
> 
> 
Please remove my Acked-by.

I've bisected a kernel crash on my laptop to this patch
Jeff Johnson May 9, 2024, 2:39 p.m. UTC | #2
On 4/18/2024 10:52 AM, Jeff Johnson wrote:
> On 4/10/2024 1:25 PM, Jeff Johnson wrote:
>> On 4/9/2024 12:09 PM, Nithyanantham Paramasivam wrote:
>>> [...]
>> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
>>
>>
> Please remove my Acked-by.
> 
> I've bisected a kernel crash on my laptop to this patch

While debugging my crash I've determined the issue isn't with this patch, so
restore my:

Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>

/jeff
Jeff Johnson May 9, 2024, 4:01 p.m. UTC | #3
On 5/9/2024 7:39 AM, Jeff Johnson wrote:
> On 4/18/2024 10:52 AM, Jeff Johnson wrote:
>> On 4/10/2024 1:25 PM, Jeff Johnson wrote:
>>> On 4/9/2024 12:09 PM, Nithyanantham Paramasivam wrote:
>>>> [...]
>>> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
>>>
>>>
>> Please remove my Acked-by.
>>
>> I've bisected a kernel crash on my laptop to this patch
> 
> While debugging my crash I've determined the issue isn't with this patch, so
> restore my:
> 
> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
> 
> /jeff
> 
OK, I got confused between testing public patches & internal patches. It turns
out this version actually does have an issue, and there is a respin coming.

So reinstate the NAK on this version.
Patch

diff --git a/drivers/net/wireless/ath/ath12k/dp.c b/drivers/net/wireless/ath/ath12k/dp.c
index a0aa8c571867..86d80fd5e2c5 100644
--- a/drivers/net/wireless/ath/ath12k/dp.c
+++ b/drivers/net/wireless/ath/ath12k/dp.c
@@ -127,7 +127,9 @@  static int ath12k_dp_srng_find_ring_in_mask(int ring_num, const u8 *grp_mask)
 static int ath12k_dp_srng_calculate_msi_group(struct ath12k_base *ab,
 					      enum hal_ring_type type, int ring_num)
 {
+	const struct ath12k_hal_tcl_to_wbm_rbm_map *map;
 	const u8 *grp_mask;
+	int i;
 
 	switch (type) {
 	case HAL_WBM2SW_RELEASE:
@@ -135,6 +137,14 @@  static int ath12k_dp_srng_calculate_msi_group(struct ath12k_base *ab,
 			grp_mask = &ab->hw_params->ring_mask->rx_wbm_rel[0];
 			ring_num = 0;
 		} else {
+			map = ab->hw_params->hal_ops->tcl_to_wbm_rbm_map;
+			for (i = 0; i < ab->hw_params->max_tx_ring; i++) {
+				if (ring_num == map[i].wbm_ring_num) {
+					ring_num = i;
+					break;
+				}
+			}
+
 			grp_mask = &ab->hw_params->ring_mask->tx[0];
 		}
 		break;
@@ -876,11 +886,9 @@  int ath12k_dp_service_srng(struct ath12k_base *ab,
 	enum dp_monitor_mode monitor_mode;
 	u8 ring_mask;
 
-	while (i < ab->hw_params->max_tx_ring) {
-		if (ab->hw_params->ring_mask->tx[grp_id] &
-			BIT(ab->hw_params->hal_ops->tcl_to_wbm_rbm_map[i].wbm_ring_num))
-			ath12k_dp_tx_completion_handler(ab, i);
-		i++;
+	if (ab->hw_params->ring_mask->tx[grp_id]) {
+		i = fls(ab->hw_params->ring_mask->tx[grp_id]) - 1;
+		ath12k_dp_tx_completion_handler(ab, i);
 	}
 
 	if (ab->hw_params->ring_mask->rx_err[grp_id]) {