diff mbox series

wifi: ath11k: Fix memory reuse logic

Message ID 20250418120951.94021-1-usama.anjum@collabora.com
State Superseded
Headers show
Series wifi: ath11k: Fix memory reuse logic | expand

Commit Message

Muhammad Usama Anjum April 18, 2025, 12:09 p.m. UTC
Firmware requests 2 segments at first. 1st segment is of 6799360 whose
allocation fails and we return success to firmware. Then firmware asks
for 22 smaller segments. Those get allocated. At suspend/hibernation
time, these segments aren't freed as they are reused by firmware.

After resume the firmware asks for 2 segments again with first segment
of 6799360 and with same vaddr of the first smaller segment which we had
allocated. Hence vaddr isn't NULL and we compare the type and size if it
can be reused. Unfornately, we detect that we cannot reuse it and this
first smaller segment is freed. Then we continue to allocate 6799360 size
memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
which frees the second smaller segment as well. Later success is returned
to firmware which asks for 22 smaller segments again. But as we had freed
2 segments already, we'll allocate the first 2 new segments again and
reuse the remaining 20.

This patch is correctiong the skip logic when vaddr is set, but size/type
don't match. In this case, we should use the same skip and success logic
as used when dma_alloc_coherent fails without freeing the memory area.

We had got reports that memory allocation in this function failed at
resume which made us debug why the reuse logic is wrong. Those failures
weren't because of the bigger chunk allocation failure as they are
skipped. Rather these failures were because of smaller chunk allocation
failures. This patch fixes freeing and allocation of 2 smaller chunks.

Tested-on: QCNFA765 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6

Fixes: 5962f370ce41 ("ath11k: Reuse the available memory after firmware reload")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
 drivers/net/wireless/ath/ath11k/qmi.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Baochen Qiang April 22, 2025, 2:15 a.m. UTC | #1
On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
> allocation fails and we return success to firmware. Then firmware asks

Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
the big segment allocation fails in case DMA remapping is not working, usually due to
IOMMU not present or any necessary kernel config not enabled.

> for 22 smaller segments. Those get allocated. At suspend/hibernation
> time, these segments aren't freed as they are reused by firmware.
> 
> After resume the firmware asks for 2 segments again with first segment
> of 6799360 and with same vaddr of the first smaller segment which we had

Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
at all.

> allocated. Hence vaddr isn't NULL and we compare the type and size if it
> can be reused. Unfornately, we detect that we cannot reuse it and this

s/Unfornately/Unfortunately/

> first smaller segment is freed. Then we continue to allocate 6799360 size
> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
> which frees the second smaller segment as well. Later success is returned
> to firmware which asks for 22 smaller segments again. But as we had freed
> 2 segments already, we'll allocate the first 2 new segments again and
> reuse the remaining 20.
> 
> This patch is correctiong the skip logic when vaddr is set, but size/type

s/correctiong/correcting/

> don't match. In this case, we should use the same skip and success logic
> as used when dma_alloc_coherent fails without freeing the memory area.
> 
> We had got reports that memory allocation in this function failed at

any public link to the report?

> resume which made us debug why the reuse logic is wrong. Those failures
> weren't because of the bigger chunk allocation failure as they are
> skipped. Rather these failures were because of smaller chunk allocation
> failures. This patch fixes freeing and allocation of 2 smaller chunks.

any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
too fragmented?

> 
> Tested-on: QCNFA765 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6

QCNFA765 is not an official chip name. please use WCN6855.

> 
> Fixes: 5962f370ce41 ("ath11k: Reuse the available memory after firmware reload")

I don't think a Fixes tag apply here. As IMO this is not really an issue, it is just not
doing well.

> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
> ---
>  drivers/net/wireless/ath/ath11k/qmi.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
> index 47b9d4126d3a9..3c26f4dcf5d29 100644
> --- a/drivers/net/wireless/ath/ath11k/qmi.c
> +++ b/drivers/net/wireless/ath/ath11k/qmi.c
> @@ -1990,8 +1990,16 @@ static int ath11k_qmi_alloc_target_mem_chunk(struct ath11k_base *ab)
>  		 */
>  		if (chunk->vaddr) {
>  			if (chunk->prev_type == chunk->type &&
> -			    chunk->prev_size == chunk->size)
> +			    chunk->prev_size == chunk->size) {
>  				continue;
> +			} else if (ab->qmi.mem_seg_count <= ATH11K_QMI_FW_MEM_REQ_SEGMENT_CNT) {
> +				ath11k_dbg(ab, ATH11K_DBG_QMI,
> +					   "size/type mismatch (current %d %u) (prev %d %u), try later with small size\n",
> +					    chunk->size, chunk->type,
> +					    chunk->prev_size, chunk->prev_type);
> +				ab->qmi.target_mem_delayed = true;
> +				return 0;
> +			}
>  
>  			/* cannot reuse the existing chunk */
>  			dma_free_coherent(ab->dev, chunk->prev_size,

actual code change LGTM.
Muhammad Usama Anjum April 22, 2025, 7:46 a.m. UTC | #2
Hi,

Thank you for excellent review.

On 4/22/25 7:15 AM, Baochen Qiang wrote:
> 
> 
> On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
>> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
>> allocation fails and we return success to firmware. Then firmware asks
> 
> Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
> the big segment allocation fails in case DMA remapping is not working, usually due to
> IOMMU not present or any necessary kernel config not enabled.
IOMMU is turned off. I'll make description better.

> 
>> for 22 smaller segments. Those get allocated. At suspend/hibernation
>> time, these segments aren't freed as they are reused by firmware.
>>
>> After resume the firmware asks for 2 segments again with first segment
>> of 6799360 and with same vaddr of the first smaller segment which we had
> 
> Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
> at all.
So we get request to allocate memory of size = 6799360 and vaddr =
0xABC). We fail it. Then we get request to allocate memory of size =
500000 and vaddr is same 0xABC which gets allocated successfully.

When we resume, firmware asks again for 6799360 with 0xABC vaddr even
though we had allocated memory of 500000 size at 0xABC. I'm referring to
this vaddr that its same.

> 
>> allocated. Hence vaddr isn't NULL and we compare the type and size if it
>> can be reused. Unfornately, we detect that we cannot reuse it and this
> 
> s/Unfornately/Unfortunately/
> 
>> first smaller segment is freed. Then we continue to allocate 6799360 size
>> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
>> which frees the second smaller segment as well. Later success is returned
>> to firmware which asks for 22 smaller segments again. But as we had freed
>> 2 segments already, we'll allocate the first 2 new segments again and
>> reuse the remaining 20.
>>
>> This patch is correctiong the skip logic when vaddr is set, but size/type
> 
> s/correctiong/correcting/
> 
>> don't match. In this case, we should use the same skip and success logic
>> as used when dma_alloc_coherent fails without freeing the memory area.
>>
>> We had got reports that memory allocation in this function failed at
> 
> any public link to the report?
There's no public report. I've attached the logs. You'll find following
error logs in it:

ath11k_pci 0000:03:00.0: failed to allocate dma memory for qmi (524288 B
type 1)
ath11k_pci 0000:03:00.0: failed to allocate qmi target memory: -22


> 
>> resume which made us debug why the reuse logic is wrong. Those failures
>> weren't because of the bigger chunk allocation failure as they are
>> skipped. Rather these failures were because of smaller chunk allocation
>> failures. This patch fixes freeing and allocation of 2 smaller chunks.
> 
> any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
> too fragmented?
Yes, the smaller chunk doesn't get allocated. I've not been able to
reproduce it on my setup. Both system memory exhaustion and
fragmentation are the suspects.

> 
>>
>> Tested-on: QCNFA765 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
> 
> QCNFA765 is not an official chip name. please use WCN6855.
Okay. I'll fix all of these mistakes.

> 
>>
>> Fixes: 5962f370ce41 ("ath11k: Reuse the available memory after firmware reload")
> 
> I don't think a Fixes tag apply here. As IMO this is not really an issue, it is just not
> doing well.
> 
>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
>> ---
>>  drivers/net/wireless/ath/ath11k/qmi.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
>> index 47b9d4126d3a9..3c26f4dcf5d29 100644
>> --- a/drivers/net/wireless/ath/ath11k/qmi.c
>> +++ b/drivers/net/wireless/ath/ath11k/qmi.c
>> @@ -1990,8 +1990,16 @@ static int ath11k_qmi_alloc_target_mem_chunk(struct ath11k_base *ab)
>>  		 */
>>  		if (chunk->vaddr) {
>>  			if (chunk->prev_type == chunk->type &&
>> -			    chunk->prev_size == chunk->size)
>> +			    chunk->prev_size == chunk->size) {
>>  				continue;
>> +			} else if (ab->qmi.mem_seg_count <= ATH11K_QMI_FW_MEM_REQ_SEGMENT_CNT) {
>> +				ath11k_dbg(ab, ATH11K_DBG_QMI,
>> +					   "size/type mismatch (current %d %u) (prev %d %u), try later with small size\n",
>> +					    chunk->size, chunk->type,
>> +					    chunk->prev_size, chunk->prev_type);
>> +				ab->qmi.target_mem_delayed = true;
>> +				return 0;
>> +			}
>>  
>>  			/* cannot reuse the existing chunk */
>>  			dma_free_coherent(ab->dev, chunk->prev_size,
> 
> actual code change LGTM.
> 
>
Baochen Qiang April 22, 2025, 8:01 a.m. UTC | #3
On 4/22/2025 3:46 PM, Muhammad Usama Anjum wrote:
> Hi,
> 
> Thank you for excellent review.
> 
> On 4/22/25 7:15 AM, Baochen Qiang wrote:
>>
>>
>> On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
>>> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
>>> allocation fails and we return success to firmware. Then firmware asks
>>
>> Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
>> the big segment allocation fails in case DMA remapping is not working, usually due to
>> IOMMU not present or any necessary kernel config not enabled.
> IOMMU is turned off. I'll make description better.
> 
>>
>>> for 22 smaller segments. Those get allocated. At suspend/hibernation
>>> time, these segments aren't freed as they are reused by firmware.
>>>
>>> After resume the firmware asks for 2 segments again with first segment
>>> of 6799360 and with same vaddr of the first smaller segment which we had
>>
>> Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
>> at all.
> So we get request to allocate memory of size = 6799360 and vaddr =
> 0xABC). We fail it. Then we get request to allocate memory of size =
> 500000 and vaddr is same 0xABC which gets allocated successfully.
> 
> When we resume, firmware asks again for 6799360 with 0xABC vaddr even
> though we had allocated memory of 500000 size at 0xABC. I'm referring to
> this vaddr that its same.

OK, get your point. But like I said, firmware doesn't case about vaddr, so it is not
asking for a 'same vaddr'.

IMO just mentioning vaddr is not NULL is sufficient.

> 
>>
>>> allocated. Hence vaddr isn't NULL and we compare the type and size if it
>>> can be reused. Unfornately, we detect that we cannot reuse it and this
>>
>> s/Unfornately/Unfortunately/
>>
>>> first smaller segment is freed. Then we continue to allocate 6799360 size
>>> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
>>> which frees the second smaller segment as well. Later success is returned
>>> to firmware which asks for 22 smaller segments again. But as we had freed
>>> 2 segments already, we'll allocate the first 2 new segments again and
>>> reuse the remaining 20.
>>>
>>> This patch is correctiong the skip logic when vaddr is set, but size/type
>>
>> s/correctiong/correcting/
>>
>>> don't match. In this case, we should use the same skip and success logic
>>> as used when dma_alloc_coherent fails without freeing the memory area.
>>>
>>> We had got reports that memory allocation in this function failed at
>>
>> any public link to the report?
> There's no public report. I've attached the logs. You'll find following
> error logs in it:
> 
> ath11k_pci 0000:03:00.0: failed to allocate dma memory for qmi (524288 B
> type 1)
> ath11k_pci 0000:03:00.0: failed to allocate qmi target memory: -22
> 
> 
>>
>>> resume which made us debug why the reuse logic is wrong. Those failures
>>> weren't because of the bigger chunk allocation failure as they are
>>> skipped. Rather these failures were because of smaller chunk allocation
>>> failures. This patch fixes freeing and allocation of 2 smaller chunks.
>>
>> any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
>> too fragmented?
> Yes, the smaller chunk doesn't get allocated. I've not been able to
> reproduce it on my setup. Both system memory exhaustion and
> fragmentation are the suspects.

so it is kernel failing to allocate the buffer, not any issue in ath12k leading to this.
Please help make this clear to avoid confusion.

> 
>>
>>>
>>> Tested-on: QCNFA765 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
>>
>> QCNFA765 is not an official chip name. please use WCN6855.
> Okay. I'll fix all of these mistakes.
> 
>>
>>>
>>> Fixes: 5962f370ce41 ("ath11k: Reuse the available memory after firmware reload")
>>
>> I don't think a Fixes tag apply here. As IMO this is not really an issue, it is just not
>> doing well.
>>
>>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
>>> ---
>>>  drivers/net/wireless/ath/ath11k/qmi.c | 10 +++++++++-
>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
>>> index 47b9d4126d3a9..3c26f4dcf5d29 100644
>>> --- a/drivers/net/wireless/ath/ath11k/qmi.c
>>> +++ b/drivers/net/wireless/ath/ath11k/qmi.c
>>> @@ -1990,8 +1990,16 @@ static int ath11k_qmi_alloc_target_mem_chunk(struct ath11k_base *ab)
>>>  		 */
>>>  		if (chunk->vaddr) {
>>>  			if (chunk->prev_type == chunk->type &&
>>> -			    chunk->prev_size == chunk->size)
>>> +			    chunk->prev_size == chunk->size) {
>>>  				continue;
>>> +			} else if (ab->qmi.mem_seg_count <= ATH11K_QMI_FW_MEM_REQ_SEGMENT_CNT) {
>>> +				ath11k_dbg(ab, ATH11K_DBG_QMI,
>>> +					   "size/type mismatch (current %d %u) (prev %d %u), try later with small size\n",
>>> +					    chunk->size, chunk->type,
>>> +					    chunk->prev_size, chunk->prev_type);
>>> +				ab->qmi.target_mem_delayed = true;
>>> +				return 0;
>>> +			}
>>>  
>>>  			/* cannot reuse the existing chunk */
>>>  			dma_free_coherent(ab->dev, chunk->prev_size,
>>
>> actual code change LGTM.
>>
>>
> 
>
Muhammad Usama Anjum April 23, 2025, 6:26 a.m. UTC | #4
On 4/22/25 1:01 PM, Baochen Qiang wrote:
> 
> 
> On 4/22/2025 3:46 PM, Muhammad Usama Anjum wrote:
>> Hi,
>>
>> Thank you for excellent review.
>>
>> On 4/22/25 7:15 AM, Baochen Qiang wrote:
>>>
>>>
>>> On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
>>>> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
>>>> allocation fails and we return success to firmware. Then firmware asks
>>>
>>> Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
>>> the big segment allocation fails in case DMA remapping is not working, usually due to
>>> IOMMU not present or any necessary kernel config not enabled.
>> IOMMU is turned off. I'll make description better.
>>
>>>
>>>> for 22 smaller segments. Those get allocated. At suspend/hibernation
>>>> time, these segments aren't freed as they are reused by firmware.
>>>>
>>>> After resume the firmware asks for 2 segments again with first segment
>>>> of 6799360 and with same vaddr of the first smaller segment which we had
>>>
>>> Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
>>> at all.
>> So we get request to allocate memory of size = 6799360 and vaddr =
>> 0xABC). We fail it. Then we get request to allocate memory of size =
>> 500000 and vaddr is same 0xABC which gets allocated successfully.
>>
>> When we resume, firmware asks again for 6799360 with 0xABC vaddr even
>> though we had allocated memory of 500000 size at 0xABC. I'm referring to
>> this vaddr that its same.
> 
> OK, get your point. But like I said, firmware doesn't case about vaddr, so it is not
> asking for a 'same vaddr'.
> 
> IMO just mentioning vaddr is not NULL is sufficient.
Okay. I'll update the description to avoid confusion.

> 
>>
>>>
>>>> allocated. Hence vaddr isn't NULL and we compare the type and size if it
>>>> can be reused. Unfornately, we detect that we cannot reuse it and this
>>>
>>> s/Unfornately/Unfortunately/
>>>
>>>> first smaller segment is freed. Then we continue to allocate 6799360 size
>>>> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
>>>> which frees the second smaller segment as well. Later success is returned
>>>> to firmware which asks for 22 smaller segments again. But as we had freed
>>>> 2 segments already, we'll allocate the first 2 new segments again and
>>>> reuse the remaining 20.
>>>>
>>>> This patch is correctiong the skip logic when vaddr is set, but size/type
>>>
>>> s/correctiong/correcting/
>>>
>>>> don't match. In this case, we should use the same skip and success logic
>>>> as used when dma_alloc_coherent fails without freeing the memory area.
>>>>
>>>> We had got reports that memory allocation in this function failed at
>>>
>>> any public link to the report?
>> There's no public report. I've attached the logs. You'll find following
>> error logs in it:
>>
>> ath11k_pci 0000:03:00.0: failed to allocate dma memory for qmi (524288 B
>> type 1)
>> ath11k_pci 0000:03:00.0: failed to allocate qmi target memory: -22
>>
>>
>>>
>>>> resume which made us debug why the reuse logic is wrong. Those failures
>>>> weren't because of the bigger chunk allocation failure as they are
>>>> skipped. Rather these failures were because of smaller chunk allocation
>>>> failures. This patch fixes freeing and allocation of 2 smaller chunks.
>>>
>>> any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
>>> too fragmented?
>> Yes, the smaller chunk doesn't get allocated. I've not been able to
>> reproduce it on my setup. Both system memory exhaustion and
>> fragmentation are the suspects.
> 
> so it is kernel failing to allocate the buffer, not any issue in ath12k leading to this.
> Please help make this clear to avoid confusion.
We caught the bug as kernel was unable to allocate memory at resume.
Later found out that with the optimization in ath11k, we shouldn't be
trying to allocate memory in the first place. That's why I've sent this
patch.

Let me update the description and send v2.
Baochen Qiang April 23, 2025, 6:47 a.m. UTC | #5
On 4/23/2025 2:26 PM, Muhammad Usama Anjum wrote:
> On 4/22/25 1:01 PM, Baochen Qiang wrote:
>>
>>
>> On 4/22/2025 3:46 PM, Muhammad Usama Anjum wrote:
>>> Hi,
>>>
>>> Thank you for excellent review.
>>>
>>> On 4/22/25 7:15 AM, Baochen Qiang wrote:
>>>>
>>>>
>>>> On 4/18/2025 8:09 PM, Muhammad Usama Anjum wrote:
>>>>> Firmware requests 2 segments at first. 1st segment is of 6799360 whose
>>>>> allocation fails and we return success to firmware. Then firmware asks
>>>>
>>>> Host won't fail in case DMA remapping is enabled. Better to rephrase to make it clear that
>>>> the big segment allocation fails in case DMA remapping is not working, usually due to
>>>> IOMMU not present or any necessary kernel config not enabled.
>>> IOMMU is turned off. I'll make description better.
>>>
>>>>
>>>>> for 22 smaller segments. Those get allocated. At suspend/hibernation
>>>>> time, these segments aren't freed as they are reused by firmware.
>>>>>
>>>>> After resume the firmware asks for 2 segments again with first segment
>>>>> of 6799360 and with same vaddr of the first smaller segment which we had
>>>>
>>>> Not follow you here. What do you mean by 'same vaddr'? firmware does not care about vaddr
>>>> at all.
>>> So we get request to allocate memory of size = 6799360 and vaddr =
>>> 0xABC). We fail it. Then we get request to allocate memory of size =
>>> 500000 and vaddr is same 0xABC which gets allocated successfully.
>>>
>>> When we resume, firmware asks again for 6799360 with 0xABC vaddr even
>>> though we had allocated memory of 500000 size at 0xABC. I'm referring to
>>> this vaddr that its same.
>>
>> OK, get your point. But like I said, firmware doesn't case about vaddr, so it is not
>> asking for a 'same vaddr'.
>>
>> IMO just mentioning vaddr is not NULL is sufficient.
> Okay. I'll update the description to avoid confusion.
> 
>>
>>>
>>>>
>>>>> allocated. Hence vaddr isn't NULL and we compare the type and size if it
>>>>> can be reused. Unfornately, we detect that we cannot reuse it and this
>>>>
>>>> s/Unfornately/Unfortunately/
>>>>
>>>>> first smaller segment is freed. Then we continue to allocate 6799360 size
>>>>> memory from dma which fails and we call ath11k_qmi_free_target_mem_chunk()
>>>>> which frees the second smaller segment as well. Later success is returned
>>>>> to firmware which asks for 22 smaller segments again. But as we had freed
>>>>> 2 segments already, we'll allocate the first 2 new segments again and
>>>>> reuse the remaining 20.
>>>>>
>>>>> This patch is correctiong the skip logic when vaddr is set, but size/type
>>>>
>>>> s/correctiong/correcting/
>>>>
>>>>> don't match. In this case, we should use the same skip and success logic
>>>>> as used when dma_alloc_coherent fails without freeing the memory area.
>>>>>
>>>>> We had got reports that memory allocation in this function failed at
>>>>
>>>> any public link to the report?
>>> There's no public report. I've attached the logs. You'll find following
>>> error logs in it:
>>>
>>> ath11k_pci 0000:03:00.0: failed to allocate dma memory for qmi (524288 B
>>> type 1)
>>> ath11k_pci 0000:03:00.0: failed to allocate qmi target memory: -22
>>>
>>>
>>>>
>>>>> resume which made us debug why the reuse logic is wrong. Those failures
>>>>> weren't because of the bigger chunk allocation failure as they are
>>>>> skipped. Rather these failures were because of smaller chunk allocation
>>>>> failures. This patch fixes freeing and allocation of 2 smaller chunks.
>>>>
>>>> any you saying kernels fail to alloc a smaller chunk? why? is system memory exhausted or
>>>> too fragmented?
>>> Yes, the smaller chunk doesn't get allocated. I've not been able to
>>> reproduce it on my setup. Both system memory exhaustion and
>>> fragmentation are the suspects.
>>
>> so it is kernel failing to allocate the buffer, not any issue in ath12k leading to this.
>> Please help make this clear to avoid confusion.
> We caught the bug as kernel was unable to allocate memory at resume.
> Later found out that with the optimization in ath11k, we shouldn't be
> trying to allocate memory in the first place. That's why I've sent this
> patch.
> 

yeah, understand that.

> Let me update the description and send v2.
> 

looking forward to it.

>
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index 47b9d4126d3a9..3c26f4dcf5d29 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -1990,8 +1990,16 @@  static int ath11k_qmi_alloc_target_mem_chunk(struct ath11k_base *ab)
 		 */
 		if (chunk->vaddr) {
 			if (chunk->prev_type == chunk->type &&
-			    chunk->prev_size == chunk->size)
+			    chunk->prev_size == chunk->size) {
 				continue;
+			} else if (ab->qmi.mem_seg_count <= ATH11K_QMI_FW_MEM_REQ_SEGMENT_CNT) {
+				ath11k_dbg(ab, ATH11K_DBG_QMI,
+					   "size/type mismatch (current %d %u) (prev %d %u), try later with small size\n",
+					    chunk->size, chunk->type,
+					    chunk->prev_size, chunk->prev_type);
+				ab->qmi.target_mem_delayed = true;
+				return 0;
+			}
 
 			/* cannot reuse the existing chunk */
 			dma_free_coherent(ab->dev, chunk->prev_size,