Message ID | 20250604144509.28374-1-johan+linaro@kernel.org |
---|---|
Series | wifi: ath12k: fix dest ring-buffer corruption |
On 6/4/2025 10:45 PM, Johan Hovold wrote:
> Add the missing memory barrier to make sure that destination ring
> descriptors are read after the head pointers to avoid using stale data
> on weakly ordered architectures like aarch64.
>
> The barrier is added to the ath12k_hal_srng_access_begin() helper for
> symmetry with follow-on fixes for source ring buffer corruption which
> will add barriers to ath12k_hal_srng_access_end().
>
> Note that this may fix the empty descriptor issue recently worked around
> by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
> monitor ring").

why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
rather than anything caused by reordering.

>
> Tested-on: WCN7850 hw2.0 WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
>
> Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
> Cc: stable@vger.kernel.org # 6.3
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>  drivers/net/wireless/ath/ath12k/ce.c  |  3 ---
>  drivers/net/wireless/ath/ath12k/hal.c | 17 ++++++++++++++---
>  2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath12k/ce.c b/drivers/net/wireless/ath/ath12k/ce.c
> index 740586fe49d1..b66d23d6b2bd 100644
> --- a/drivers/net/wireless/ath/ath12k/ce.c
> +++ b/drivers/net/wireless/ath/ath12k/ce.c
> @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
>  		goto err;
>  	}
>
> -	/* Make sure descriptor is read after the head pointer. */
> -	dma_rmb();
> -
>  	*nbytes = ath12k_hal_ce_dst_status_get_length(desc);
>
>  	*skb = pipe->dest_ring->skb[sw_index];
> diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
> index 91d5126ca149..9eea13ed5565 100644
> --- a/drivers/net/wireless/ath/ath12k/hal.c
> +++ b/drivers/net/wireless/ath/ath12k/hal.c
> @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
>
>  void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
>  {
> +	u32 hp;
> +
>  	lockdep_assert_held(&srng->lock);
>
> -	if (srng->ring_dir == HAL_SRNG_DIR_SRC)
> +	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>  		srng->u.src_ring.cached_tp =
>  			*(volatile u32 *)srng->u.src_ring.tp_addr;
> -	else
> -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +	} else {
> +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +
> +		if (hp != srng->u.dst_ring.cached_hp) {

This consumes additional CPU cycles in hot path, which is a concern to me.

Based on that, I prefer the v1 implementation.

> +			srng->u.dst_ring.cached_hp = hp;
> +			/* Make sure descriptor is read after the head
> +			 * pointer.
> +			 */
> +			dma_rmb();
> +		}
> +	}
>  }
>
>  /* Update cached ring head/tail pointers to HW. ath12k_hal_srng_access_begin()
On 6/5/2025 4:44 PM, Johan Hovold wrote:
> On Thu, Jun 05, 2025 at 04:37:13PM +0800, Baochen Qiang wrote:
>> On 6/4/2025 10:45 PM, Johan Hovold wrote:
>>> As a follow up to commit:
>>>
>>> b67d2cf14ea ("wifi: ath12k: fix ring-buffer corruption")
>>>
>>> add the remaining missing memory barriers to make sure that destination
>>> ring descriptors are read after the head pointers to avoid using stale
>>> data on weakly ordered architectures like aarch64.
>>>
>>> Also switch back to plain accesses for the descriptor fields which is
>>> sufficient after the memory barrier.
>>>
>>> New in v2 are two patches that add the missing barriers also for source
>>> rings and when updating the tail pointer for destination rings.
>>>
>>> To avoid leaking ring details from the "hal" (lmac or non-lmac), the
>>> barriers are added to the ath12k_hal_srng_access_end() helper. For
>>
>> Could you elaborate? what do you mean by "leaking ring details from the 'hal'"?
>
> The type of barrier needed depends on the type of the ring. If we add
> the barrier directly in the caller, the caller would need to know what
> kind of ring (lmac or non-lmac) it is operating on, something which is
> currently abstracted away in the hal helpers.
>

Thanks, I get your point. I can see the difference in patch [3/4]

>>> symmetry I therefore moved also the dest ring barriers into
>>> ath12k_hal_srng_access_begin() and made the barrier conditional.
>
> Johan
On Thu, Jun 05, 2025 at 04:41:32PM +0800, Baochen Qiang wrote:
> On 6/4/2025 10:45 PM, Johan Hovold wrote:
> > Add the missing memory barrier to make sure that destination ring
> > descriptors are read after the head pointers to avoid using stale data
> > on weakly ordered architectures like aarch64.
> >
> > The barrier is added to the ath12k_hal_srng_access_begin() helper for
> > symmetry with follow-on fixes for source ring buffer corruption which
> > will add barriers to ath12k_hal_srng_access_end().
> >
> > Note that this may fix the empty descriptor issue recently worked around
> > by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
> > monitor ring").
>
> why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
> rather than anything caused by reordering.

Based on a quick look it seemed like this could possibly fall in the
same category as some of the other workarounds I've spotted while
looking into these ordering issues (e.g. f9fff67d2d7c ("wifi: ath11k:
Fix SKB corruption in REO destination ring")).

If you say this one is clearly unrelated, I'll drop the comment.

> > @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
> >  		goto err;
> >  	}
> >
> > -	/* Make sure descriptor is read after the head pointer. */
> > -	dma_rmb();
> > -
> >  	*nbytes = ath12k_hal_ce_dst_status_get_length(desc);
> >
> >  	*skb = pipe->dest_ring->skb[sw_index];
> > diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
> > index 91d5126ca149..9eea13ed5565 100644
> > --- a/drivers/net/wireless/ath/ath12k/hal.c
> > +++ b/drivers/net/wireless/ath/ath12k/hal.c
> > @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
> >
> >  void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
> >  {
> > +	u32 hp;
> > +
> >  	lockdep_assert_held(&srng->lock);
> >
> > -	if (srng->ring_dir == HAL_SRNG_DIR_SRC)
> > +	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
> >  		srng->u.src_ring.cached_tp =
> >  			*(volatile u32 *)srng->u.src_ring.tp_addr;
> > -	else
> > -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> > +	} else {
> > +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> > +
> > +		if (hp != srng->u.dst_ring.cached_hp) {
>
> This consumes additional CPU cycles in hot path, which is a concern to me.
>
> Based on that, I prefer the v1 implementation.

The conditional avoids a memory barrier in case the ring is empty, so
for all callers but ath12k_ce_completed_recv_next() it's an improvement
over v1 in that sense.

I could make the barrier unconditional, which will only add one barrier
to ath12k_ce_completed_recv_next() in case the ring is empty compared to
v1. Perhaps that's a good compromise if you worry about the extra
comparison?

I very much want to avoid having both explicit barriers in the caller
and barriers in the hal end() helper. I think it should be either or.

> > +			srng->u.dst_ring.cached_hp = hp;
> > +			/* Make sure descriptor is read after the head
> > +			 * pointer.
> > +			 */
> > +			dma_rmb();
> > +		}
> > +	}

Johan
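
For readers following the trade-off discussed above (one extra comparison versus skipping the barrier when nothing new has arrived), here is a minimal userspace sketch of the same pattern. It is an analogy only, not the driver code: struct demo_ring and the demo_*() helpers are hypothetical names, and a C11 acquire fence stands in for dma_rmb(), whose semantics for DMA-coherent memory are not identical. Ring-full handling is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u	/* hypothetical size, not taken from the driver */

/* Hypothetical stand-in for a destination ring shared with a producer. */
struct demo_ring {
	uint32_t desc[DEMO_RING_SIZE];	/* descriptors written by the producer */
	_Atomic uint32_t hp;		/* head pointer published by the producer */
	uint32_t cached_hp;		/* consumer's snapshot of the head pointer */
	uint32_t tp;			/* consumer's tail pointer */
};

/* Producer side: write the descriptor first, then publish the new head. */
static void demo_produce(struct demo_ring *r, uint32_t value)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->desc[hp % DEMO_RING_SIZE] = value;
	atomic_store_explicit(&r->hp, hp + 1, memory_order_release);
}

/*
 * Consumer side, mirroring the conditional variant from the patch: the
 * fence is only issued when the head pointer has moved. Returns true if
 * the fence was issued.
 */
static bool demo_access_begin(struct demo_ring *r)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	if (hp == r->cached_hp)
		return false;	/* nothing new, no barrier needed */

	r->cached_hp = hp;
	/* Make sure descriptors are read after the head pointer. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Read the next descriptor, if any, using only the cached head pointer. */
static bool demo_consume(struct demo_ring *r, uint32_t *value)
{
	if (r->tp == r->cached_hp)
		return false;

	*value = r->desc[r->tp % DEMO_RING_SIZE];
	r->tp++;
	return true;
}
```

In this sketch the unconditional alternative mentioned in the reply would amount to dropping the comparison in demo_access_begin() and issuing the fence on every call, including when the ring is empty.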