Message ID | 20230425093547.1131-1-mirsad.todorovac@alu.unizg.hr |
---|---|
State | Superseded |
Headers | show |
Series | [v3,1/1] wifi: mac80211: fortify the spinlock against deadlock by interrupt | expand |
On Tue, Apr 25, 2023 at 11:35:48AM +0200, Mirsad Goran Todorovac wrote: > In the function ieee80211_tx_dequeue() there is a particular locking > sequence: > > begin: > spin_lock(&local->queue_stop_reason_lock); > q_stopped = local->queue_stop_reasons[q]; > spin_unlock(&local->queue_stop_reason_lock); > > However small the chance (increased by ftracetest), an asynchronous > interrupt can occur in between of spin_lock() and spin_unlock(), > and the interrupt routine will attempt to lock the same > &local->queue_stop_reason_lock again. > > This will cause a costly reset of the CPU and the wifi device or an > altogether hang in the single CPU and single core scenario. > > This is the probable trace of the deadlock: > > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: Possible unsafe locking scenario: > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: CPU0 > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: ---- > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock); > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: <Interrupt> > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock); > Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: > *** DEADLOCK *** Can you please add to the commit message whole lockdep trace? And please trim "Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: " line prefix, it doesn't add any value. Thanks
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 7699fb410670..45cb8e7bcc61 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -3781,6 +3781,7 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw, ieee80211_tx_result r; struct ieee80211_vif *vif = txq->vif; int q = vif->hw_queue[txq->ac]; + unsigned long flags; bool q_stopped; WARN_ON_ONCE(softirq_count() == 0); @@ -3789,9 +3790,9 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw, return NULL; begin: - spin_lock(&local->queue_stop_reason_lock); + spin_lock_irqsave(&local->queue_stop_reason_lock, flags); q_stopped = local->queue_stop_reasons[q]; - spin_unlock(&local->queue_stop_reason_lock); + spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags); if (unlikely(q_stopped)) { /* mark for waking later */
In the function ieee80211_tx_dequeue() there is a particular locking sequence: begin: spin_lock(&local->queue_stop_reason_lock); q_stopped = local->queue_stop_reasons[q]; spin_unlock(&local->queue_stop_reason_lock); However small the chance (increased by ftracetest), an asynchronous interrupt can occur in between of spin_lock() and spin_unlock(), and the interrupt routine will attempt to lock the same &local->queue_stop_reason_lock again. This will cause a costly reset of the CPU and the wifi device or an altogether hang in the single CPU and single core scenario. This is the probable trace of the deadlock: Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: Possible unsafe locking scenario: Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: CPU0 Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: ---- Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock); Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: <Interrupt> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock); Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: *** DEADLOCK *** Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption") Link: https://lore.kernel.org/all/1f58a0d1-d2b9-d851-73c3-93fcc607501c@alu.unizg.hr/ Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Cc: Gregory Greenman <gregory.greenman@intel.com> Cc: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/all/cdc80531-f25f-6f9d-b15f-25e16130b53a@alu.unizg.hr/ Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Alexander Wetzel <alexander@wetzel-home.de> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> --- v2 -> v3: - Fix the Fixes: tag as advised. - change the net: to wifi: to comply with the original patch that is being fixed. v1 -> v2: - Minor rewording and clarification. - Cc:-ed people that replied to the original bug report (forgotten in v1 by omission). net/mac80211/tx.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)