From patchwork Thu Jul 18 11:56:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Edmund Raile
X-Patchwork-Id: 814195
Date: Thu, 18 Jul 2024 11:56:54 +0000
To: o-takashi@sakamocchi.jp, clemens@ladisch.de
From: Edmund Raile
Cc: tiwai@suse.com, alsa-devel@alsa-project.org, linux-sound@vger.kernel.org,
 linux-kernel@vger.kernel.org, Edmund Raile
Subject: [PATCH] ALSA: firewire-lib: restore process context workqueue to
 prevent deadlock
Message-ID: <20240718115637.12816-1-edmund.raile@proton.me>

Commit b5b519965c4c ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.

With RME Fireface 800, this led to a regression since kernel 5.14.0,
causing a deadlock with eventual system freeze under ALSA operation:

? tasklet_unlock_spin_wait
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm

? native_queued_spin_lock_slowpath
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci

Restore the workqueue to prevent a deadlock between the ALSA substream
process context spin_lock of snd_pcm_stream_lock_irq() in
snd_pcm_status64() and the OHCI 1394 IT softIRQ context spin_lock of
snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed().

To reproduce the issue:

Direct ALSA playback to the device:
  mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac
Time to occurrence: 2s to 30m
Likelihood increased by:
  - high CPU load
    stress --cpu $(nproc)
  - switching between applications via workspaces
    tested with i915 in Xfce

PulseAudio / PipeWire conceal the issue as they run the PCM substream
with period wakeups disabled, issuing fewer hardIRQs.

Closes: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/T/#u
Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context")
Signed-off-by: Edmund Raile
---
This is the follow-up patch to the 5.14.0 regression I reported:
https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/T/#u
("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock
with Fireface 800")

Takashi Sakamoto explained the issue in his response to the regression:

A. In the process context
    * (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in
               snd_pcm_status64()
    * (lock B) Then attempt to enter tasklet

B. In the softIRQ context
    * (lock B) Enter tasklet
    * (lock A) Attempt to acquire spin_lock by
               snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed()

This leads me to believe this isn't just an issue limited to the RME
Fireface driver (snd_fireface), though I can not test the other devices.

 sound/firewire/amdtp-stream.c | 32 +++++++++++++++++++++-----------
 sound/firewire/amdtp-stream.h |  1 +
 2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c
index d35d0a420ee0..77b99a2117f4 100644
--- a/sound/firewire/amdtp-stream.c
+++ b/sound/firewire/amdtp-stream.c
@@ -77,6 +77,8 @@
 // overrun. Actual device can skip more, then this module stops the packet streaming.
 #define IR_JUMBO_PAYLOAD_MAX_SKIP_CYCLES	5

+static void pcm_period_work(struct work_struct *work);
+
 /**
  * amdtp_stream_init - initialize an AMDTP stream structure
  * @s: the AMDTP stream to initialize
@@ -105,6 +107,7 @@ int amdtp_stream_init(struct amdtp_stream *s, struct fw_unit *unit,
 	s->flags = flags;
 	s->context = ERR_PTR(-1);
 	mutex_init(&s->mutex);
+	INIT_WORK(&s->period_work, pcm_period_work);
 	s->packet_index = 0;

 	init_waitqueue_head(&s->ready_wait);
@@ -347,6 +350,7 @@ EXPORT_SYMBOL(amdtp_stream_get_max_payload);
  */
 void amdtp_stream_pcm_prepare(struct amdtp_stream *s)
 {
+	cancel_work_sync(&s->period_work);
 	s->pcm_buffer_pointer = 0;
 	s->pcm_period_pointer = 0;
 }
@@ -611,19 +615,21 @@ static void update_pcm_pointers(struct amdtp_stream *s,
 		// The program in user process should periodically check the status of intermediate
 		// buffer associated to PCM substream to process PCM frames in the buffer, instead
 		// of receiving notification of period elapsed by poll wait.
-		if (!pcm->runtime->no_period_wakeup) {
-			if (in_softirq()) {
-				// In software IRQ context for 1394 OHCI.
-				snd_pcm_period_elapsed(pcm);
-			} else {
-				// In process context of ALSA PCM application under acquired lock of
-				// PCM substream.
-				snd_pcm_period_elapsed_under_stream_lock(pcm);
-			}
-		}
+		if (!pcm->runtime->no_period_wakeup)
+			queue_work(system_highpri_wq, &s->period_work);
 	}
 }

+static void pcm_period_work(struct work_struct *work)
+{
+	struct amdtp_stream *s = container_of(work, struct amdtp_stream,
+					      period_work);
+	struct snd_pcm_substream *pcm = READ_ONCE(s->pcm);
+
+	if (pcm)
+		snd_pcm_period_elapsed(pcm);
+}
+
 static int queue_packet(struct amdtp_stream *s, struct fw_iso_packet *params,
 			bool sched_irq)
 {
@@ -1854,7 +1860,10 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d,
 	if (irq_target && amdtp_stream_running(irq_target)) {
 		// In software IRQ context, the call causes dead-lock to disable the tasklet
 		// synchronously.
-		if (!in_softirq())
+		// workqueue serves to prevent deadlock between process context spinlock
+		// of snd_pcm_stream_lock_irq() in snd_pcm_status64() and softIRQ spinlock
+		// of snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed()
+		if ((!in_softirq()) && (current_work() != &s->period_work))
 			fw_iso_context_flush_completions(irq_target->context);
 	}

@@ -1910,6 +1919,7 @@ static void amdtp_stream_stop(struct amdtp_stream *s)
 		return;
 	}

+	cancel_work_sync(&s->period_work);
 	fw_iso_context_stop(s->context);
 	fw_iso_context_destroy(s->context);
 	s->context = ERR_PTR(-1);
diff --git a/sound/firewire/amdtp-stream.h b/sound/firewire/amdtp-stream.h
index a1ed2e80f91a..775db3fc4959 100644
--- a/sound/firewire/amdtp-stream.h
+++ b/sound/firewire/amdtp-stream.h
@@ -191,6 +191,7 @@ struct amdtp_stream {

 	/* For a PCM substream processing. */
 	struct snd_pcm_substream *pcm;
+	struct work_struct period_work;
 	snd_pcm_uframes_t pcm_buffer_pointer;
 	unsigned int pcm_period_pointer;
 	unsigned int pcm_frame_multiplier;