diff mbox series

dmaengine: qcom: bam_dma: keep remotely controlled units on during boot

Message ID 20250503-bam-dma-reset-v1-1-266b6cecb844@oss.qualcomm.com
State New
Headers show
Series dmaengine: qcom: bam_dma: keep remotely controlled units on during boot | expand

Commit Message

Dmitry Baryshkov May 3, 2025, 12:41 a.m. UTC
The commit 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM
underflow") made sure the BAM DMA device gets suspended, disabling the
bam_clk. However for remotely controlled BAM DMA devices the clock might
be disabled prematurely (e.g. in case of the earlycon this frequently
happens before UART driver is able to probe), which causes device reset.

Use sync_state callback to ensure that bam_clk stays on until all users
are probed (and are able to vote upon corresponding clocks).

Fixes: 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM underflow")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
---
 drivers/dma/qcom/bam_dma.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)


---
base-commit: 6ac908f24cd7ddae52c496bbc888e97ee7b033ac
change-id: 20250503-bam-dma-reset-1766d2d12cec

Best regards,

Comments

Dmitry Baryshkov May 5, 2025, 12:17 p.m. UTC | #1
On Mon, May 05, 2025 at 10:56:56AM +0200, Stephan Gerhold wrote:
> On Sat, May 03, 2025 at 03:41:43AM +0300, Dmitry Baryshkov wrote:
> > The commit 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM
> > underflow") made sure the BAM DMA device gets suspended, disabling the
> > bam_clk. However for remotely controlled BAM DMA devices the clock might
> > be disabled prematurely (e.g. in case of the earlycon this frequently
> > happens before UART driver is able to probe), which causes device reset.
> > 
> > Use sync_state callback to ensure that bam_clk stays on until all users
> > are probed (and are able to vote upon corresponding clocks).
> > 
> > Fixes: 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM underflow")
> > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> 
> Thanks for the patch! I actually created almost the same patch on
> Friday, after struggling with this issue on DB410c when trying to add
> the MPM as wakeup-parent for GPIOs. :-)
> 
> How is this issue related to _remotely-controlled_ BAMs?

My understanding is that for locally controlled BAMs we can disable the
clock at the probe time as all the users of the BAM will be probed
before accessing the BAM. In case of a remotely controlled BAM there can
be a user (e.g. UART) which is running, but didn't request DMA channel
yet.

Please correct me if I'm wrong here.

> The BAM clock will get disabled for all types of BAM control, so I don't
> think the type of BAM control plays any role here. The BLSP DMA instance
> that would most likely interfere with UART earlycon is
> controlled-remotely on some SoCs (e.g. MSM8916), but currently not all
> of them (e.g. MSM8974, IPQ8074, IPQ9574, ...).

This probably means that the definition of the flag needs to be
clarified and maybe some of those platforms should use it.

> The fixes tag also doesn't look correct to me, since commit 0ac9c3dd0d6f
> ("dmaengine: qcom: bam_dma: fix runtime PM underflow") only changed the
> behavior for BAMs with "if (!bdev->bamclk)". This applies to some/most
> remotely-controlled BAMs, but the issue we have here occurs only because
> we do have a clock and cause it to get disabled prematurely.

Well... It is a commit which broke earlycon on on db410c.

I started to describe here the usecase of the remotely-controlled DMA
controller being used by the BLSP and then I understood, that I myself
don't completely understand if the issue is because DMA block is
controlled remotely (and we should not be disabling it because the BLSP
still attempts to use it) or if it's a simple case of the clock being
shared between several consumers and one of the consumers shutting it
down before other running consumers had a chance to vote on it.

> Checking for if (bdev->bamclk) would probably make more sense. In my
> patch I did it just unconditionally, because runtime PM is currently
> a no-op for BAMs without clock anyway.

Please share your patch.

> 
> I think it's also worth noting in the commit message that this is sort
> of a stop-gap solution. The root problem is that the earlycon code
> doesn't claim the clock while active. Any of the drivers that consume
> this shared clock could trigger the issue, I had to fix a similar issue
> in the spi-qup driver before in commit 0c331fd1dccf ("spi: qup: Request
> DMA before enabling clocks"). On some SoCs (e.g. MSM8974), we have
> "dmas" currently only on &blsp2_i2c5, so the UART controller wouldn't
> even be considered as consumer to wait for before calling the bam_dma
> .sync_state.
> 
> It may be more reliable to implement something like in
> drivers/clk/imx/clk.c imx_register_uart_clocks(), which tries to claim
> only the actually used UART clocks until late_initcall_sync(). That
> would at least make it independent from individual drivers, but assumes
> the UART driver can actually probe before late_initcall_sync() ...
> Most of this code is generic though, so perhaps releasing the clocks
> could be hooked up somewhere generic, when earlycon exits ...?

The spi-qup commit looks like another stop-gap workaround. Let's add CCF
and serial maintainers to the discussion with the hope of finding some
generic solution.

Most likely the easiest solution for Qualcomm platforms is to add
additional vote on earlycon clocks and then to try to generalise that.

> 
> Thanks,
> Stephan
Stephan Gerhold May 5, 2025, 1:06 p.m. UTC | #2
On Mon, May 05, 2025 at 03:17:00PM +0300, Dmitry Baryshkov wrote:
> On Mon, May 05, 2025 at 10:56:56AM +0200, Stephan Gerhold wrote:
> > On Sat, May 03, 2025 at 03:41:43AM +0300, Dmitry Baryshkov wrote:
> > > The commit 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM
> > > underflow") made sure the BAM DMA device gets suspended, disabling the
> > > bam_clk. However for remotely controlled BAM DMA devices the clock might
> > > be disabled prematurely (e.g. in case of the earlycon this frequently
> > > happens before UART driver is able to probe), which causes device reset.
> > > 
> > > Use sync_state callback to ensure that bam_clk stays on until all users
> > > are probed (and are able to vote upon corresponding clocks).
> > > 
> > > Fixes: 0ac9c3dd0d6f ("dmaengine: qcom: bam_dma: fix runtime PM underflow")
> > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> > 
> > Thanks for the patch! I actually created almost the same patch on
> > Friday, after struggling with this issue on DB410c when trying to add
> > the MPM as wakeup-parent for GPIOs. :-)
> > 
> > How is this issue related to _remotely-controlled_ BAMs?
> 
> My understanding is that for locally controlled BAMs we can disable the
> clock at the probe time as all the users of the BAM will be probed
> before accessing the BAM. In case of a remotely controlled BAM there can
> be a user (e.g. UART) which is running, but didn't request DMA channel
> yet.
> 
> Please correct me if I'm wrong here.
> 

Yes, I think there is a misunderstanding:

 - UART doesn't use DMA for earlycon (and I'm not aware of any boot
   firmware using DMA for UART either). The bam_dma driver really works
   as intended here, it votes for the clock during probe to perform the
   register access, then it drops the vote when going to runtime
   suspend. We don't really need .sync_state() for the BAM, because
   there are no active DMA consumers during boot.

 - All of the BLSP peripherals (BAM DMA, UART, I2C, SPI) need the
   shared GCC_BLSP1_AHB_CLK for register access. What happens is:

   1. GCC_BLSP1_AHB_CLK is left running by the bootloader.
   2. earlycon makes use the UART registers (without DMA), but doesn't
      vote for the clock (also: the clock driver isn't probed yet).
   3. GCC driver probes, leaves GCC_BLSP1_AHB_CLK running since unused
      clock cleanup happens later.
   4. bam_dma probes, requests GCC_BLSP1_AHB_CLK, votes for enable to
      access registers during probe.
   5. bam_dma goes to runtime suspend. The last (known) consumer of
      GCC_BLSP1_AHB_CLK is gone, so the clock is disabled.
   6. earlycon continues accessing the UART registers.
   [Device reset]
   7. The full UART driver probes and votes for GCC_BLSP1_AHB_CLK.

Conceptionally, we don't need .sync_state for the BAM, we need some kind
of .sync_state for the GCC_BLSP1_AHB_CLK. Or we need earlycon to vote
for the UART clocks as soon as the clock driver probes. Which is roughly
what drivers/clk/imx/clk.c imx_register_uart_clocks() does, except that
releasing the vote on late_initcall_sync() could still be too early.
(It's a stupid configuration, but what if UART needs some modules to
 probe successfully?)

> > The BAM clock will get disabled for all types of BAM control, so I don't
> > think the type of BAM control plays any role here. The BLSP DMA instance
> > that would most likely interfere with UART earlycon is
> > controlled-remotely on some SoCs (e.g. MSM8916), but currently not all
> > of them (e.g. MSM8974, IPQ8074, IPQ9574, ...).
> 
> This probably means that the definition of the flag needs to be
> clarified and maybe some of those platforms should use it.
> 

Yeah, probably. It would require testing and/or checking downstream
sources though. Here is a slightly clearer definition for the flags:

 - Controlled-remotely = reset+initialized outside of Linux, skip reset
   and initialization of global BAM registers.
 - Powered-remotely = powered on outside of Linux, but global BAM
   registers must be reset and initialized inside Linux.

And for the clock:

 - Clock specified: We can vote for the BAM to power on by turning on
   the clock. After that, we can access registers during probe.
 - Clock missing: BAM power management is integrated into the consumer
   interface (e.g. SLIMbus, BAM DMUX). We cannot access registers until
   the consumer requests the DMA channel.

I think the BLSP DMA (for UART/I2C/SPI) is often "controlled-remotely",
because there is the option to use some of the instances inside one of
the remoteprocs (RPM/Modem/WCNSS/ADSP/etc). It's not our job to keep the
BAM on though, the GCC_BLSP1_AHB_CLK is a clock that every remoteproc
can vote for independently.

> > The fixes tag also doesn't look correct to me, since commit 0ac9c3dd0d6f
> > ("dmaengine: qcom: bam_dma: fix runtime PM underflow") only changed the
> > behavior for BAMs with "if (!bdev->bamclk)". This applies to some/most
> > remotely-controlled BAMs, but the issue we have here occurs only because
> > we do have a clock and cause it to get disabled prematurely.
> 
> Well... It is a commit which broke earlycon on on db410c.
> 

FWIW, earlycon works for me on 6.15-rc4 on DB410c, I've only run into
this issue after adding the MPM definition, which changes probe order
enough to trigger this issue. I'm not sure how that commit changes
anything, will try reverting it for my setup later to see what happens.

> I started to describe here the usecase of the remotely-controlled DMA
> controller being used by the BLSP and then I understood, that I myself
> don't completely understand if the issue is because DMA block is
> controlled remotely (and we should not be disabling it because the BLSP
> still attempts to use it) or if it's a simple case of the clock being
> shared between several consumers and one of the consumers shutting it
> down before other running consumers had a chance to vote on it.
> 

The latter, as explained above. The DMA block doesn't play any role
here, it's not used during boot.

> > Checking for if (bdev->bamclk) would probably make more sense. In my
> > patch I did it just unconditionally, because runtime PM is currently
> > a no-op for BAMs without clock anyway.
> 
> Please share your patch.
> 

It's more or less equivalent to yours with some if checks removed.
I would be in favor of leaving the bam_dma driver as-is and solving this
on the clock/earlycon level.

> > 
> > I think it's also worth noting in the commit message that this is sort
> > of a stop-gap solution. The root problem is that the earlycon code
> > doesn't claim the clock while active. Any of the drivers that consume
> > this shared clock could trigger the issue, I had to fix a similar issue
> > in the spi-qup driver before in commit 0c331fd1dccf ("spi: qup: Request
> > DMA before enabling clocks"). On some SoCs (e.g. MSM8974), we have
> > "dmas" currently only on &blsp2_i2c5, so the UART controller wouldn't
> > even be considered as consumer to wait for before calling the bam_dma
> > .sync_state.
> > 
> > It may be more reliable to implement something like in
> > drivers/clk/imx/clk.c imx_register_uart_clocks(), which tries to claim
> > only the actually used UART clocks until late_initcall_sync(). That
> > would at least make it independent from individual drivers, but assumes
> > the UART driver can actually probe before late_initcall_sync() ...
> > Most of this code is generic though, so perhaps releasing the clocks
> > could be hooked up somewhere generic, when earlycon exits ...?
> 
> The spi-qup commit looks like another stop-gap workaround. Let's add CCF
> and serial maintainers to the discussion with the hope of finding some
> generic solution.
> 
> Most likely the easiest solution for Qualcomm platforms is to add
> additional vote on earlycon clocks and then to try to generalise that.
> 

Yeah, that would be the best option I think.

Thanks,
Stephan
diff mbox series

Patch

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index bbc3276992bb012a1b79937bdbd069fc01f75331..09ef450a9fb8e3efe6e664eb17caffb0bd337f84 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -1367,6 +1367,9 @@  static int bam_dma_probe(struct platform_device *pdev)
 	pm_runtime_set_active(&pdev->dev);
 	pm_runtime_enable(&pdev->dev);
 
+	if (bdev->controlled_remotely || bdev->powered_remotely)
+		pm_runtime_get_sync(&pdev->dev);
+
 	return 0;
 
 err_unregister_dma:
@@ -1414,6 +1417,16 @@  static void bam_dma_remove(struct platform_device *pdev)
 	clk_disable_unprepare(bdev->bamclk);
 }
 
+static void bam_dma_sync_state(struct device *dev)
+{
+	struct bam_device *bdev = dev_get_drvdata(dev);
+
+	if (bdev->controlled_remotely || bdev->powered_remotely) {
+		pm_runtime_mark_last_busy(bdev->dev);
+		pm_runtime_put_autosuspend(bdev->dev);
+	}
+}
+
 static int __maybe_unused bam_dma_runtime_suspend(struct device *dev)
 {
 	struct bam_device *bdev = dev_get_drvdata(dev);
@@ -1474,6 +1487,7 @@  static struct platform_driver bam_dma_driver = {
 		.name = "bam-dma-engine",
 		.pm = &bam_dma_pm_ops,
 		.of_match_table = bam_of_match,
+		.sync_state = bam_dma_sync_state,
 	},
 };