Message ID | 20220809152019.461741-1-narmstrong@baylibre.com |
---|---|
State | New |
Headers | show |
Series | spi: meson-spicc: save pow2 datarate between messages | expand |
On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote: > At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(), > this resets the SPICC_CONREG register and notably the value set by the > Common Clock Framework. > This saves the datarate dividor value between message to keep the last > set value by the Common Clock Framework. When you say the value set by the clock framework does that mean that the clock driver is adjusting hardware inside the SPI controller IP block which is then getting reset by the SPI driver without the SPI driver knowing about it? That seems like a bad idea as you're finding here. > This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support") > because we recalculated and wrote the rate for each xfer. Note that the rate might change per transfer.
On 09/08/2022 17:27, Mark Brown wrote: > On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote: >> At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(), >> this resets the SPICC_CONREG register and notably the value set by the >> Common Clock Framework. > >> This saves the datarate dividor value between message to keep the last >> set value by the Common Clock Framework. > > When you say the value set by the clock framework does that mean that > the clock driver is adjusting hardware inside the SPI controller IP > block which is then getting reset by the SPI driver without the SPI > driver knowing about it? That seems like a bad idea as you're finding > here. The SPI driver is explicitely triggering a reset at the end of each message to get back to a clean HW state, but it does reset the content of the "legacy" registers containing the power of 2 divider value, the new registers configuring the new clock divider path (only on newer SoCs) doesn't get cleared. > >> This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support") >> because we recalculated and wrote the rate for each xfer. > > Note that the rate might change per transfer. It's taken in account, this case is when the rate doesn't change.
On Wed, Aug 10, 2022 at 11:17:14AM +0200, Neil Armstrong wrote: > On 09/08/2022 17:27, Mark Brown wrote: > > On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote: > > When you say the value set by the clock framework does that mean that > > the clock driver is adjusting hardware inside the SPI controller IP > > block which is then getting reset by the SPI driver without the SPI > > driver knowing about it? That seems like a bad idea as you're finding > > here. > The SPI driver is explicitely triggering a reset at the end of each message > to get back to a clean HW state, but it does reset the content of the "legacy" > registers containing the power of 2 divider value, the new registers configuring > the new clock divider path (only on newer SoCs) doesn't get cleared. Sure, but that doesn't really address the concern - is this something that the clk driver programmed or is this the driver forgetting to restore a register that it programmed itself? The commit message sounds like the former which is a much bigger problem.
On 10/08/2022 14:37, Mark Brown wrote: > On Wed, Aug 10, 2022 at 11:17:14AM +0200, Neil Armstrong wrote: >> On 09/08/2022 17:27, Mark Brown wrote: >>> On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote: > >>> When you say the value set by the clock framework does that mean that >>> the clock driver is adjusting hardware inside the SPI controller IP >>> block which is then getting reset by the SPI driver without the SPI >>> driver knowing about it? That seems like a bad idea as you're finding >>> here. > >> The SPI driver is explicitely triggering a reset at the end of each message >> to get back to a clean HW state, but it does reset the content of the "legacy" >> registers containing the power of 2 divider value, the new registers configuring >> the new clock divider path (only on newer SoCs) doesn't get cleared. > > Sure, but that doesn't really address the concern - is this something > that the clk driver programmed or is this the driver forgetting to > restore a register that it programmed itself? The commit message sounds > like the former which is a much bigger problem. It's what is programmed by the Clock Framework yes, it was designed as-is so the Clock Framework takes the most accurate clock path but the reset case wasn't taken in account. Neil
On Wed, Aug 10, 2022 at 04:01:33PM +0200, Neil Armstrong wrote: > On 10/08/2022 14:37, Mark Brown wrote: > > Sure, but that doesn't really address the concern - is this something > > that the clk driver programmed or is this the driver forgetting to > > restore a register that it programmed itself? The commit message sounds > > like the former which is a much bigger problem. > It's what is programmed by the Clock Framework yes, it was designed as-is > so the Clock Framework takes the most accurate clock path but the reset case > wasn't taken in account. This seems like a bad idea, we shouldn't have two different drivers managing the same register without explicit and visible coordination with each other, this is at best asking for trouble as you've found here. I've not looked in detail but I think if you want to use the clock framework here then this driver should register a clock provider for the clock hardware in the IP block. How does this work with runtime PM, what happens if the clock driver decides to change something while the device is powered down?
On 10/08/2022 16:31, Mark Brown wrote: > On Wed, Aug 10, 2022 at 04:01:33PM +0200, Neil Armstrong wrote: >> On 10/08/2022 14:37, Mark Brown wrote: > >>> Sure, but that doesn't really address the concern - is this something >>> that the clk driver programmed or is this the driver forgetting to >>> restore a register that it programmed itself? The commit message sounds >>> like the former which is a much bigger problem. > >> It's what is programmed by the Clock Framework yes, it was designed as-is >> so the Clock Framework takes the most accurate clock path but the reset case >> wasn't taken in account. > > This seems like a bad idea, we shouldn't have two different drivers > managing the same register without explicit and visible coordination > with each other, this is at best asking for trouble as you've found > here. I've not looked in detail but I think if you want to use the > clock framework here then this driver should register a clock provider > for the clock hardware in the IP block. I totally understand, this wasn't explicit until I found the bug. I don't think it's worth adding so much code for this since we already had an open-coded function which perfectly worked before. > > How does this work with runtime PM, what happens if the clock driver > decides to change something while the device is powered down? There's no runtime PM implemented, and yes it would be an issue. I'm perfectly OK to remove the CCF driver for the legacy clock path and return back to the old open coded calculation since it perfectly worked and stop using the legacy clock path for new SoCs since it would never be selected anyway... ... but GX SoCs are broken so it would need an intermediate fix until I push the refactoring to cleanup all this. Neil
On Wed, Aug 10, 2022 at 04:40:04PM +0200, Neil Armstrong wrote: > I don't think it's worth adding so much code for this since we already I don't recall the code for clock providers being that hard? They're generally pretty small, some of the ASoC CODEC drivers did something similar. > had an open-coded function which perfectly worked before. Except in the cases it didn't... > I'm perfectly OK to remove the CCF driver for the legacy clock path > and return back to the old open coded calculation since it perfectly > worked and stop using the legacy clock path for new SoCs since it would > never be selected anyway... It does seem better to go the clock provider route TBH. > ... but GX SoCs are broken so it would need an intermediate fix until > I push the refactoring to cleanup all this. I'm trying to figure out if this is actually fixing the problem or just papering over one case where things happened to go badly.
On 10/08/2022 16:49, Mark Brown wrote: > On Wed, Aug 10, 2022 at 04:40:04PM +0200, Neil Armstrong wrote: > >> I don't think it's worth adding so much code for this since we already > > I don't recall the code for clock providers being that hard? They're > generally pretty small, some of the ASoC CODEC drivers did something > similar. Seems over-engineering to me, but I can explore this path if it's the best route to follow. > >> had an open-coded function which perfectly worked before. > > Except in the cases it didn't... It did but wasn't generic enough to take the new clock path introduced in the new SoCs. > >> I'm perfectly OK to remove the CCF driver for the legacy clock path >> and return back to the old open coded calculation since it perfectly >> worked and stop using the legacy clock path for new SoCs since it would >> never be selected anyway... > > It does seem better to go the clock provider route TBH. I'm afraid this won't fix the problem since CCF won't set the clock again if the rate is already ok in it's cache, we'd still need to save the divider value and restore it after the reset as I did on this exact patch. > >> ... but GX SoCs are broken so it would need an intermediate fix until >> I push the refactoring to cleanup all this. > > I'm trying to figure out if this is actually fixing the problem or just > papering over one case where things happened to go badly. It does, when clk_set_rate() is called, the datarate field would be the same as after the previous call. Neil
On Wed, Aug 10, 2022 at 05:51:33PM +0200, Neil Armstrong wrote: > On 10/08/2022 16:49, Mark Brown wrote: > > I don't recall the code for clock providers being that hard? They're > > generally pretty small, some of the ASoC CODEC drivers did something > > similar. > Seems over-engineering to me, but I can explore this path if it's the best > route to follow. Please. > > > had an open-coded function which perfectly worked before. > > Except in the cases it didn't... > It did but wasn't generic enough to take the new clock path introduced > in the new SoCs. Right, it's leaving landmines lying around - it's hard for me to be confident that we couldn't also get another surprise due to a new test case exercising the code differently somehow, never mind the fragility of the code. > > > I'm perfectly OK to remove the CCF driver for the legacy clock path > > > and return back to the old open coded calculation since it perfectly > > > worked and stop using the legacy clock path for new SoCs since it would > > > never be selected anyway... > > It does seem better to go the clock provider route TBH. > I'm afraid this won't fix the problem since CCF won't set the clock again > if the rate is already ok in it's cache, we'd still need to save the divider > value and restore it after the reset as I did on this exact patch. The goal here is to ensure that the clock framework's idea of what the programmed configuration is and the SPI driver's idea of what that is can't get out of sync - instead of saving the value by virtue of reading it back out of the hardware at some specific point and hoping that there's never any changes triggered by the clock framework between the save and restore the driver is getting notified whenever there's a change being made and can update the cache then. That way we ensure we can't miss any cases and things are robust.
diff --git a/drivers/spi/spi-meson-spicc.c b/drivers/spi/spi-meson-spicc.c index 0bc7daa7afc8..e58686e28439 100644 --- a/drivers/spi/spi-meson-spicc.c +++ b/drivers/spi/spi-meson-spicc.c @@ -166,6 +166,7 @@ struct meson_spicc_device { unsigned long tx_remain; unsigned long rx_remain; unsigned long xfer_remain; + unsigned int pow2_datarate; }; static void meson_spicc_oen_enable(struct meson_spicc_device *spicc) @@ -458,7 +459,8 @@ static int meson_spicc_prepare_message(struct spi_master *master, /* Select CS */ conf |= FIELD_PREP(SPICC_CS_MASK, spi->chip_select); - /* Default Clock rate core/4 */ + /* Saved pow2 Clock rate */ + conf |= FIELD_PREP(SPICC_DATARATE_MASK, spicc->pow2_datarate); /* Default 8bit word */ conf |= FIELD_PREP(SPICC_BITLENGTH_MASK, 8 - 1); @@ -480,6 +482,10 @@ static int meson_spicc_unprepare_transfer(struct spi_master *master) /* Disable all IRQs */ writel(0, spicc->base + SPICC_INTREG); + /* Save last pow2 datarate before HW reset */ + spicc->pow2_datarate = FIELD_GET(SPICC_DATARATE_MASK, + readl_relaxed(spicc->base + SPICC_CONREG)); + device_reset_optional(&spicc->pdev->dev); return 0;
At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(), this resets the SPICC_CONREG register and notably the value set by the Common Clock Framework. This saves the datarate dividor value between message to keep the last set value by the Common Clock Framework. This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support") because we recalculated and wrote the rate for each xfer. Fixes: 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support") Reported-by: Da Xue <da@libre.computer> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> --- drivers/spi/spi-meson-spicc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)