Message ID | 20240127003607.501086-6-andre.draszik@linaro.org |
---|---|
State | Superseded |
Series | [1/5] clk: samsung: gs101: gpio_peric0_pclk needs to be kept on |
Hi Sam,

On Fri, 2024-01-26 at 21:30 -0600, Sam Protsenko wrote:
> On Fri, Jan 26, 2024 at 6:37 PM André Draszik <andre.draszik@linaro.org> wrote:
> >
> > Note that this commit has the side-effect of causing earlycon to stop
> > working sometime into the boot for two reasons:
> >   * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
> >     running, but because earlycon doesn't deal with clocks that
> >     parent will be disabled when none of the other drivers that
> >     actually deal with clocks correctly require it to be running and
> >     the real serial driver (which does deal with clocks) hasn't taken
> >     over yet
>
> That's weird. Doesn't your bootloader set up serial clocks properly?
> AFAIU, earlycon should rely on everything already configured in
> bootloader.

I tried to explain that above, but let me try again...

The console UART and I2C bus 8 are on the same cmu_peric0 controller, and
that cmu_peric0 has two clocks coming from cmu_top, ip and bus. For I2C8 & UART
to work, both of these clocks from cmu_top need to be on, as they are the
parents of the i2c8-(ip|pclk) and uart-(ip|pclk) clocks respectively.

The bootloader leaves those clocks running, yes. So earlycon works (for a
while).

At some point into the boot, one of two things happens:
1) Linux will load the i2c driver. That driver does clock handling
(correctly), it will initialise and then it has nothing to do, therefore it
disables cmu_peric0's i2c8 ip and pclk clocks. Because at that stage nothing
appears to be using the cmu_peric0's ip clock (the real serial driver hasn't
initialised yet), Linux decides to also disable the parent ip clock coming
from cmu_top.

At this stage, the earlycon driver stops working, as the parent ip clock of
the uart ip clock is not running any more. No serial output can be observed
from this stage onwards. I think what is probably happening is that the
console uart FIFO doesn't get emptied anymore, and earlycon will simply wait
forever for space to become available in the FIFO (but I didn't debug this).

Anyway, the boot doesn't progress, the system appears to hang. In any case it's
not usable as we have no other means of using it at this stage (network /
usb / display etc.).

2) Alternatively, the UART driver will load at this stage. Again, it will
tweak the clocks and after probe it will leave its clocks disabled. The
serial console driver hasn't taken over at this stage and earlycon is still
active. Again, the system will hang, because IP and PCLK have been disabled
by the UART driver. The clocks would be enabled again once the serial console
is enabled, but because earlycon is still waiting for progress, the
boot doesn't progress past disabling ip and pclk. It never gets to enabling
the serial console (re-enabling the clocks).

So in both cases we get some output from earlycon, but the system hangs once
the first consumer driver of an IP attached to cmu_peric0 has completed
probing.

> >   * hand-over between earlycon and serial driver appears to be
> >     fragile and clocks get enabled and disabled a few times, which
> >     also causes register access to hang while earlycon is still
> >     active
> > Nonetheless we shouldn't keep these clocks running unconditionally just
> > for earlycon. Clocks should be disabled where possible. If earlycon is
> > required in the future, e.g. for debug, this commit can simply be
> > reverted (locally!).
>
> That sounds... not ideal. The ability to enable earlycon just by
> adding some string to bootargs can be very useful for developers.
> Maybe just make those clocks CLK_IGNORE_UNUSED, if that keeps earlycon
> functional? With corresponding comments of course.

CLK_IGNORE_UNUSED doesn't help in this case: the i2c and uart drivers will load
and probe before earlycon gets disabled and, as part of their probing, disable
the cmu_top ip clock going to cmu_peric0.

If earlycon is not enabled in kernel command line, everything works fine, the
kernel buffers its messages and once the real serial console driver starts,
all messages since boot are printed.

Other than keeping it as CLK_IS_CRITICAL, I can see no way to make the
hand-over from earlycon to the real serial driver work in all cases.

They are not critical clocks for the system, though, so it's wrong to always
keep them running unconditionally.

We are past a stage where earlycon is generally required.

If it's required for some local development, people can revert this patch locally.

BTW, downstream doesn't suffer from this problem because downstream uses ACG
throughout and clocks are enabled automatically in hardware as required.

Cheers,
Andre'
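To make scenario 1 concrete, here is a minimal user-space sketch in C of the enable-count bookkeeping described above. It is a toy model and not the kernel's common clock framework: struct toy_clk, the toy_clk_* helpers and earlycon_survives_i2c_probe() are hypothetical names, and the clock tree is collapsed to a single cmu_top gate feeding two leaf gates. The sketch assumes that a critical clock holds a permanent enable reference from registration onwards, and that CLK_IGNORE_UNUSED is only consulted by the late unused-clock sweep (not modelled here), so it plays no part when a driver explicitly drops the last reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a gate clock; NOT the kernel's struct clk_core. */
struct toy_clk {
	const char *name;
	struct toy_clk *parent;
	unsigned int enable_count;	/* software references held by drivers */
	bool hw_on;			/* what the gate is actually doing */
	bool critical;			/* models CLK_IS_CRITICAL */
	bool ignore_unused;		/* models CLK_IGNORE_UNUSED (sweep only) */
};

/* First reference turns the gate on, after turning on its parent. */
static void toy_clk_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_clk_enable(c->parent);
		c->hw_on = true;
	}
}

/* Last reference gone: gate off, then drop the reference on the parent. */
static void toy_clk_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);
	if (--c->enable_count == 0) {
		c->hw_on = false;
		toy_clk_disable(c->parent);
	}
}

/* The bootloader left everything running; critical clocks pin a reference. */
static void toy_clk_register(struct toy_clk *c)
{
	c->hw_on = true;
	if (c->critical)
		toy_clk_enable(c);
}

/* earlycon takes no references, so it depends on the raw hardware state. */
static bool toy_clk_chain_running(const struct toy_clk *c)
{
	for (; c; c = c->parent)
		if (!c->hw_on)
			return false;
	return true;
}

/* Scenario 1: the i2c driver probes, then idles, before the real uart driver. */
bool earlycon_survives_i2c_probe(bool uart_critical, bool uart_ignore_unused)
{
	struct toy_clk top_ip = { .name = "cmu_top ip" };
	struct toy_clk i2c8_ip = { .name = "i2c8 ip", .parent = &top_ip };
	struct toy_clk uart_ip = { .name = "uart ip", .parent = &top_ip,
				   .critical = uart_critical,
				   .ignore_unused = uart_ignore_unused };

	toy_clk_register(&top_ip);
	toy_clk_register(&i2c8_ip);
	toy_clk_register(&uart_ip);

	toy_clk_enable(&i2c8_ip);	/* i2c probe touches the hardware */
	toy_clk_disable(&i2c8_ip);	/* probe done, nothing left to do */

	return toy_clk_chain_running(&uart_ip);
}
```

In this model, earlycon_survives_i2c_probe(false, false) and earlycon_survives_i2c_probe(false, true) both report the uart clock chain as stopped, because the i2c driver dropping the last reference gates the shared parent. Only earlycon_survives_i2c_probe(true, false) keeps it running, which matches the observation that nothing short of CLK_IS_CRITICAL keeps earlycon alive.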
On 1/27/24 00:35, André Draszik wrote:
> The peric0_top1_ipclk_0 and peric0_top1_pclk_0 are the clocks going to
> peric0/uart_usi, with pclk being the bus clock. Without pclk running,
> any bus access will hang.
> Unfortunately, in commit d97b6c902a40 ("arm64: dts: exynos: gs101:
> update USI UART to use peric0 clocks") the gs101 DT ended up specifying
> an incorrect pclk in the respective node and instead the two clocks
> here were marked as critical.
>
> We have fixed the gs101 DT and can therefore drop this incorrect
> work-around here, the uart driver will claim these clocks as needed.
>
> Note that this commit has the side-effect of causing earlycon to stop
> working sometime into the boot for two reasons:
>   * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
>     running, but because earlycon doesn't deal with clocks that
>     parent will be disabled when none of the other drivers that
>     actually deal with clocks correctly require it to be running and
>     the real serial driver (which does deal with clocks) hasn't taken
>     over yet
>   * hand-over between earlycon and serial driver appears to be
>     fragile and clocks get enabled and disabled a few times, which
>     also causes register access to hang while earlycon is still
>     active
> Nonetheless we shouldn't keep these clocks running unconditionally just
> for earlycon. Clocks should be disabled where possible. If earlycon is
> required in the future, e.g. for debug, this commit can simply be
> reverted (locally!).
>
> Fixes: 893f133a040b ("clk: samsung: gs101: add support for cmu_peric0")
> Signed-off-by: André Draszik <andre.draszik@linaro.org>

I find the logic fine:
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>

> ---
>  drivers/clk/samsung/clk-gs101.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/clk/samsung/clk-gs101.c b/drivers/clk/samsung/clk-gs101.c
> index 61bb0dcf84ee..5c338ac9231c 100644
> --- a/drivers/clk/samsung/clk-gs101.c
> +++ b/drivers/clk/samsung/clk-gs101.c
> @@ -2982,20 +2982,18 @@ static const struct samsung_gate_clock peric0_gate_clks[] __initconst = {
>  	     "gout_peric0_peric0_top0_pclk_9", "mout_peric0_bus_user",
>  	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP0_IPCLKPORT_PCLK_9,
>  	     21, 0, 0),
> -	/* Disabling this clock makes the system hang. Mark the clock as critical. */
>  	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_IPCLK_0,
>  	     "gout_peric0_peric0_top1_ipclk_0", "dout_peric0_usi0_uart",
>  	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_IPCLK_0,
> -	     21, CLK_IS_CRITICAL, 0),
> +	     21, 0, 0),
>  	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_IPCLK_2,
>  	     "gout_peric0_peric0_top1_ipclk_2", "dout_peric0_usi14_usi",
>  	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_IPCLK_2,
>  	     21, 0, 0),
> -	/* Disabling this clock makes the system hang. Mark the clock as critical. */
>  	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_PCLK_0,
>  	     "gout_peric0_peric0_top1_pclk_0", "mout_peric0_bus_user",
>  	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_PCLK_0,
> -	     21, CLK_IS_CRITICAL, 0),
> +	     21, 0, 0),
>  	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_PCLK_2,
>  	     "gout_peric0_peric0_top1_pclk_2", "mout_peric0_bus_user",
>  	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_PCLK_2,
On Mon, Jan 29, 2024 at 8:37 AM André Draszik <andre.draszik@linaro.org> wrote:
>
> Hi Sam,
>
> On Fri, 2024-01-26 at 21:30 -0600, Sam Protsenko wrote:
> > On Fri, Jan 26, 2024 at 6:37 PM André Draszik <andre.draszik@linaro.org> wrote:
> > >
> > > Note that this commit has the side-effect of causing earlycon to stop
> > > working sometime into the boot for two reasons:
> > >   * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
> > >     running, but because earlycon doesn't deal with clocks that
> > >     parent will be disabled when none of the other drivers that
> > >     actually deal with clocks correctly require it to be running and
> > >     the real serial driver (which does deal with clocks) hasn't taken
> > >     over yet
> >
> > That's weird. Doesn't your bootloader set up serial clocks properly?
> > AFAIU, earlycon should rely on everything already configured in
> > bootloader.
>
> I tried to explain that above, but let me try again...
>
> The console UART and I2C bus 8 are on the same cmu_peric0 controller, and
> that cmu_peric0 has two clocks coming from cmu_top, ip and bus. For I2C8 & UART
> to work, both of these clocks from cmu_top need to be on, as they are the
> parents of the i2c8-(ip|pclk) and uart-(ip|pclk) clocks respectively.
>
> The bootloader leaves those clocks running, yes. So earlycon works (for a
> while).
>
> At some point into the boot, one of two things happens:
> 1) Linux will load the i2c driver. That driver does clock handling
> (correctly), it will initialise and then it has nothing to do, therefore it
> disables cmu_peric0's i2c8 ip and pclk clocks. Because at that stage nothing
> appears to be using the cmu_peric0's ip clock (the real serial driver hasn't
> initialised yet), Linux decides to also disable the parent ip clock coming
> from cmu_top.
>
> At this stage, the earlycon driver stops working, as the parent ip clock of
> the uart ip clock is not running any more. No serial output can be observed
> from this stage onwards. I think what is probably happening is that the
> console uart FIFO doesn't get emptied anymore, and earlycon will simply wait
> forever for space to become available in the FIFO (but I didn't debug this).
>
> Anyway, the boot doesn't progress, the system appears to hang. In any case it's
> not usable as we have no other means of using it at this stage (network /
> usb / display etc.).
>
> 2) Alternatively, the UART driver will load at this stage. Again, it will
> tweak the clocks and after probe it will leave its clocks disabled. The
> serial console driver hasn't taken over at this stage and earlycon is still
> active. Again, the system will hang, because IP and PCLK have been disabled
> by the UART driver. The clocks would be enabled again once the serial console
> is enabled, but because earlycon is still waiting for progress, the
> boot doesn't progress past disabling ip and pclk. It never gets to enabling
> the serial console (re-enabling the clocks).
>
> So in both cases we get some output from earlycon, but the system hangs once
> the first consumer driver of an IP attached to cmu_peric0 has completed
> probing.
>
> > >   * hand-over between earlycon and serial driver appears to be
> > >     fragile and clocks get enabled and disabled a few times, which
> > >     also causes register access to hang while earlycon is still
> > >     active
> > > Nonetheless we shouldn't keep these clocks running unconditionally just
> > > for earlycon. Clocks should be disabled where possible. If earlycon is
> > > required in the future, e.g. for debug, this commit can simply be
> > > reverted (locally!).
> >
> > That sounds... not ideal. The ability to enable earlycon just by
> > adding some string to bootargs can be very useful for developers.
> > Maybe just make those clocks CLK_IGNORE_UNUSED, if that keeps earlycon
> > functional? With corresponding comments of course.
>
> CLK_IGNORE_UNUSED doesn't help in this case: the i2c and uart drivers will load
> and probe before earlycon gets disabled and, as part of their probing, disable
> the cmu_top ip clock going to cmu_peric0.
>
> If earlycon is not enabled in kernel command line, everything works fine, the
> kernel buffers its messages and once the real serial console driver starts,
> all messages since boot are printed.
>
> Other than keeping it as CLK_IS_CRITICAL, I can see no way to make the
> hand-over from earlycon to the real serial driver work in all cases.
>
> They are not critical clocks for the system, though, so it's wrong to always
> keep them running unconditionally.
>
> We are past a stage where earlycon is generally required.
>
> If it's required for some local development, people can revert this patch locally.
>

That sounds reasonable. But I wonder if that bit (about making this
clock CLK_IS_CRITICAL to make earlycon functional) can be documented
somewhere. Perhaps in the serial driver (earlycon function), or
somewhere in device tree bindings? Because otherwise it might remain
arcane knowledge and people won't be able to use earlycon later.

Anyways, for this patch:

Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>

and if you think it makes sense to document the bit above, please do.

>
> BTW, downstream doesn't suffer from this problem because downstream uses ACG
> throughout and clocks are enabled automatically in hardware as required.
>

Yes, using QCH clocks (HWACG) seems like a correct way to fix this,
and would be nice to have otherwise. Alas, it doesn't seem very easy
to implement, and should probably be based on top of the regular clock
driver anyway. I thought about it for a while, but never came up with
particular ideas on how to implement HWACG support in the Samsung CCF
framework properly.

>
> Cheers,
> Andre'
>
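The FIFO hypothesis quoted above ("earlycon will simply wait forever for space to become available in the FIFO") was explicitly not debugged, so the following is only a user-space illustration of what such a stall would look like, not a description of the real driver. struct toy_uart, TOY_FIFO_DEPTH and the toy_* functions are hypothetical, the FIFO depth is made up, and the max_spins bound exists only so that the sketch terminates where a polled console would keep spinning.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FIFO_DEPTH 4	/* hypothetical depth, for illustration only */

/* Toy UART: the TX FIFO drains only while its ip clock is running. */
struct toy_uart {
	unsigned int fifo_level;
	bool clk_running;
};

/* One step of simulated hardware time. */
static void toy_uart_tick(struct toy_uart *u)
{
	if (u->clk_running && u->fifo_level > 0)
		u->fifo_level--;
}

/*
 * Polled putc: busy-wait while the FIFO is full, then queue one character.
 * Returns false if the FIFO never drained within max_spins polls.
 */
static bool toy_earlycon_putc(struct toy_uart *u, unsigned int max_spins)
{
	assert(u->fifo_level <= TOY_FIFO_DEPTH);
	while (u->fifo_level >= TOY_FIFO_DEPTH) {
		if (max_spins-- == 0)
			return false;	/* without the bound: spins forever */
		toy_uart_tick(u);
	}
	u->fifo_level++;
	return true;
}

/* Write n characters; return how many were accepted before a stall. */
unsigned int toy_earlycon_write(struct toy_uart *u, unsigned int n,
				unsigned int max_spins)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		if (!toy_earlycon_putc(u, max_spins))
			break;
	return i;
}
```

With clk_running set, toy_earlycon_write() accepts every character. With it cleared, the write stops as soon as the FIFO is full, which is the "some output, then a hang" pattern described in the thread.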
diff --git a/drivers/clk/samsung/clk-gs101.c b/drivers/clk/samsung/clk-gs101.c
index 61bb0dcf84ee..5c338ac9231c 100644
--- a/drivers/clk/samsung/clk-gs101.c
+++ b/drivers/clk/samsung/clk-gs101.c
@@ -2982,20 +2982,18 @@ static const struct samsung_gate_clock peric0_gate_clks[] __initconst = {
 	     "gout_peric0_peric0_top0_pclk_9", "mout_peric0_bus_user",
 	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP0_IPCLKPORT_PCLK_9,
 	     21, 0, 0),
-	/* Disabling this clock makes the system hang. Mark the clock as critical. */
 	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_IPCLK_0,
 	     "gout_peric0_peric0_top1_ipclk_0", "dout_peric0_usi0_uart",
 	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_IPCLK_0,
-	     21, CLK_IS_CRITICAL, 0),
+	     21, 0, 0),
 	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_IPCLK_2,
 	     "gout_peric0_peric0_top1_ipclk_2", "dout_peric0_usi14_usi",
 	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_IPCLK_2,
 	     21, 0, 0),
-	/* Disabling this clock makes the system hang. Mark the clock as critical. */
 	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_PCLK_0,
 	     "gout_peric0_peric0_top1_pclk_0", "mout_peric0_bus_user",
 	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_PCLK_0,
-	     21, CLK_IS_CRITICAL, 0),
+	     21, 0, 0),
 	GATE(CLK_GOUT_PERIC0_PERIC0_TOP1_PCLK_2,
 	     "gout_peric0_peric0_top1_pclk_2", "mout_peric0_bus_user",
 	     CLK_CON_GAT_GOUT_BLK_PERIC0_UID_PERIC0_TOP1_IPCLKPORT_PCLK_2,
The peric0_top1_ipclk_0 and peric0_top1_pclk_0 are the clocks going to
peric0/uart_usi, with pclk being the bus clock. Without pclk running,
any bus access will hang.
Unfortunately, in commit d97b6c902a40 ("arm64: dts: exynos: gs101:
update USI UART to use peric0 clocks") the gs101 DT ended up specifying
an incorrect pclk in the respective node and instead the two clocks
here were marked as critical.

We have fixed the gs101 DT and can therefore drop this incorrect
work-around here, the uart driver will claim these clocks as needed.

Note that this commit has the side-effect of causing earlycon to stop
working sometime into the boot for two reasons:
  * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
    running, but because earlycon doesn't deal with clocks that
    parent will be disabled when none of the other drivers that
    actually deal with clocks correctly require it to be running and
    the real serial driver (which does deal with clocks) hasn't taken
    over yet
  * hand-over between earlycon and serial driver appears to be
    fragile and clocks get enabled and disabled a few times, which
    also causes register access to hang while earlycon is still
    active
Nonetheless we shouldn't keep these clocks running unconditionally just
for earlycon. Clocks should be disabled where possible. If earlycon is
required in the future, e.g. for debug, this commit can simply be
reverted (locally!).

Fixes: 893f133a040b ("clk: samsung: gs101: add support for cmu_peric0")
Signed-off-by: André Draszik <andre.draszik@linaro.org>
---
 drivers/clk/samsung/clk-gs101.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)