Message ID | 20241204155806.3781200-7-claudiu.beznea.uj@bp.renesas.com |
---|---|
State | New |
Series | serial: sh-sci: Fixes for earlycon and keep_bootcon |
Hi Claudiu,

On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:
> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>
> In the sh-sci driver, serial ports are mapped to the sci_ports[] array,
> with earlycon mapped at index zero.
>
> The uart_add_one_port() function eventually calls __device_attach(),
> which, in turn, calls pm_request_idle(). The identified code path is as
> follows:
>
> uart_add_one_port() ->
>   serial_ctrl_register_port() ->
>     serial_core_register_port() ->
>       serial_core_port_device_add() ->
>         serial_base_port_add() ->
>           device_add() ->
>             bus_probe_device() ->
>               device_initial_probe() ->
>                 __device_attach() ->
>                   // ...
>                   if (dev->p->dead) {
>                     // ...
>                   } else if (dev->driver) {
>                     // ...
>                   } else {
>                     // ...
>                     pm_request_idle(dev);
>                     // ...
>                   }
>
> The earlycon device clocks are enabled by the bootloader. However, the
> pm_request_idle() call in __device_attach() disables the SCI port clocks
> while earlycon is still active.
>
> The earlycon write function, serial_console_write(), calls
> sci_poll_put_char() via serial_console_putchar(). If the SCI port clocks
> are disabled, writing to earlycon may sometimes cause the SR.TDFE bit to
> remain unset indefinitely, causing the while loop in sci_poll_put_char()
> to never exit. On single-core SoCs, this can result in the system being
> blocked during boot when this issue occurs.
>
> To resolve this, increment the runtime PM usage counter for the earlycon
> SCI device before registering the UART port.
>
> Fixes: 0b0cced19ab1 ("serial: sh-sci: Add CONFIG_SERIAL_EARLYCON support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>

Thanks for your patch!

> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>  		sciport->port.flags |= UPF_HARD_FLOW;
>  	}
>
> +	/*
> +	 * In case:
> +	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
> +	 * - it now maps to an alias other than zero and
> +	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
> +	 *   available in bootargs)
> +	 *
> +	 * we need to avoid disabling clocks and PM domains through the runtime
> +	 * PM APIs called in __device_attach(). For this, increment the runtime
> +	 * PM reference counter (the clocks and PM domains were already enabled
> +	 * by the bootloader). Otherwise the earlycon may access the HW when it
> +	 * has no clocks enabled leading to failures (infinite loop in
> +	 * sci_poll_put_char()).
> +	 */
> +
>  	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {

Now there are two tests for mapbase: here and in sci_probe()...

> +		pm_runtime_get_noresume(&dev->dev);
> +
>  		/*
>  		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
>  		 * port is the same as the earlycon one.

Gr{oetje,eeting}s,

Geert
On 19.12.2024 16:30, Geert Uytterhoeven wrote:
> Hi Claudiu,
>
> On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:
>> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>>
>> [...]
>
> Thanks for your patch!
>
>> --- a/drivers/tty/serial/sh-sci.c
>> +++ b/drivers/tty/serial/sh-sci.c
>> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>>  		sciport->port.flags |= UPF_HARD_FLOW;
>>  	}
>>
>> +	/*
>> +	 * In case:
>> +	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
>> +	 * - it now maps to an alias other than zero and
>> +	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
>> +	 *   available in bootargs)
>> +	 *
>> +	 * we need to avoid disabling clocks and PM domains through the runtime
>> +	 * PM APIs called in __device_attach(). For this, increment the runtime
>> +	 * PM reference counter (the clocks and PM domains were already enabled
>> +	 * by the bootloader). Otherwise the earlycon may access the HW when it
>> +	 * has no clocks enabled leading to failures (infinite loop in
>> +	 * sci_poll_put_char()).
>> +	 */
>> +
>>  	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
>
> Now there are two tests for mapbase: here and in sci_probe()...

I'll adjust it!

Thank you for your review,
Claudiu

>
>> +		pm_runtime_get_noresume(&dev->dev);
>> +
>>  		/*
>>  		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
>>  		 * port is the same as the earlycon one.
>
> Gr{oetje,eeting}s,
>
> Geert
>
Hi, Geert,

On 19.12.2024 16:30, Geert Uytterhoeven wrote:
> Hi Claudiu,
>
> On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:
>> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
>>
>> [...]
>
> Thanks for your patch!
>
>> --- a/drivers/tty/serial/sh-sci.c
>> +++ b/drivers/tty/serial/sh-sci.c
>> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>>  		sciport->port.flags |= UPF_HARD_FLOW;
>>  	}
>>
>> +	/*
>> +	 * In case:
>> +	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
>> +	 * - it now maps to an alias other than zero and
>> +	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
>> +	 *   available in bootargs)
>> +	 *
>> +	 * we need to avoid disabling clocks and PM domains through the runtime
>> +	 * PM APIs called in __device_attach(). For this, increment the runtime
>> +	 * PM reference counter (the clocks and PM domains were already enabled
>> +	 * by the bootloader). Otherwise the earlycon may access the HW when it
>> +	 * has no clocks enabled leading to failures (infinite loop in
>> +	 * sci_poll_put_char()).
>> +	 */
>> +
>>  	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
>
> Now there are two tests for mapbase: here and in sci_probe()...

I'm not sure how we can avoid it. We need to re-check it in this function,
as sci_probe_single() is the one that enables runtime PM. Would you prefer
to move the devm_pm_runtime_enable() call into sci_probe() and have the
pm_runtime_get_noresume() call in sci_probe() as well?

Thank you,
Claudiu

>
>> +		pm_runtime_get_noresume(&dev->dev);
>> +
>>  		/*
>>  		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
>>  		 * port is the same as the earlycon one.
>
> Gr{oetje,eeting}s,
>
> Geert
>
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index f74eb68774ca..6acdc8588d2d 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
 		sciport->port.flags |= UPF_HARD_FLOW;
 	}
 
+	/*
+	 * In case:
+	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
+	 * - it now maps to an alias other than zero and
+	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
+	 *   available in bootargs)
+	 *
+	 * we need to avoid disabling clocks and PM domains through the runtime
+	 * PM APIs called in __device_attach(). For this, increment the runtime
+	 * PM reference counter (the clocks and PM domains were already enabled
+	 * by the bootloader). Otherwise the earlycon may access the HW when it
+	 * has no clocks enabled leading to failures (infinite loop in
+	 * sci_poll_put_char()).
+	 */
+
 	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
+		pm_runtime_get_noresume(&dev->dev);
+
 		/*
 		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
 		 * port is the same as the earlycon one.