[RFT,6/6] serial: sh-sci: Increment the runtime usage counter for the earlycon device

Message ID 20241204155806.3781200-7-claudiu.beznea.uj@bp.renesas.com
State New
Series serial: sh-sci: Fixes for earlycon and keep_bootcon

Commit Message

Claudiu Beznea Dec. 4, 2024, 3:58 p.m. UTC
From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>

In the sh-sci driver, serial ports are mapped to the sci_ports[] array,
with earlycon mapped at index zero.

The uart_add_one_port() function eventually calls __device_attach(),
which, in turn, calls pm_request_idle(). The identified code path is as
follows:

uart_add_one_port() ->
  serial_ctrl_register_port() ->
    serial_core_register_port() ->
      serial_core_port_device_add() ->
        serial_base_port_add() ->
          device_add() ->
            bus_probe_device() ->
              device_initial_probe() ->
                __device_attach() ->
                  // ...
                  if (dev->p->dead) {
                    // ...
                  } else if (dev->driver) {
                    // ...
                  } else {
                    // ...
                    pm_request_idle(dev);
                    // ...
                  }

The earlycon device clocks are enabled by the bootloader. However, the
pm_request_idle() call in __device_attach() disables the SCI port clocks
while earlycon is still active.

The earlycon write function, serial_console_write(), calls
sci_poll_put_char() via serial_console_putchar(). If the SCI port clocks
are disabled, writing to earlycon may sometimes cause the SR.TDFE bit to
remain unset indefinitely, causing the while loop in sci_poll_put_char()
to never exit. On single-core SoCs, this can result in the system being
blocked during boot when this issue occurs.
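
The hang described above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the driver code:
struct fake_sci, fake_read_status(), fake_poll_put_char(), FAKE_TDFE and the
max_spins bound are hypothetical names introduced for illustration. The model
assumes the status bit never sets while the clock is gated, and it bounds
the loop only so that the example terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "transmit FIFO empty" status bit. */
#define FAKE_TDFE (1u << 5)

/* Hypothetical model of an SCI port; not the kernel's struct sci_port. */
struct fake_sci {
	bool clk_enabled;	/* functional clock state */
	unsigned int sent;	/* characters accepted by the modelled hardware */
};

/* Model assumption: with the clock gated, the status bit never sets. */
static uint32_t fake_read_status(const struct fake_sci *p)
{
	assert(p);
	return p->clk_enabled ? FAKE_TDFE : 0;
}

/*
 * Poll for the status bit, then "transmit" one character.
 * max_spins exists only so that this model terminates; returning false
 * here corresponds to a loop that would never exit without the bound.
 */
static bool fake_poll_put_char(struct fake_sci *p, unsigned char c,
			       unsigned int max_spins)
{
	(void)c;	/* the character value is irrelevant to the model */

	while (!(fake_read_status(p) & FAKE_TDFE)) {
		if (max_spins-- == 0)
			return false;
	}

	p->sent++;
	return true;
}
```

With clk_enabled set, fake_poll_put_char() returns true on the first status
read. With it cleared, the call spins until max_spins is exhausted and
returns false.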

To resolve this, increment the runtime PM usage counter for the earlycon
SCI device before registering the UART port.
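
The effect of the fix can be sketched with a toy model of the runtime PM
usage counter. This is an illustration under stated assumptions, not kernel
code: struct fake_dev and the fake_*() helpers are hypothetical, the idle
request is modelled as synchronous, and everything runtime PM checks besides
the usage counter is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device's runtime PM state. */
struct fake_dev {
	int usage_count;	/* runtime PM usage counter */
	bool clk_enabled;	/* clocks left enabled by the bootloader */
};

/* Models pm_runtime_get_noresume(): bump the counter, touch no hardware. */
static void fake_get_noresume(struct fake_dev *d)
{
	assert(d);
	d->usage_count++;
}

/*
 * Models the effect of pm_request_idle() in this toy: the clocks are
 * gated only when nobody holds a usage reference.
 */
static void fake_request_idle(struct fake_dev *d)
{
	assert(d);
	if (d->usage_count == 0)
		d->clk_enabled = false;
}

/*
 * Models the probe path: optionally take the reference first, then
 * register the port, which ends in an idle request.
 */
static void fake_probe(struct fake_dev *d, bool is_earlycon, bool with_fix)
{
	if (is_earlycon && with_fix)
		fake_get_noresume(d);

	fake_request_idle(d);	/* stands in for uart_add_one_port() */
}
```

In this model, probing the earlycon device without the reference leaves
clk_enabled false, while taking the reference first keeps the clocks on and
leaves usage_count at 1.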

Fixes: 0b0cced19ab1 ("serial: sh-sci: Add CONFIG_SERIAL_EARLYCON support")
Cc: stable@vger.kernel.org
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
---
 drivers/tty/serial/sh-sci.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Geert Uytterhoeven Dec. 19, 2024, 2:30 p.m. UTC | #1
Hi Claudiu,

On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:

Thanks for your patch!

> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>                 sciport->port.flags |= UPF_HARD_FLOW;
>         }
>
> +       /*
> +        * In case:
> +        * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
> +        * - it now maps to an alias other than zero and
> +        * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
> +        *   available in bootargs)
> +        *
> +        * we need to avoid disabling clocks and PM domains through the runtime
> +        * PM APIs called in __device_attach(). For this, increment the runtime
> +        * PM reference counter (the clocks and PM domains were already enabled
> +        * by the bootloader). Otherwise the earlycon may access the HW when it
> +        * has no clocks enabled leading to failures (infinite loop in
> +        * sci_poll_put_char()).
> +        */
> +
>         if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {

Now there are two tests for mapbase: here and in sci_probe()...

> +               pm_runtime_get_noresume(&dev->dev);
> +
>                 /*
>                  * Skip cleanup up the sci_port[0] in early_console_exit(), this
>                  * port is the same as the earlycon one.

Gr{oetje,eeting}s,

                        Geert
Claudiu Beznea Dec. 21, 2024, 9:40 a.m. UTC | #2
On 19.12.2024 16:30, Geert Uytterhoeven wrote:
> Hi Claudiu,
> 
> On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:
> 
> Thanks for your patch!
> 
>> --- a/drivers/tty/serial/sh-sci.c
>> +++ b/drivers/tty/serial/sh-sci.c
>> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>>                 sciport->port.flags |= UPF_HARD_FLOW;
>>         }
>>
>> +       /*
>> +        * In case:
>> +        * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
>> +        * - it now maps to an alias other than zero and
>> +        * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
>> +        *   available in bootargs)
>> +        *
>> +        * we need to avoid disabling clocks and PM domains through the runtime
>> +        * PM APIs called in __device_attach(). For this, increment the runtime
>> +        * PM reference counter (the clocks and PM domains were already enabled
>> +        * by the bootloader). Otherwise the earlycon may access the HW when it
>> +        * has no clocks enabled leading to failures (infinite loop in
>> +        * sci_poll_put_char()).
>> +        */
>> +
>>         if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
> 
> Now there are two tests for mapbase: here and in sci_probe()...

I'll adjust it!

Thank you for your review,
Claudiu

Claudiu Beznea Jan. 2, 2025, 5:57 p.m. UTC | #3
Hi, Geert,

On 19.12.2024 16:30, Geert Uytterhoeven wrote:
> Hi Claudiu,
> 
> On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@tuxon.dev> wrote:
> 
> Thanks for your patch!
> 
>> --- a/drivers/tty/serial/sh-sci.c
>> +++ b/drivers/tty/serial/sh-sci.c
>> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>>                 sciport->port.flags |= UPF_HARD_FLOW;
>>         }
>>
>> +       /*
>> +        * In case:
>> +        * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
>> +        * - it now maps to an alias other than zero and
>> +        * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
>> +        *   available in bootargs)
>> +        *
>> +        * we need to avoid disabling clocks and PM domains through the runtime
>> +        * PM APIs called in __device_attach(). For this, increment the runtime
>> +        * PM reference counter (the clocks and PM domains were already enabled
>> +        * by the bootloader). Otherwise the earlycon may access the HW when it
>> +        * has no clocks enabled leading to failures (infinite loop in
>> +        * sci_poll_put_char()).
>> +        */
>> +
>>         if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
> 
> Now there are two tests for mapbase: here and in sci_probe()...

I'm not sure how we can avoid it. We need to re-check it in this function,
as sci_probe_single() is the one that enables runtime PM. Would you
prefer to move devm_pm_runtime_enable() into sci_probe() and call
pm_runtime_get_noresume() in sci_probe() as well?

Thank you,
Claudiu

Patch

diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index f74eb68774ca..6acdc8588d2d 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -3435,7 +3435,24 @@  static int sci_probe_single(struct platform_device *dev,
 		sciport->port.flags |= UPF_HARD_FLOW;
 	}
 
+	/*
+	 * In case:
+	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
+	 * - it now maps to an alias other than zero and
+	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
+	 *   available in bootargs)
+	 *
+	 * we need to avoid disabling clocks and PM domains through the runtime
+	 * PM APIs called in __device_attach(). For this, increment the runtime
+	 * PM reference counter (the clocks and PM domains were already enabled
+	 * by the bootloader). Otherwise the earlycon may access the HW when it
+	 * has no clocks enabled leading to failures (infinite loop in
+	 * sci_poll_put_char()).
+	 */
+
 	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {
+		pm_runtime_get_noresume(&dev->dev);
+
 		/*
 		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
 		 * port is the same as the earlycon one.