diff mbox series

[V2] tty: serial: fsl_lpuart: add timeout for wait_event_interruptible in .shutdown()

Message ID 20211203030441.22873-1-sherry.sun@nxp.com
State New
Headers show
Series [V2] tty: serial: fsl_lpuart: add timeout for wait_event_interruptible in .shutdown() | expand

Commit Message

Sherry Sun Dec. 3, 2021, 3:04 a.m. UTC
Use wait_event_interruptible in lpuart_dma_shutdown isn't a reasonable
behavior, since it may cause the system hang here if the condition
!sport->dma_tx_in_progress never to be true in some corner case, such as
when enable the flow control, the dma tx request may never be completed
due to the peer's CTS setting when run .shutdown().

So here change to use wait_event_interruptible_timeout instead of
wait_event_interruptible, the tx dma will be forcibly terminated if the
tx dma request cannot be completed within 300ms.
Considering the worst tx dma case is to have a 4K bytes tx buffer, which
would require about 300ms to complete when the baudrate is 115200.

Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
---
changes in V2
1. Increase the timeout to 300ms, need to consider the worst tx dma case.
---
 drivers/tty/serial/fsl_lpuart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jiri Slaby Dec. 8, 2021, 7:21 a.m. UTC | #1
Hi,

On 03. 12. 21, 4:04, Sherry Sun wrote:
> Use wait_event_interruptible in lpuart_dma_shutdown isn't a reasonable
> behavior, since it may cause the system hang here if the condition

Wait, _interruptible causes hangs? Under what circumstances?

> !sport->dma_tx_in_progress never to be true in some corner case, such as
> when enable the flow control, the dma tx request may never be completed
> due to the peer's CTS setting when run .shutdown().
> 
> So here change to use wait_event_interruptible_timeout instead of
> wait_event_interruptible, the tx dma will be forcibly terminated if the
> tx dma request cannot be completed within 300ms.
> Considering the worst tx dma case is to have a 4K bytes tx buffer, which
> would require about 300ms to complete when the baudrate is 115200.

300 looks like a magic number -- what if the rate is < 115200? Why not 
using port->timeout?

Anyway, in what scenario is this a problem? Both lpuart*_tx_empty() do:
if (sport->dma_tx_in_progress)
         return 0;

So wait_until_sent() should have waited for long enough already.

> Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
> ---
> changes in V2
> 1. Increase the timeout to 300ms, need to consider the worst tx dma case.
> ---
>   drivers/tty/serial/fsl_lpuart.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
> index ac5112def40d..3affe52a364d 100644
> --- a/drivers/tty/serial/fsl_lpuart.c
> +++ b/drivers/tty/serial/fsl_lpuart.c
> @@ -1793,8 +1793,8 @@ static void lpuart_dma_shutdown(struct lpuart_port *sport)
>   	}
>   
>   	if (sport->lpuart_dma_tx_use) {
> -		if (wait_event_interruptible(sport->dma_wait,
> -			!sport->dma_tx_in_progress) != false) {
> +		if (wait_event_interruptible_timeout(sport->dma_wait,
> +			!sport->dma_tx_in_progress, msecs_to_jiffies(300)) <= 0) {
>   			sport->dma_tx_in_progress = false;
>   			dmaengine_terminate_all(sport->dma_tx_chan);
>   		}
>
Sherry Sun Dec. 8, 2021, 10:21 a.m. UTC | #2
Hi Jiri

> > Use wait_event_interruptible in lpuart_dma_shutdown isn't a reasonable
> > behavior, since it may cause the system hang here if the condition
> 
> Wait, _interruptible causes hangs? Under what circumstances?

To be more precise, the system is not hang here, but keep waiting here.
While we run the suspend tests when the lpuart is transferring data, the last dma tx request may never be completed due to the peer device flow control.
So the suspend process will keep waiting the dma_tx_in_progress flag here.
When we enter the CTRL+C to interrupt this, the system will export the suspend timeout warning.
With this fix patch, the suspend process will waiting here for 300ms if the condition is false, which won't break the system suspend.

root@imx8ulpevk:~# echo mem > /sys/power/state
[  186.814430] PM: suspend entry (deep)
[  186.823943] Filesystems sync: 0.005 seconds
[  186.838361] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  186.847218] OOM killer disabled.
[  186.850551] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  186.928796] fsl-lpuart 29860000.serial: ttyLP2: Unable to drain transmitter


^C[  265.830421] PM: suspend devices took 79.792 seconds
[  265.835417] ------------[ cut here ]------------
[  265.840124] Component: suspend devices, time: 79792
[  265.845104] WARNING: CPU: 0 PID: 450 at kernel/power/suspend_test.c:53 suspend_test_finish+0x90/0xb0
[  265.854326] Modules linked in:
[  265.857465] CPU: 0 PID: 450 Comm: sh Not tainted 5.15.5-07157-g18ae36370434-dirty #310
[  265.865455] Hardware name: NXP i.MX8ULP EVK (DT)
[  265.870146] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  265.877183] pc : suspend_test_finish+0x90/0xb0
[  265.881705] lr : suspend_test_finish+0x90/0xb0
[  265.886227] sp : ffff800012b4bb50
[  265.889617] x29: ffff800012b4bb50 x28: ffff000004428000 x27: 0000000000000000
[  265.896837] x26: 0000000000000000 x25: ffff800011d7f000 x24: ffff000004428000
[  265.904055] x23: 0000000000000000 x22: ffff800011c0fc60 x21: 0000000000000003
[  265.911272] x20: ffff8000116aea78 x19: 00000000000137b0 x18: 0000000000000030
[  265.918490] x17: 0000000000000001 x16: 0000000000000001 x15: ffff000004428458
[  265.925708] x14: 0000000000000000 x13: ffff800011c11e40 x12: 0000000000000570
[  265.932926] x11: 00000000000001d0 x10: ffff800011c69e40 x9 : 00000000fffff000
[  265.940144] x8 : ffff800011c11e40 x7 : ffff800011c69e40 x6 : 0000000000000000
[  265.947362] x5 : 0000000000000000 x4 : ffff000057bbb988 x3 : ffff000057bbe910
[  265.954580] x2 : ffff000057bbb988 x1 : 1083e1c426ed1e00 x0 : 0000000000000000
[  265.961799] Call trace:
[  265.964322]  suspend_test_finish+0x90/0xb0
[  265.968498]  suspend_devices_and_enter+0x110/0x5a0
[  265.973367]  pm_suspend+0x2e4/0x350
[  265.976935]  state_store+0x90/0x114
[  265.980503]  kobj_attr_store+0x1c/0x30
[  265.984334]  sysfs_kf_write+0x48/0x60
[  265.988081]  kernfs_fop_write_iter+0x11c/0x1ac
[  265.992605]  new_sync_write+0xe8/0x180
[  265.996437]  vfs_write+0x230/0x290
[  265.999920]  ksys_write+0x6c/0x100
[  266.003402]  __arm64_sys_write+0x20/0x30
[  266.007406]  invoke_syscall+0x48/0x114
[  266.011238]  el0_svc_common.constprop.0+0xcc/0xec
[  266.016022]  do_el0_svc+0x28/0x90
[  266.019418]  el0_svc+0x20/0x60
[  266.022556]  el0t_64_sync_handler+0x1a8/0x1b0
[  266.026993]  el0t_64_sync+0x1a0/0x1a4
[  266.030736] ---[ end trace 70303e69391f202a ]---
[  266.039880] Disabling non-boot CPUs ...
[  266.045583] psci: CPU1 killed (polled 0 ms)


> 
> > !sport->dma_tx_in_progress never to be true in some corner case, such
> > as when enable the flow control, the dma tx request may never be
> > completed due to the peer's CTS setting when run .shutdown().
> >
> > So here change to use wait_event_interruptible_timeout instead of
> > wait_event_interruptible, the tx dma will be forcibly terminated if
> > the tx dma request cannot be completed within 300ms.
> > Considering the worst tx dma case is to have a 4K bytes tx buffer,
> > which would require about 300ms to complete when the baudrate is
> 115200.
> 
> 300 looks like a magic number -- what if the rate is < 115200? Why not using
> port->timeout?
> 
> Anyway, in what scenario is this a problem? Both lpuart*_tx_empty() do:
> if (sport->dma_tx_in_progress)
>          return 0;
> 
> So wait_until_sent() should have waited for long enough already.

Actually in the  uart_suspend_port() function, it will call ops->tx_empty, but it only waiting for 30ms here, if the transmitter still not empty, it will just ignore it and keep running.

        /*
         * Wait for the transmitter to empty.
         */
        for (tries = 3; !ops->tx_empty(uport) && tries; tries--)
            msleep(10);

Best regards
Sherry

> 
> > Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
> > ---
> > changes in V2
> > 1. Increase the timeout to 300ms, need to consider the worst tx dma case.
> > ---
> >   drivers/tty/serial/fsl_lpuart.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/tty/serial/fsl_lpuart.c
> > b/drivers/tty/serial/fsl_lpuart.c index ac5112def40d..3affe52a364d
> > 100644
> > --- a/drivers/tty/serial/fsl_lpuart.c
> > +++ b/drivers/tty/serial/fsl_lpuart.c
> > @@ -1793,8 +1793,8 @@ static void lpuart_dma_shutdown(struct
> lpuart_port *sport)
> >   	}
> >
> >   	if (sport->lpuart_dma_tx_use) {
> > -		if (wait_event_interruptible(sport->dma_wait,
> > -			!sport->dma_tx_in_progress) != false) {
> > +		if (wait_event_interruptible_timeout(sport->dma_wait,
> > +			!sport->dma_tx_in_progress, msecs_to_jiffies(300))
> <= 0) {
> >   			sport->dma_tx_in_progress = false;
> >   			dmaengine_terminate_all(sport->dma_tx_chan);
> >   		}
> >
> 
> 
> --
> js
> suse labs
diff mbox series

Patch

diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index ac5112def40d..3affe52a364d 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -1793,8 +1793,8 @@  static void lpuart_dma_shutdown(struct lpuart_port *sport)
 	}
 
 	if (sport->lpuart_dma_tx_use) {
-		if (wait_event_interruptible(sport->dma_wait,
-			!sport->dma_tx_in_progress) != false) {
+		if (wait_event_interruptible_timeout(sport->dma_wait,
+			!sport->dma_tx_in_progress, msecs_to_jiffies(300)) <= 0) {
 			sport->dma_tx_in_progress = false;
 			dmaengine_terminate_all(sport->dma_tx_chan);
 		}