diff mbox series

serial: pxa: Disable TX interrupt if there is no more data to transmit

Message ID 20240519193109.122466-1-doug@schmorgal.com
State New
Headers show
Series serial: pxa: Disable TX interrupt if there is no more data to transmit | expand

Commit Message

Doug Brown May 19, 2024, 7:31 p.m. UTC
If a TX interrupt occurs and no new data gets loaded into the TX FIFO,
the UART will never fire another TX interrupt until the UART_IER_THRI
flag is toggled off and back on. If nothing ever calls stop_tx(), this
effectively results in transmissions getting hung up until another
unrelated UART IRQ occurs, such as an RX interrupt.

The driver used to do this correctly until the transition to
uart_port_tx_limited(). This didn't matter until the behavior of
__uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop
transmit when HW fifo is empty").

Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()")
Signed-off-by: Doug Brown <doug@schmorgal.com>
---

Note: I based this on v6.9 instead of tty-next since it's a fix; please
let me know if that was the wrong move and I would be happy to resubmit
it based on tty-next. The patch changes ever so slightly because of the
circ_buf -> kfifo transition. The only difference is it needs this
condition in the "if" statement instead:

kfifo_is_empty(&up->port.state->port.xmit_fifo)

This has been tested to work properly on tty-next as well.

 drivers/tty/serial/pxa.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Jiri Slaby May 20, 2024, 6:41 a.m. UTC | #1
On 19. 05. 24, 21:31, Doug Brown wrote:
> If a TX interrupt occurs and no new data gets loaded into the TX FIFO,
> the UART will never fire another TX interrupt until the UART_IER_THRI
> flag is toggled off and back on. If nothing ever calls stop_tx(), this
> effectively results in transmissions getting hung up until another
> unrelated UART IRQ occurs, such as an RX interrupt.
> 
> The driver used to do this correctly until the transition to
> uart_port_tx_limited(). This didn't matter until the behavior of
> __uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop
> transmit when HW fifo is empty").
> 
> Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()")
> Signed-off-by: Doug Brown <doug@schmorgal.com>
> ---
> 
> Note: I based this on v6.9 instead of tty-next since it's a fix; please
> let me know if that was the wrong move and I would be happy to resubmit
> it based on tty-next. The patch changes ever so slightly because of the
> circ_buf -> kfifo transition. The only difference is it needs this
> condition in the "if" statement instead:
> 
> kfifo_is_empty(&up->port.state->port.xmit_fifo)
> 
> This has been tested to work properly on tty-next as well.
> 
>   drivers/tty/serial/pxa.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c
> index e395ff29c1a2..8abb85bee87c 100644
> --- a/drivers/tty/serial/pxa.c
> +++ b/drivers/tty/serial/pxa.c
> @@ -176,6 +176,14 @@ static void transmit_chars(struct uart_pxa_port *up)
>   {
>   	u8 ch;
>   
> +	/* If there is nothing left to transmit, disable the TX interrupt.
> +	 * Otherwise we can get stuck waiting for another IRQ that never happens.
> +	 */
> +	if (uart_circ_empty(&up->port.state->xmit)) {
> +		serial_pxa_stop_tx(&up->port);
> +		return;
> +	}

This does not make sense. If the circ buf is empty, 
uart_port_tx_limited() should stop the TX already. You simply revert to 
the state before 7bfb915a597a, but on a per-driver basis.

IOW all drivers using the helper would have exactly this issue after 
7bfb915a597a.

What driver was 7bfb915a597a about after all? The commit log of the 
commit is hopeless (it's very vague) in this respect:
commit 7bfb915a597a301abb892f620fe5c283a9fdbd77
Author: Jonas Gorski <jonas.gorski@gmail.com>
Date:   Sun Mar 3 16:08:07 2024 +0100

     serial: core: only stop transmit when HW fifo is empty

     If the circular buffer is empty, it just means we fit all characters to
     send into the HW fifo, but not that the hardware finished transmitting
     them.

     So if we immediately call stop_tx() after that, this may abort any
     pending characters in the HW fifo, and cause dropped characters on the
     console.

     Fix this by only stopping tx when the tx HW fifo is actually empty.

     Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")



(And it barely fixes 8275b48b2780 per se. 8275b48b2780 only moved the 
processing to a single place, most/all of the drivers already did it 
that way.)

/me confused.

Perhaps, it should be reverted and the affected driver should just pass 
UART_TX_NOSTOP instead?

>   	uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
>   		true,
>   		serial_out(up, UART_TX, ch),

thanks,
Jonas Gorski May 20, 2024, 9:30 a.m. UTC | #2
Hi

On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote:
>
> On 19. 05. 24, 21:31, Doug Brown wrote:
> > If a TX interrupt occurs and no new data gets loaded into the TX FIFO,
> > the UART will never fire another TX interrupt until the UART_IER_THRI
> > flag is toggled off and back on. If nothing ever calls stop_tx(), this
> > effectively results in transmissions getting hung up until another
> > unrelated UART IRQ occurs, such as an RX interrupt.
> >
> > The driver used to do this correctly until the transition to
> > uart_port_tx_limited(). This didn't matter until the behavior of
> > __uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop
> > transmit when HW fifo is empty").
> >
> > Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()")
> > Signed-off-by: Doug Brown <doug@schmorgal.com>
> > ---
> >
> > Note: I based this on v6.9 instead of tty-next since it's a fix; please
> > let me know if that was the wrong move and I would be happy to resubmit
> > it based on tty-next. The patch changes ever so slightly because of the
> > circ_buf -> kfifo transition. The only difference is it needs this
> > condition in the "if" statement instead:
> >
> > kfifo_is_empty(&up->port.state->port.xmit_fifo)
> >
> > This has been tested to work properly on tty-next as well.
> >
> >   drivers/tty/serial/pxa.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c
> > index e395ff29c1a2..8abb85bee87c 100644
> > --- a/drivers/tty/serial/pxa.c
> > +++ b/drivers/tty/serial/pxa.c
> > @@ -176,6 +176,14 @@ static void transmit_chars(struct uart_pxa_port *up)
> >   {
> >       u8 ch;
> >
> > +     /* If there is nothing left to transmit, disable the TX interrupt.
> > +      * Otherwise we can get stuck waiting for another IRQ that never happens.
> > +      */
> > +     if (uart_circ_empty(&up->port.state->xmit)) {
> > +             serial_pxa_stop_tx(&up->port);
> > +             return;
> > +     }
>
> This does not make sense. If the circ buf is empty,
> uart_port_tx_limited() should stop the TX already. You simply revert to
> the state before 7bfb915a597a, but on a per-driver basis.
>
> IOW all drivers using the helper would have exactly this issue after
> 7bfb915a597a.
>
> What driver was 7bfb915a597a about after all? The commit log of the
> commit is hopeless (it's very vague) in this respect:
> commit 7bfb915a597a301abb892f620fe5c283a9fdbd77
> Author: Jonas Gorski <jonas.gorski@gmail.com>
> Date:   Sun Mar 3 16:08:07 2024 +0100
>
>      serial: core: only stop transmit when HW fifo is empty
>
>      If the circular buffer is empty, it just means we fit all characters to
>      send into the HW fifo, but not that the hardware finished transmitting
>      them.
>
>      So if we immediately call stop_tx() after that, this may abort any
>      pending characters in the HW fifo, and cause dropped characters on the
>      console.
>
>      Fix this by only stopping tx when the tx HW fifo is actually empty.
>
>      Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
>
>
>
> (And it barely fixes 8275b48b2780 per se. 8275b48b2780 only moved the
> processing to a single place, most/all of the drivers already did it
> that way.)
>
> /me confused.
>
> Perhaps, it should be reverted and the affected driver should just pass
> UART_TX_NOSTOP instead?
>
> >       uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
> >               true,
> >               serial_out(up, UART_TX, ch),
>
> thanks,

Sorry, I should have added that this is for bcm63xx_uart.c. My
reasoning for doing it this way and not via a flag was:

1. The kernel doc for uart_ops::stop_tx() says "The driver should stop
transmitting characters as soon as possible."
2. bcm63xx_uart.c's stop_tx() does exactly that, by immediately
stopping tx (leading to potentially lost output).
3. Therefore uart_port_tx_limited() is wrong by calling stop_tx()
while the transmitter is still busy, and the core issue is this.

So adding the check for the hw fifo being empty seemed to me as the
"right" fix, fixing it for all drivers that abort tx in stop_tx().

Best Regards,
Jonas
Doug Brown May 21, 2024, 4:58 a.m. UTC | #3
Hi Jonas and Jiri,

On 5/20/2024 2:30 AM, Jonas Gorski wrote:
> Hi
> 
> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote:
>>
>> This does not make sense. If the circ buf is empty,
>> uart_port_tx_limited() should stop the TX already. You simply revert to
>> the state before 7bfb915a597a, but on a per-driver basis.
>>
>> IOW all drivers using the helper would have exactly this issue after
>> 7bfb915a597a.

I tested more, and it appears you are correct, Jiri. I think
7bfb915a597a has broken other drivers too.

For some background on this, after upgrading to 6.9 I observed that
transmits would stop working on my PXA168-based device with the pxa
driver until I caused another UART IRQ to fire, for example by typing a
character in the console. I bisected the problem to 7bfb915a597a.

I don't have a bunch of hardware to test with, but I cobbled together a
way to emulate a BeagleBoard in an old Linaro QEMU branch, and the omap-
serial driver also appears to be affected. Shortly after boot, the UART
just completely hangs and all transmits stop working. I can't even type
a character to recover it like I can with the pxa driver. Reverting
7bfb915a597a fixes the issue.

I think the reason the pxa driver recovers when I type a character is
because it always checks the THRE bit of the LSR in the IRQ handler,
even if the IIR doesn't say there's a TX interrupt ready. (This appears
to be a workaround for an erratum in the PXA serial IP...)

>> Perhaps, it should be reverted and the affected driver should just pass
>> UART_TX_NOSTOP instead?
>>
>>>        uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
>>>                true,
>>>                serial_out(up, UART_TX, ch),

I can't speak for whether that would fix the issue Jonas originally saw,
but it does look like it needs to be reverted because other drivers are
also broken due to this change.

> 1. The kernel doc for uart_ops::stop_tx() says "The driver should stop
> transmitting characters as soon as possible."

Very interesting. I guess the correct behavior depends on the exact
interpretation of "stop transmitting characters as soon as possible".
:-) It looks like, at least for the existing 16550-esque drivers that
are using the helper, they are treating stop_tx() more as a signal to
disable the TX interrupt, but still allow any characters already in the
FIFO to go out.

All I know is, in the PXA UART, I was observing the TX IRQ firing when
we had no data to load in the FIFO, and we weren't disabling the
interrupt. This causes a 16550A-style UART to get hung up because the TX
interrupt never fires again unless you disable it and re-enable it.

Doug
Doug Brown June 2, 2024, 5:24 p.m. UTC | #4
Hi Jonas,

>> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote:
>>>
>>> Perhaps, it should be reverted and the affected driver should just pass
>>> UART_TX_NOSTOP instead?
>>>
>>>>        uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
>>>>                true,
>>>>                serial_out(up, UART_TX, ch),
Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted 
and UART_TX_NOSTOP is used instead?

I was able to further confirm on hardware (BeagleBoard) that omap-serial 
is also broken after 7bfb915a597a. I'm trying to figure out how to 
safely revert it while also allowing bcm63xx to continue to work properly.

Thanks,
Doug
Jonas Gorski June 4, 2024, 1:04 p.m. UTC | #5
Hi,

On Sun, 2 Jun 2024 at 19:24, Doug Brown <doug@schmorgal.com> wrote:
>
> Hi Jonas,
>
> >> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote:
> >>>
> >>> Perhaps, it should be reverted and the affected driver should just pass
> >>> UART_TX_NOSTOP instead?
> >>>
> >>>>        uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
> >>>>                true,
> >>>>                serial_out(up, UART_TX, ch),
> Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted
> and UART_TX_NOSTOP is used instead?
>
> I was able to further confirm on hardware (BeagleBoard) that omap-serial
> is also broken after 7bfb915a597a. I'm trying to figure out how to
> safely revert it while also allowing bcm63xx to continue to work properly.

So I went through all users of uart_port_tx(_limited()), and
expectedly most do just disable the IRQ (or do nothing) in their
stop_tx() call back.

The notable exceptions I could identify are:

- bcm63xx_uart: disables the emitter as well
- mxs-uart: disables the emitter (introduced and uses UART_TX_NOSTOP)
- sprd_serial: stops alls dma transfers as well
- atmel_serial: stops dma and disables emitter as well

I suspect sprd_serial and atmel_serial to not work properly with the
old behavior, but I have no devices where I could test this.

There are a few more exceptions in those that do not use
uart_port_tx(_limited()), like I've seen drivers blocking in stop_tx()
until the FIFO is empty. Not sure if that is expected.

The quick solutions for bcm63xx_uart to either pass UART_TX_NOSTOP
[1], or make stop_tx() (and start_tx()) not touch the emitter (or
revert the uart_port_tx_limited() usage).

Regardless of the chosen quick fix, it feels like stop_tx() is the
wrong thing to use for uart_port_tx(), and just happened to do the
right thing for the majority or drivers. But the correct function
(something along "done submitting tx") also doesn't exist.

But I'm no maintainer for tty/serial, nor do I have much (intimate)
knowledge there. This is just from looking at it from the outside, and
my thoughts may very well be wrong.

[1] https://lore.kernel.org/lkml/20240225151426.1342285-1-jonas.gorski@gmail.com/

Best Regards,
Jonas
Doug Brown June 6, 2024, 6:25 p.m. UTC | #6
Hi Jonas,

On 6/4/2024 6:04 AM, Jonas Gorski wrote:
> Hi,
> 
> On Sun, 2 Jun 2024 at 19:24, Doug Brown <doug@schmorgal.com> wrote:
>>
>> Hi Jonas,
>>
>>>> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote:
>>>>>
>>>>> Perhaps, it should be reverted and the affected driver should just pass
>>>>> UART_TX_NOSTOP instead?
>>>>>
>>>>>>         uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
>>>>>>                 true,
>>>>>>                 serial_out(up, UART_TX, ch),
>> Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted
>> and UART_TX_NOSTOP is used instead?
>>
>> I was able to further confirm on hardware (BeagleBoard) that omap-serial
>> is also broken after 7bfb915a597a. I'm trying to figure out how to
>> safely revert it while also allowing bcm63xx to continue to work properly.
> 
> So I went through all users of uart_port_tx(_limited()), and
> expectedly most do just disable the IRQ (or do nothing) in their
> stop_tx() call back.
> 
> The notable exceptions I could identify are:
> 
> - bcm63xx_uart: disables the emitter as well
> - mxs-uart: disables the emitter (introduced and uses UART_TX_NOSTOP)
> - sprd_serial: stops alls dma transfers as well
> - atmel_serial: stops dma and disables emitter as well
> 
> I suspect sprd_serial and atmel_serial to not work properly with the
> old behavior, but I have no devices where I could test this.

I see what you mean. Unfortunately I also don't have the ability to test
either of those drivers. It looks like atmel_serial didn't previously
have a stop_tx() anywhere in the code that was refactored to use
uart_port_tx(), so I'm assuming it should also have UART_TX_NOSTOP
added. I'm not confident enough to submit a patch for it though.

sprd_serial prior to the uart_port_tx_limited() refactor was calling
stop_tx() when uart_circ_empty() was true -- in fact it was almost
identical to the pxa driver's code -- so I feel like any bug in there
related to stopping too early would have predated any of these changes.

> Regardless of the chosen quick fix, it feels like stop_tx() is the
> wrong thing to use for uart_port_tx(), and just happened to do the
> right thing for the majority or drivers. But the correct function
> (something along "done submitting tx") also doesn't exist.

That makes complete sense to me.

> But I'm no maintainer for tty/serial, nor do I have much (intimate)
> knowledge there. This is just from looking at it from the outside, and
> my thoughts may very well be wrong.

I'm in the same boat. I wouldn't feel comfortable submitting a large fix
for something like that. There are so many hardware platforms that are
using this code, it seems very intimidating to work on. Thanks for the
analysis, and thanks for linking your previous patch! I'll work on
submitting a quick fix for this at least. I think if we go back to the
approach your earlier patchset used, it will fix the issue I'm seeing on
omap-serial and pxa while also working properly on bcm63xx_uart.

Doug
diff mbox series

Patch

diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c
index e395ff29c1a2..8abb85bee87c 100644
--- a/drivers/tty/serial/pxa.c
+++ b/drivers/tty/serial/pxa.c
@@ -176,6 +176,14 @@  static void transmit_chars(struct uart_pxa_port *up)
 {
 	u8 ch;
 
+	/* If there is nothing left to transmit, disable the TX interrupt.
+	 * Otherwise we can get stuck waiting for another IRQ that never happens.
+	 */
+	if (uart_circ_empty(&up->port.state->xmit)) {
+		serial_pxa_stop_tx(&up->port);
+		return;
+	}
+
 	uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2,
 		true,
 		serial_out(up, UART_TX, ch),