diff mbox series

serial: qcom-geni: Show '@' characters if we have a FIFO underrun

Message ID 20240709162841.1.I93bf39f29d1887c46c74fbf8d4b937f6497cdfaa@changeid
State New
Headers show
Series serial: qcom-geni: Show '@' characters if we have a FIFO underrun | expand

Commit Message

Doug Anderson July 9, 2024, 11:28 p.m. UTC
As of commit 2ac33975abda ("serial: qcom-geni: do not kill the machine
on fifo underrun") a FIFO underrun will no longer hard lockup the
machine. Instead, a FIFO underrun will cause the UART to output a
bunch of '\0' characters. The '\0' characters don't seem to show up on
most terminal programs and this hides the fact that we had an
underrun. An underrun is aq sign of problems in the driver and
should be obvious / debugged.

Change the driver to put '@' characters in the case of an underrun
which should make it much more obvious.

Adding this extra initialization doesn't add any real overhead. In
fact, this patch reduces code size because the code was calling
memset() to init 4 bytes of data. Disassembling the new code shows
that early in the function w22 is setup to hold the '@@@@' constant:
  mov     w22, #0x40404040

Each time through the loop w22 is simply stored:
  str     w22, [sp, #4]

Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/tty/serial/qcom_geni_serial.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Doug Anderson July 10, 2024, 4:01 p.m. UTC | #1
Hi,

On Tue, Jul 9, 2024 at 10:35 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Jul 09, 2024 at 04:28:45PM -0700, Douglas Anderson wrote:
> > As of commit 2ac33975abda ("serial: qcom-geni: do not kill the machine
> > on fifo underrun") a FIFO underrun will no longer hard lockup the
> > machine. Instead, a FIFO underrun will cause the UART to output a
> > bunch of '\0' characters. The '\0' characters don't seem to show up on
> > most terminal programs and this hides the fact that we had an
> > underrun. An underrun is aq sign of problems in the driver and
> > should be obvious / debugged.
> >
> > Change the driver to put '@' characters in the case of an underrun
> > which should make it much more obvious.
> >
> > Adding this extra initialization doesn't add any real overhead. In
> > fact, this patch reduces code size because the code was calling
> > memset() to init 4 bytes of data. Disassembling the new code shows
> > that early in the function w22 is setup to hold the '@@@@' constant:
> >   mov     w22, #0x40404040
> >
> > Each time through the loop w22 is simply stored:
> >   str     w22, [sp, #4]
> >
> > Cc: Johan Hovold <johan@kernel.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> >  drivers/tty/serial/qcom_geni_serial.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> > index 69a632fefc41..332eaa2faa2b 100644
> > --- a/drivers/tty/serial/qcom_geni_serial.c
> > +++ b/drivers/tty/serial/qcom_geni_serial.c
> > @@ -872,10 +872,10 @@ static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
> >  {
> >       struct qcom_geni_serial_port *port = to_dev_port(uport);
> >       unsigned int tx_bytes, remaining = chunk;
> > -     u8 buf[BYTES_PER_FIFO_WORD];
> >
> >       while (remaining) {
> > -             memset(buf, 0, sizeof(buf));
> > +             u8 buf[BYTES_PER_FIFO_WORD] = { '@', '@', '@', '@' };
>
> Why is '@' a valid character for an underrun?  Why would any characters
> be ok?  Where is this now documented?

'@' is arbitrary. If you have a different character suggestion then
I'm happy to change it. I'm mostly looking for something other than
'\0' to be printed out in the case of underruns, which is what happens
now. Printing out '\0' is much harder to notice but could still end up
causing problems with file transfers / automated programs trying to
work with serial data.

NOTE: this is not an underrun in the sense that we didn't get an
interrupt fast enough or that we couldn't keep up. This is an underrun
that could only happen as a result of a bug in the driver. The driver
should not ever get into this situation.

I have no idea where one would document something like this. Do you
have any suggestions? I'm happy to add a comment in the code saying
that the '@' should never show up but if it does then that's a driver
bug.


> And shouldn't you use a memset to get the BYTES_PER_FIFO_WORD amount of
> '@' here?

That feels like overkill and results in much less efficient code, but
I can change it to that if you want. Let me know. In this case
BYTES_PER_FIFO_WORD is more of a constant to add to readability than
something that we'd expect to change. Technically, the hardware can
handle BYTES_PER_FIFO_WORD of 1/2/4, but if we changed that there
would be a bunch of places in the driver that would need to change. In
this case if someone wanted to try to change it to something other
than 4 at least we'd get a compile error.


-Doug
Greg Kroah-Hartman July 10, 2024, 5:28 p.m. UTC | #2
On Wed, Jul 10, 2024 at 09:01:10AM -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Jul 9, 2024 at 10:35 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Jul 09, 2024 at 04:28:45PM -0700, Douglas Anderson wrote:
> > > As of commit 2ac33975abda ("serial: qcom-geni: do not kill the machine
> > > on fifo underrun") a FIFO underrun will no longer hard lockup the
> > > machine. Instead, a FIFO underrun will cause the UART to output a
> > > bunch of '\0' characters. The '\0' characters don't seem to show up on
> > > most terminal programs and this hides the fact that we had an
> > > underrun. An underrun is aq sign of problems in the driver and
> > > should be obvious / debugged.
> > >
> > > Change the driver to put '@' characters in the case of an underrun
> > > which should make it much more obvious.
> > >
> > > Adding this extra initialization doesn't add any real overhead. In
> > > fact, this patch reduces code size because the code was calling
> > > memset() to init 4 bytes of data. Disassembling the new code shows
> > > that early in the function w22 is setup to hold the '@@@@' constant:
> > >   mov     w22, #0x40404040
> > >
> > > Each time through the loop w22 is simply stored:
> > >   str     w22, [sp, #4]
> > >
> > > Cc: Johan Hovold <johan@kernel.org>
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > ---
> > >
> > >  drivers/tty/serial/qcom_geni_serial.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> > > index 69a632fefc41..332eaa2faa2b 100644
> > > --- a/drivers/tty/serial/qcom_geni_serial.c
> > > +++ b/drivers/tty/serial/qcom_geni_serial.c
> > > @@ -872,10 +872,10 @@ static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
> > >  {
> > >       struct qcom_geni_serial_port *port = to_dev_port(uport);
> > >       unsigned int tx_bytes, remaining = chunk;
> > > -     u8 buf[BYTES_PER_FIFO_WORD];
> > >
> > >       while (remaining) {
> > > -             memset(buf, 0, sizeof(buf));
> > > +             u8 buf[BYTES_PER_FIFO_WORD] = { '@', '@', '@', '@' };
> >
> > Why is '@' a valid character for an underrun?  Why would any characters
> > be ok?  Where is this now documented?
> 
> '@' is arbitrary. If you have a different character suggestion then
> I'm happy to change it. I'm mostly looking for something other than
> '\0' to be printed out in the case of underruns, which is what happens
> now. Printing out '\0' is much harder to notice but could still end up
> causing problems with file transfers / automated programs trying to
> work with serial data.

Any character is "wrong", so picking this one feels odd.

Do we know when an underrun happens?  If so, handle that error.  If not,
well, something else is really wrong with this uart then (why are people
making new uarts still...)

> NOTE: this is not an underrun in the sense that we didn't get an
> interrupt fast enough or that we couldn't keep up. This is an underrun
> that could only happen as a result of a bug in the driver. The driver
> should not ever get into this situation.

But obviously it has otherwise you wouldn't have seen it, right?

> I have no idea where one would document something like this. Do you
> have any suggestions? I'm happy to add a comment in the code saying
> that the '@' should never show up but if it does then that's a driver
> bug.
> 
> 
> > And shouldn't you use a memset to get the BYTES_PER_FIFO_WORD amount of
> > '@' here?
> 
> That feels like overkill and results in much less efficient code, but
> I can change it to that if you want.

Are you sure it will?  Constant memcopies should be really fast by the
compiler these days.

> Let me know. In this case
> BYTES_PER_FIFO_WORD is more of a constant to add to readability than
> something that we'd expect to change. Technically, the hardware can
> handle BYTES_PER_FIFO_WORD of 1/2/4, but if we changed that there
> would be a bunch of places in the driver that would need to change. In
> this case if someone wanted to try to change it to something other
> than 4 at least we'd get a compile error.

So this is a define that can never be changed?

Anyway, I don't know what to suggest here.  If this is something that
can happen, handle it properly.  If it can't, then why are we having
this discussion?  :)

thanks,

greg k-h
Doug Anderson July 10, 2024, 5:47 p.m. UTC | #3
Hi,

On Wed, Jul 10, 2024 at 10:28 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jul 10, 2024 at 09:01:10AM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Jul 9, 2024 at 10:35 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Tue, Jul 09, 2024 at 04:28:45PM -0700, Douglas Anderson wrote:
> > > > As of commit 2ac33975abda ("serial: qcom-geni: do not kill the machine
> > > > on fifo underrun") a FIFO underrun will no longer hard lockup the
> > > > machine. Instead, a FIFO underrun will cause the UART to output a
> > > > bunch of '\0' characters. The '\0' characters don't seem to show up on
> > > > most terminal programs and this hides the fact that we had an
> > > > underrun. An underrun is aq sign of problems in the driver and
> > > > should be obvious / debugged.
> > > >
> > > > Change the driver to put '@' characters in the case of an underrun
> > > > which should make it much more obvious.
> > > >
> > > > Adding this extra initialization doesn't add any real overhead. In
> > > > fact, this patch reduces code size because the code was calling
> > > > memset() to init 4 bytes of data. Disassembling the new code shows
> > > > that early in the function w22 is setup to hold the '@@@@' constant:
> > > >   mov     w22, #0x40404040
> > > >
> > > > Each time through the loop w22 is simply stored:
> > > >   str     w22, [sp, #4]
> > > >
> > > > Cc: Johan Hovold <johan@kernel.org>
> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > > ---
> > > >
> > > >  drivers/tty/serial/qcom_geni_serial.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> > > > index 69a632fefc41..332eaa2faa2b 100644
> > > > --- a/drivers/tty/serial/qcom_geni_serial.c
> > > > +++ b/drivers/tty/serial/qcom_geni_serial.c
> > > > @@ -872,10 +872,10 @@ static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
> > > >  {
> > > >       struct qcom_geni_serial_port *port = to_dev_port(uport);
> > > >       unsigned int tx_bytes, remaining = chunk;
> > > > -     u8 buf[BYTES_PER_FIFO_WORD];
> > > >
> > > >       while (remaining) {
> > > > -             memset(buf, 0, sizeof(buf));
> > > > +             u8 buf[BYTES_PER_FIFO_WORD] = { '@', '@', '@', '@' };
> > >
> > > Why is '@' a valid character for an underrun?  Why would any characters
> > > be ok?  Where is this now documented?
> >
> > '@' is arbitrary. If you have a different character suggestion then
> > I'm happy to change it. I'm mostly looking for something other than
> > '\0' to be printed out in the case of underruns, which is what happens
> > now. Printing out '\0' is much harder to notice but could still end up
> > causing problems with file transfers / automated programs trying to
> > work with serial data.
>
> Any character is "wrong", so picking this one feels odd.
>
> Do we know when an underrun happens?  If so, handle that error.  If not,
> well, something else is really wrong with this uart then

It no longer happens. Johan's recent patches fixed it. Quick history:

1. Pre-kfifo, we used to output stale characters (ones that had been
dropped) in the FIFO underrun case. Nobody noticed for years.

2. After kfifo we got a hard lockup.

3. Johan's early patches to fix the hard lockup caused us to output
'\0' characters upon FIFO underrun. It was not obvious that the '\0'
characters were being output. To make it easier to debug / see, both
he and I locally made it output some other character which was more
obvious.

4. Johan fixed the FIFO underrun.

5. Johan added a patch such that if we ever get another FIFO underrun
in the future we'll output '\0' characters in the FIFO instead of
getting a hard lockup.

If we're really confident that we can't get a FIFO underun we could
just revert commit 2ac33975abda ("serial: qcom-geni: do not kill the
machine on fifo underrun") and we'll get a hard lockup if we ever
underrun. IMO, though, it's better to output _something_ in this case
to make it more obvious. If you hate this patch, though, fine. Let's
drop it and we'll hope that either we never introduce a bug causing a
FIFO underrun in the future or that someone notices the '\0'
characters.


> (why are people
> making new uarts still...)

Yeah, tell me about it. :( ...and doing a bad job of it, too...


> > NOTE: this is not an underrun in the sense that we didn't get an
> > interrupt fast enough or that we couldn't keep up. This is an underrun
> > that could only happen as a result of a bug in the driver. The driver
> > should not ever get into this situation.
>
> But obviously it has otherwise you wouldn't have seen it, right?

Right, because there was a bug.


> > I have no idea where one would document something like this. Do you
> > have any suggestions? I'm happy to add a comment in the code saying
> > that the '@' should never show up but if it does then that's a driver
> > bug.
> >
> >
> > > And shouldn't you use a memset to get the BYTES_PER_FIFO_WORD amount of
> > > '@' here?
> >
> > That feels like overkill and results in much less efficient code, but
> > I can change it to that if you want.
>
> Are you sure it will?  Constant memcopies should be really fast by the
> compiler these days.

Fair. If we care to keep this patch we could just change the memset to:

  memset(buf, '@', sizeof(buf));


-Doug
Greg Kroah-Hartman July 10, 2024, 7:03 p.m. UTC | #4
On Wed, Jul 10, 2024 at 10:47:16AM -0700, Doug Anderson wrote:
> Hi,
> 
> On Wed, Jul 10, 2024 at 10:28 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jul 10, 2024 at 09:01:10AM -0700, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Tue, Jul 9, 2024 at 10:35 PM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Tue, Jul 09, 2024 at 04:28:45PM -0700, Douglas Anderson wrote:
> > > > > As of commit 2ac33975abda ("serial: qcom-geni: do not kill the machine
> > > > > on fifo underrun") a FIFO underrun will no longer hard lockup the
> > > > > machine. Instead, a FIFO underrun will cause the UART to output a
> > > > > bunch of '\0' characters. The '\0' characters don't seem to show up on
> > > > > most terminal programs and this hides the fact that we had an
> > > > > underrun. An underrun is aq sign of problems in the driver and
> > > > > should be obvious / debugged.
> > > > >
> > > > > Change the driver to put '@' characters in the case of an underrun
> > > > > which should make it much more obvious.
> > > > >
> > > > > Adding this extra initialization doesn't add any real overhead. In
> > > > > fact, this patch reduces code size because the code was calling
> > > > > memset() to init 4 bytes of data. Disassembling the new code shows
> > > > > that early in the function w22 is setup to hold the '@@@@' constant:
> > > > >   mov     w22, #0x40404040
> > > > >
> > > > > Each time through the loop w22 is simply stored:
> > > > >   str     w22, [sp, #4]
> > > > >
> > > > > Cc: Johan Hovold <johan@kernel.org>
> > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > > > ---
> > > > >
> > > > >  drivers/tty/serial/qcom_geni_serial.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> > > > > index 69a632fefc41..332eaa2faa2b 100644
> > > > > --- a/drivers/tty/serial/qcom_geni_serial.c
> > > > > +++ b/drivers/tty/serial/qcom_geni_serial.c
> > > > > @@ -872,10 +872,10 @@ static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
> > > > >  {
> > > > >       struct qcom_geni_serial_port *port = to_dev_port(uport);
> > > > >       unsigned int tx_bytes, remaining = chunk;
> > > > > -     u8 buf[BYTES_PER_FIFO_WORD];
> > > > >
> > > > >       while (remaining) {
> > > > > -             memset(buf, 0, sizeof(buf));
> > > > > +             u8 buf[BYTES_PER_FIFO_WORD] = { '@', '@', '@', '@' };
> > > >
> > > > Why is '@' a valid character for an underrun?  Why would any characters
> > > > be ok?  Where is this now documented?
> > >
> > > '@' is arbitrary. If you have a different character suggestion then
> > > I'm happy to change it. I'm mostly looking for something other than
> > > '\0' to be printed out in the case of underruns, which is what happens
> > > now. Printing out '\0' is much harder to notice but could still end up
> > > causing problems with file transfers / automated programs trying to
> > > work with serial data.
> >
> > Any character is "wrong", so picking this one feels odd.
> >
> > Do we know when an underrun happens?  If so, handle that error.  If not,
> > well, something else is really wrong with this uart then
> 
> It no longer happens. Johan's recent patches fixed it. Quick history:
> 
> 1. Pre-kfifo, we used to output stale characters (ones that had been
> dropped) in the FIFO underrun case. Nobody noticed for years.
> 
> 2. After kfifo we got a hard lockup.
> 
> 3. Johan's early patches to fix the hard lockup caused us to output
> '\0' characters upon FIFO underrun. It was not obvious that the '\0'
> characters were being output. To make it easier to debug / see, both
> he and I locally made it output some other character which was more
> obvious.
> 
> 4. Johan fixed the FIFO underrun.
> 
> 5. Johan added a patch such that if we ever get another FIFO underrun
> in the future we'll output '\0' characters in the FIFO instead of
> getting a hard lockup.
> 
> If we're really confident that we can't get a FIFO underun we could
> just revert commit 2ac33975abda ("serial: qcom-geni: do not kill the
> machine on fifo underrun") and we'll get a hard lockup if we ever
> underrun. IMO, though, it's better to output _something_ in this case
> to make it more obvious. If you hate this patch, though, fine. Let's
> drop it and we'll hope that either we never introduce a bug causing a
> FIFO underrun in the future or that someone notices the '\0'
> characters.

Let's just drop this one, if \0 are seen, that's a good enough character
as any to send when something bad happens.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 69a632fefc41..332eaa2faa2b 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -872,10 +872,10 @@  static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
 {
 	struct qcom_geni_serial_port *port = to_dev_port(uport);
 	unsigned int tx_bytes, remaining = chunk;
-	u8 buf[BYTES_PER_FIFO_WORD];
 
 	while (remaining) {
-		memset(buf, 0, sizeof(buf));
+		u8 buf[BYTES_PER_FIFO_WORD] = { '@', '@', '@', '@' };
+
 		tx_bytes = min(remaining, BYTES_PER_FIFO_WORD);
 
 		uart_fifo_out(uport, buf, tx_bytes);