Message ID | 20211222112831.1968392-2-wander@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [v3,1/1] tty: serial: Use fifo in 8250 console driver | expand |
On 2021-12-22 08:28:30 [-0300], Wander Lairson Costa wrote: > Note: I am using a small test app + driver located at [0] for the > problem description. serco is a driver whose write function dispatches > to the serial controller. sertest is a user-mode app that writes n bytes > to the serial console using the serco driver. > > While investigating a bug in the RHEL kernel, I noticed that the serial > console throughput is way below the configured speed of 115200 bps in > a HP Proliant DL380 Gen9. I was expecting something above 10KB/s, but > I got 2.5KB/s. > > $ time ./sertest -n 2500 /tmp/serco > > real 0m0.997s > user 0m0.000s > sys 0m0.997s > > With the help of the function tracer, I then noticed the serial > controller was taking around 410us seconds to dispatch one single byte: was this the HW access or did this include the wait-for-fifo empty? > $ trace-cmd record -p function_graph -g serial8250_console_write \ > ./sertest -n 1 /tmp/serco > > $ trace-cmd report > > | serial8250_console_write() { > 0.384 us | _raw_spin_lock_irqsave(); > 1.836 us | io_serial_in(); > 1.667 us | io_serial_out(); > | uart_console_write() { > | serial8250_console_putchar() { > | wait_for_xmitr() { > 1.870 us | io_serial_in(); > 2.238 us | } > 1.737 us | io_serial_out(); > 4.318 us | } > 4.675 us | } > | wait_for_xmitr() { > 1.635 us | io_serial_in(); > | __const_udelay() { > 1.125 us | delay_tsc(); > 1.429 us | } > ... > ... > ... > 1.683 us | io_serial_in(); > | __const_udelay() { > 1.248 us | delay_tsc(); > 1.486 us | } > 1.671 us | io_serial_in(); > 411.342 us | } So this includes waiting for empty slot. It is wait_for_xmitr() only. > In another machine, I measured a throughput of 11.5KB/s, with the serial > controller taking between 80-90us to send each byte. That matches the > expected throughput for a configuration of 115200 bps. > > This patch changes the serial8250_console_write to use the 16550 fifo > if available. In my benchmarks I got around 25% improvement in the slow > machine, and no performance penalty in the fast machine. Either the HW is slow on starting to work, or… What I noticed now in -rc1 is this output during boot: |[ 6.370196] ACPI: button: Power Button [PWRF] |[ 6.443501] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled |[0I 15 | [0I 15 | [No | [ld | [a2 | [a20tm | [a2nct | [s |[s |[s |[s … |[sk65, | [rt | [Pe | [a | [ 6.873611] ata1: SATA link down (SStatus 0 SControl 300) |[ 6.879680] ata3: SATA link down (SStatus 0 SControl 300) The kernel buffer reports here: |[ 6.370196] ACPI: button: Power Button [PWRF] |[ 6.443501] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled |[ 6.450643] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A |[ 6.451625] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A |[ 6.453808] Non-volatile memory driver v1.3 |[ 6.475688] loop: module loaded |[ 6.476401] ahci 0000:00:1f.2: version 3.0 |[ 6.487238] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode I did remove the last few lines but it appears that since the initialisation of the port some of the lines got lost. Do you see the same? > Signed-off-by: Wander Lairson Costa <wander@redhat.com> Sebastian
On Tue, Jan 25, 2022 at 10:07 AM Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > > On 2021-12-22 08:28:30 [-0300], Wander Lairson Costa wrote: > > Note: I am using a small test app + driver located at [0] for the > > problem description. serco is a driver whose write function dispatches > > to the serial controller. sertest is a user-mode app that writes n bytes > > to the serial console using the serco driver. > > > > While investigating a bug in the RHEL kernel, I noticed that the serial > > console throughput is way below the configured speed of 115200 bps in > > a HP Proliant DL380 Gen9. I was expecting something above 10KB/s, but > > I got 2.5KB/s. > > > > $ time ./sertest -n 2500 /tmp/serco > > > > real 0m0.997s > > user 0m0.000s > > sys 0m0.997s > > > > With the help of the function tracer, I then noticed the serial > > controller was taking around 410us seconds to dispatch one single byte: > > was this the HW access or did this include the wait-for-fifo empty? > > > $ trace-cmd record -p function_graph -g serial8250_console_write \ > > ./sertest -n 1 /tmp/serco > > > > $ trace-cmd report > > > > | serial8250_console_write() { > > 0.384 us | _raw_spin_lock_irqsave(); > > 1.836 us | io_serial_in(); > > 1.667 us | io_serial_out(); > > | uart_console_write() { > > | serial8250_console_putchar() { > > | wait_for_xmitr() { > > 1.870 us | io_serial_in(); > > 2.238 us | } > > 1.737 us | io_serial_out(); > > 4.318 us | } > > 4.675 us | } > > | wait_for_xmitr() { > > 1.635 us | io_serial_in(); > > | __const_udelay() { > > 1.125 us | delay_tsc(); > > 1.429 us | } > > ... > > ... > > ... > > 1.683 us | io_serial_in(); > > | __const_udelay() { > > 1.248 us | delay_tsc(); > > 1.486 us | } > > 1.671 us | io_serial_in(); > > 411.342 us | } > > So this includes waiting for empty slot. It is wait_for_xmitr() only. > > > In another machine, I measured a throughput of 11.5KB/s, with the serial > > controller taking between 80-90us to send each byte. That matches the > > expected throughput for a configuration of 115200 bps. > > > > This patch changes the serial8250_console_write to use the 16550 fifo > > if available. In my benchmarks I got around 25% improvement in the slow > > machine, and no performance penalty in the fast machine. > > Either the HW is slow on starting to work, or… > > What I noticed now in -rc1 is this output during boot: > > |[ 6.370196] ACPI: button: Power Button [PWRF] > |[ 6.443501] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > |[0I 15 > | [0I 15 > | [No > | [ld > | [a2 > | [a20tm > | [a2nct > | [s > |[s > |[s > |[s > … > |[sk65, > | [rt > | [Pe > | [a > | [ 6.873611] ata1: SATA link down (SStatus 0 SControl 300) > |[ 6.879680] ata3: SATA link down (SStatus 0 SControl 300) > > The kernel buffer reports here: > > |[ 6.370196] ACPI: button: Power Button [PWRF] > |[ 6.443501] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled > |[ 6.450643] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A > |[ 6.451625] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A > |[ 6.453808] Non-volatile memory driver v1.3 > |[ 6.475688] loop: module loaded > |[ 6.476401] ahci 0000:00:1f.2: version 3.0 > |[ 6.487238] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode > > I did remove the last few lines but it appears that since the > initialisation of the port some of the lines got lost. > > Do you see the same? There is another thread[1] reporting some issues with this patch. There, this diff seems to fix the problems, could you please try and report if it works for you too? diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c index 2abb3de11a48..d3a93e5d55f7 100644 --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c @@ -3343,7 +3343,7 @@ static void serial8250_console_fifo_write(struct uart_8250_port *up, { int i; const char *end = s + count; - unsigned int fifosize = up->port.fifosize; + unsigned int fifosize = up->tx_loadsz; bool cr_sent = false; while (s != end) { @@ -3409,8 +3409,8 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, } use_fifo = (up->capabilities & UART_CAP_FIFO) && - port->fifosize > 1 && - (serial_port_in(port, UART_FCR) & UART_FCR_ENABLE_FIFO) && + up->tx_loadsz > 1 && + (up->fcr & UART_FCR_ENABLE_FIFO) && /* * After we put a data in the fifo, the controller will send * it regardless of the CTS state. Therefore, only use fifo [1] https://lore.kernel.org/all/fa42a60c-954a-acc0-3962-f00427153f78@nvidia.com/ > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > Sebastian >
On 2022-01-25 10:15:04 [-0300], Wander Costa wrote: > There is another thread[1] reporting some issues with this patch. > There, this diff seems to fix the problems, could you please try and > report if it works for you too? Nope. Still there. Sebastian
On Tue, Jan 25, 2022 at 11:11 AM Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > > On 2022-01-25 10:15:04 [-0300], Wander Costa wrote: > > There is another thread[1] reporting some issues with this patch. > > There, this diff seems to fix the problems, could you please try and > > report if it works for you too? > > Nope. Still there. I will check on my side if I see something similar. Do you still get lines lost as well?
On 2022-01-25 11:26:03 [-0300], Wander Costa wrote: > I will check on my side if I see something similar. Do you still get > lines lost as well? Yes. The output is garbage for a while and then it fixes itself. Setting "use_fifo = 0" works. Thanks. Sebastian
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c index 46e2079ad1aa..5805f18520dd 100644 --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c @@ -2056,10 +2056,7 @@ static void serial8250_break_ctl(struct uart_port *port, int break_state) serial8250_rpm_put(up); } -/* - * Wait for transmitter & holding register to empty - */ -static void wait_for_xmitr(struct uart_8250_port *up, int bits) +static void wait_for_lsr(struct uart_8250_port *up, int bits) { unsigned int status, tmout = 10000; @@ -2076,6 +2073,16 @@ static void wait_for_xmitr(struct uart_8250_port *up, int bits) udelay(1); touch_nmi_watchdog(); } +} + +/* + * Wait for transmitter & holding register to empty + */ +static void wait_for_xmitr(struct uart_8250_port *up, int bits) +{ + unsigned int tmout; + + wait_for_lsr(up, bits); /* Wait up to 1s for flow control if necessary */ if (up->port.flags & UPF_CONS_FLOW) { @@ -3325,6 +3332,35 @@ static void serial8250_console_restore(struct uart_8250_port *up) serial8250_out_MCR(up, UART_MCR_DTR | UART_MCR_RTS); } +/* + * Print a string to the serial port using the device FIFO + * + * It sends fifosize bytes and then waits for the fifo + * to get empty. + */ +static void serial8250_console_fifo_write(struct uart_8250_port *up, + const char *s, unsigned int count) +{ + int i; + const char *end = s + count; + unsigned int fifosize = up->port.fifosize; + bool cr_sent = false; + + while (s != end) { + wait_for_lsr(up, UART_LSR_THRE); + + for (i = 0; i < fifosize && s != end; ++i) { + if (*s == '\n' && !cr_sent) { + serial_out(up, UART_TX, '\r'); + cr_sent = true; + } else { + serial_out(up, UART_TX, *s++); + cr_sent = false; + } + } + } +} + /* * Print a string to the serial port trying not to disturb * any possible real use of the port... @@ -3340,7 +3376,7 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, struct uart_8250_em485 *em485 = up->em485; struct uart_port *port = &up->port; unsigned long flags; - unsigned int ier; + unsigned int ier, use_fifo; int locked = 1; touch_nmi_watchdog(); @@ -3372,7 +3408,20 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, mdelay(port->rs485.delay_rts_before_send); } - uart_console_write(port, s, count, serial8250_console_putchar); + use_fifo = (up->capabilities & UART_CAP_FIFO) && + port->fifosize > 1 && + (serial_port_in(port, UART_FCR) & UART_FCR_ENABLE_FIFO) && + /* + * After we put a data in the fifo, the controller will send + * it regardless of the CTS state. Therefore, only use fifo + * if we don't use control flow. + */ + !(up->port.flags & UPF_CONS_FLOW); + + if (likely(use_fifo)) + serial8250_console_fifo_write(up, s, count); + else + uart_console_write(port, s, count, serial8250_console_putchar); /* * Finally, wait for transmitter to become empty
Note: I am using a small test app + driver located at [0] for the problem description. serco is a driver whose write function dispatches to the serial controller. sertest is a user-mode app that writes n bytes to the serial console using the serco driver. While investigating a bug in the RHEL kernel, I noticed that the serial console throughput is way below the configured speed of 115200 bps in a HP Proliant DL380 Gen9. I was expecting something above 10KB/s, but I got 2.5KB/s. $ time ./sertest -n 2500 /tmp/serco real 0m0.997s user 0m0.000s sys 0m0.997s With the help of the function tracer, I then noticed the serial controller was taking around 410us seconds to dispatch one single byte: $ trace-cmd record -p function_graph -g serial8250_console_write \ ./sertest -n 1 /tmp/serco $ trace-cmd report | serial8250_console_write() { 0.384 us | _raw_spin_lock_irqsave(); 1.836 us | io_serial_in(); 1.667 us | io_serial_out(); | uart_console_write() { | serial8250_console_putchar() { | wait_for_xmitr() { 1.870 us | io_serial_in(); 2.238 us | } 1.737 us | io_serial_out(); 4.318 us | } 4.675 us | } | wait_for_xmitr() { 1.635 us | io_serial_in(); | __const_udelay() { 1.125 us | delay_tsc(); 1.429 us | } ... ... ... 1.683 us | io_serial_in(); | __const_udelay() { 1.248 us | delay_tsc(); 1.486 us | } 1.671 us | io_serial_in(); 411.342 us | } In another machine, I measured a throughput of 11.5KB/s, with the serial controller taking between 80-90us to send each byte. That matches the expected throughput for a configuration of 115200 bps. This patch changes the serial8250_console_write to use the 16550 fifo if available. In my benchmarks I got around 25% improvement in the slow machine, and no performance penalty in the fast machine. Signed-off-by: Wander Lairson Costa <wander@redhat.com> --- drivers/tty/serial/8250/8250_port.c | 61 ++++++++++++++++++++++++++--- 1 file changed, 55 insertions(+), 6 deletions(-)