Message ID | 20221123110759.1836666-1-brgl@bgdev.pl |
---|---|
Headers | show |
Series | serial: qcom-geni-serial: implement support for SE DMA | expand |
On Thu, Nov 24, 2022 at 8:18 AM Jiri Slaby <jirislaby@kernel.org> wrote: > > On 23. 11. 22, 12:07, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > qcom_geni_serial_handle_tx() is pretty big, let's move the code that > > handles the actual writing of data to a separate function which makes > > sense in preparation for introducing a dma variant of handle_tx(). > > > > Let's also shuffle the code a bit, drop unneeded variables and use > > uart_xmit_advance() instead of handling tail->xmit manually. > > > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > --- > > drivers/tty/serial/qcom_geni_serial.c | 54 +++++++++++++-------------- > > 1 file changed, 27 insertions(+), 27 deletions(-) > > > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > > index 68a1402fbe58..658b6d596f58 100644 > > --- a/drivers/tty/serial/qcom_geni_serial.c > > +++ b/drivers/tty/serial/qcom_geni_serial.c > > @@ -704,19 +704,42 @@ static void qcom_geni_serial_start_rx(struct uart_port *uport) > > writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN); > > } > > I know you just shuffle the code, but: > > > +static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport, > > + unsigned int chunk) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + struct circ_buf *xmit = &uport->state->xmit; > > + u8 buf[BYTES_PER_FIFO_WORD]; > > + size_t remaining = chunk; > > Why size_t when the others are uints? Well, BYTES_PER_FIFO_WORD should > be defined as 4U. Good point. > > > + unsigned int tx_bytes; > > + int c; > > + > > + while (remaining) { > > + memset(buf, 0, sizeof(buf)); > > + tx_bytes = min_t(size_t, remaining, BYTES_PER_FIFO_WORD); > > Then, no need for min_t. > Same. > > + > > + for (c = 0; c < tx_bytes ; c++) { > > + buf[c] = xmit->buf[xmit->tail]; > > + uart_xmit_advance(uport, 1); > > + } > > + > > + iowrite32_rep(uport->membase + SE_GENI_TX_FIFOn, buf, 1); > > I wonder, why is _rep variant used to transfer a single word? Only to > hide the cast? > Even if - using writel() with a cast doesn't seem to improve the performance and this one looks prettier IMO. Bartosz
Thanks Bartosz for the patch, On 23/11/2022 11:07, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > The qcom-geni-serial driver currently only works in SE FIFO mode. This > limits the UART speed to around 180 kB/s. In order to achieve higher > speeds we need to use SE DMA mode. > > Keep the console port working in FIFO mode but extend the code to use DMA > for the high-speed port. > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > --- > drivers/tty/serial/qcom_geni_serial.c | 289 ++++++++++++++++++++++---- > 1 file changed, 247 insertions(+), 42 deletions(-) > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > index 82242a40a95a..0f4ba909c792 100644 > --- a/drivers/tty/serial/qcom_geni_serial.c > +++ b/drivers/tty/serial/qcom_geni_serial.c > @@ -70,6 +70,8 @@ > #define UART_START_TX 0x1 > /* UART S_CMD OP codes */ > #define UART_START_READ 0x1 > +#define UART_PARAM 0x1 > +#define UART_PARAM_RFR_OPEN BIT(7) > > #define UART_OVERSAMPLING 32 > #define STALE_TIMEOUT 16 > @@ -95,9 +97,11 @@ > /* We always configure 4 bytes per FIFO word */ > #define BYTES_PER_FIFO_WORD 4 > > +#define DMA_RX_BUF_SIZE 2048 > + > struct qcom_geni_device_data { > bool console; > - void (*handle_rx)(struct uart_port *uport, u32 bytes, bool drop); > + enum geni_se_xfer_mode mode; > }; > > struct qcom_geni_private_data { > @@ -118,9 +122,11 @@ struct qcom_geni_serial_port { > u32 tx_fifo_depth; > u32 tx_fifo_width; > u32 rx_fifo_depth; > + dma_addr_t tx_dma_addr; > + dma_addr_t rx_dma_addr; > bool setup; > unsigned int baud; > - void *rx_fifo; > + void *rx_buf; > u32 loopback; > bool brk; > > @@ -552,18 +558,11 @@ static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop) > > static void handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop) > { > - struct tty_port *tport; > struct qcom_geni_serial_port *port = to_dev_port(uport); > - u32 num_bytes_pw = port->tx_fifo_width / BITS_PER_BYTE; > - u32 words = ALIGN(bytes, num_bytes_pw) / num_bytes_pw; > + struct tty_port *tport = &uport->state->port; > int ret; > > - tport = &uport->state->port; > - ioread32_rep(uport->membase + SE_GENI_RX_FIFOn, port->rx_fifo, words); > - if (drop) > - return; > - Are we removing FIFO support for uart? It it not a overhead to use dma for small transfers that are less than fifo size? > - ret = tty_insert_flip_string(tport, port->rx_fifo, bytes); > + ret = tty_insert_flip_string(tport, port->rx_buf, bytes); > if (ret != bytes) { > dev_err(uport->dev, "%s:Unable to push data ret %d_bytes %d\n", > __func__, ret, bytes); > @@ -578,7 +577,70 @@ static unsigned int qcom_geni_serial_tx_empty(struct uart_port *uport) > return !readl(uport->membase + SE_GENI_TX_FIFO_STATUS); > } > > -static void qcom_geni_serial_start_tx(struct uart_port *uport) > +static void qcom_geni_serial_stop_tx_dma(struct uart_port *uport) > +{ > + struct qcom_geni_serial_port *port = to_dev_port(uport); > + bool done; --> > + u32 status; ... > + > + status = readl(uport->membase + SE_GENI_STATUS); > + if (!(status & M_GENI_CMD_ACTIVE)) > + return; <--- this code snippet is repeating more than few times in the patches, looks like it could be made to a inline helper. > + > + if (port->rx_dma_addr) { > + geni_se_tx_dma_unprep(&port->se, port->tx_dma_addr, > + port->tx_remaining); > + port->tx_dma_addr = (dma_addr_t)NULL; > + port->tx_remaining = 0; > + } > + > + m_irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN); > + writel(m_irq_en, uport->membase + SE_GENI_M_IRQ_EN); > + geni_se_cancel_m_cmd(&port->se); > + > + done = qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS, > + S_CMD_CANCEL_EN, true); > + if (!done) { > + geni_se_abort_m_cmd(&port->se); > + qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS, > + M_CMD_ABORT_EN, true); return is not checked, there are few more such instances in the patch. > + writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR); > + } > + > + writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR); > +} > + > +static void qcom_geni_serial_start_tx_dma(struct uart_port *uport) > +{ > + struct qcom_geni_serial_port *port = to_dev_port(uport); > + struct circ_buf *xmit = &uport->state->xmit; > + unsigned int xmit_size; > + int ret; > + > + if (port->tx_dma_addr) > + return; Is this condition actually possible? > + > + xmit_size = uart_circ_chars_pending(xmit); > + if (xmit_size < WAKEUP_CHARS) > + uart_write_wakeup(uport); > + > + xmit_size = CIRC_CNT_TO_END(xmit->head, xmit->tail, UART_XMIT_SIZE); > + > + qcom_geni_serial_setup_tx(uport, xmit_size); > + > + ret = geni_se_tx_dma_prep(&port->se, &xmit->buf[xmit->tail], > + xmit_size, &port->tx_dma_addr); > + if (ret) { > + dev_err(uport->dev, "unable to start TX SE DMA: %d\n", ret); > + qcom_geni_serial_stop_tx_dma(uport); > + return; > + } > + > + port->tx_remaining = xmit_size; > +} > + ...
On Fri, 25 Nov 2022 at 15:37, Srinivas Kandagatla <srinivas.kandagatla@linaro.org> wrote: > <snip> > > > > @@ -552,18 +558,11 @@ static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop) > > > > static void handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop) > > { > > - struct tty_port *tport; > > struct qcom_geni_serial_port *port = to_dev_port(uport); > > - u32 num_bytes_pw = port->tx_fifo_width / BITS_PER_BYTE; > > - u32 words = ALIGN(bytes, num_bytes_pw) / num_bytes_pw; > > + struct tty_port *tport = &uport->state->port; > > int ret; > > > > - tport = &uport->state->port; > > - ioread32_rep(uport->membase + SE_GENI_RX_FIFOn, port->rx_fifo, words); > > - if (drop) > > - return; > > - > > Are we removing FIFO support for uart? > > It it not a overhead to use dma for small transfers that are less than > fifo size? > You made me think about it but no - while I haven't measured it yet, I don't think it's worth the code complexity behind it. The i2c driver doesn't do it this way for short transfers either. If you insist I can test it and come forward with numbers but unless we encounter a real-life example of the need for lots of very short reads/writes in short succession, I don't think we should overcomplicate this. > > > - ret = tty_insert_flip_string(tport, port->rx_fifo, bytes); > > + ret = tty_insert_flip_string(tport, port->rx_buf, bytes); > > if (ret != bytes) { > > dev_err(uport->dev, "%s:Unable to push data ret %d_bytes %d\n", > > __func__, ret, bytes); > > @@ -578,7 +577,70 @@ static unsigned int qcom_geni_serial_tx_empty(struct uart_port *uport) > > return !readl(uport->membase + SE_GENI_TX_FIFO_STATUS); > > } > > > > -static void qcom_geni_serial_start_tx(struct uart_port *uport) > > +static void qcom_geni_serial_stop_tx_dma(struct uart_port *uport) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + bool done; > > --> > > + u32 status; > ... > > + > > + status = readl(uport->membase + SE_GENI_STATUS); > > + if (!(status & M_GENI_CMD_ACTIVE)) > > + return; > <--- > > this code snippet is repeating more than few times in the patches, looks > like it could be made to a inline helper. > Makes sense. > > > + > > + if (port->rx_dma_addr) { > > + geni_se_tx_dma_unprep(&port->se, port->tx_dma_addr, > > + port->tx_remaining); > > + port->tx_dma_addr = (dma_addr_t)NULL; > > + port->tx_remaining = 0; > > + } > > + > > + m_irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN); > > + writel(m_irq_en, uport->membase + SE_GENI_M_IRQ_EN); > > + geni_se_cancel_m_cmd(&port->se); > > + > > + done = qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS, > > + S_CMD_CANCEL_EN, true); > > + if (!done) { > > + geni_se_abort_m_cmd(&port->se); > > + qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS, > > + M_CMD_ABORT_EN, true); > > return is not checked, there are few more such instances in the patch. > Yes, but this is an error path. What would I do if the cancel failed to go through and then the abort failed as well? I can at best emit an error message. > > + writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR); > > + } > > + > > + writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR); > > +} > > + > > +static void qcom_geni_serial_start_tx_dma(struct uart_port *uport) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + struct circ_buf *xmit = &uport->state->xmit; > > + unsigned int xmit_size; > > + int ret; > > + > > + if (port->tx_dma_addr) > > + return; > Is this condition actually possible? > It should never happen but I wanted to be sure. Maybe a BUG_ON() or WARN_ON() would be better? > > > + > > + xmit_size = uart_circ_chars_pending(xmit); > > + if (xmit_size < WAKEUP_CHARS) > > + uart_write_wakeup(uport); > > + > > + xmit_size = CIRC_CNT_TO_END(xmit->head, xmit->tail, UART_XMIT_SIZE); > > + > > + qcom_geni_serial_setup_tx(uport, xmit_size); > > + > > + ret = geni_se_tx_dma_prep(&port->se, &xmit->buf[xmit->tail], > > + xmit_size, &port->tx_dma_addr); > > + if (ret) { > > + dev_err(uport->dev, "unable to start TX SE DMA: %d\n", ret); > > + qcom_geni_serial_stop_tx_dma(uport); > > + return; > > + } > > + > > + port->tx_remaining = xmit_size; > > +} > > + > > ... Bart
On Fri, Nov 25, 2022 at 3:37 PM Srinivas Kandagatla <srinivas.kandagatla@linaro.org> wrote: > [snip] > > + > > +static void qcom_geni_serial_start_tx_dma(struct uart_port *uport) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + struct circ_buf *xmit = &uport->state->xmit; > > + unsigned int xmit_size; > > + int ret; > > + > > + if (port->tx_dma_addr) > > + return; > Is this condition actually possible? > So it turns out, it's possible that the subsystem calls start_tx when this is already set but the main engine is not yet in active state (so we can't simply test that bit). This needs to stay then. Bart
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> The goal of this series is to update the qcom-geni-serial driver to use the DMA mode of the QUPv3 serial engine. This is accomplished by the last patch in the series. The previous ones contain either various tweaks, reworks and refactoring or prepare the driver for adding DMA support. More work will follow on the serial engine in order to reduce code redundancy among its users and add support for SE DMA to the qcom GENI SPI driver. v2 -> v3: - drop devres patches from the series v1 -> v2: - turn to_dev_uport() macro into a static inline function - use CIRC_CNT_TO_END() and uart_xmit_advance() where applicable and don't handle xmit->tail directly - drop sizeof() where BYTES_PER_FIFO_WORD can be used - further refactor qcom_geni_serial_handle_tx_fifo() - collect review tags Bartosz Golaszewski (13): tty: serial: qcom-geni-serial: drop unneeded forward definitions tty: serial: qcom-geni-serial: remove unused symbols tty: serial: qcom-geni-serial: align #define values tty: serial: qcom-geni-serial: improve the to_dev_port() macro tty: serial: qcom-geni-serial: remove stray newlines tty: serial: qcom-geni-serial: refactor qcom_geni_serial_isr() tty: serial: qcom-geni-serial: remove unneeded tabs tty: serial: qcom-geni-serial: refactor qcom_geni_serial_handle_tx() tty: serial: qcom-geni-serial: drop the return value from handle_rx tty: serial: qcom-geni-serial: use of_device_id data tty: serial: qcom-geni-serial: stop operations in progress at shutdown soc: qcom-geni-se: add more symbol definitions tty: serial: qcom-geni-serial: add support for serial engine DMA drivers/tty/serial/qcom_geni_serial.c | 606 +++++++++++++++++--------- include/linux/qcom-geni-se.h | 3 + 2 files changed, 409 insertions(+), 200 deletions(-)