[0/2] dmaengine: dw: Fix sys freeze and XFER-bit set error for UARTs

Message ID 20240911184710.4207-1-fancer.lancer@gmail.com

Message

Serge Semin Sept. 11, 2024, 6:46 p.m. UTC
The main goal of the series is to make the DW DMAC driver work better
with the serial 8250 device driver implementation. In particular, it was
discovered that a random system freeze (caused by a deadlock) and an
occasional "BUG: XFER bit set, but channel not idle" error printed to
the log occur when the DW APB UART interface is used in conjunction
with the DW DMA controller. I guess the problem can show up on any 8250
device using the DW DMAC to execute the Tx/Rx transfers, though. This
short series contains two patches fixing these bugs. Please see the
respective patch logs for details.

Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
Changelog RFC:
- Add a new patch:
  [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
  fixing the "XFER bit set, but channel not idle" error.
- Instead of just dropping the dwc_scan_descriptors() method invocation
  calculate the residue in the Tx-status getter.

base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Cc: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: dmaengine@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (2):
  dmaengine: dw: Prevent tx-status calling DMA-desc callback
  dmaengine: dw: Fix XFER bit set, but channel not idle error

 drivers/dma/dw/core.c | 144 ++++++++++++++++++++++--------------------
 1 file changed, 75 insertions(+), 69 deletions(-)
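
The second changelog item (calculating the residue in the Tx-status
getter instead of scanning the descriptors there) can be pictured with a
toy model. This is only a sketch under stated assumptions, not the code
from drivers/dma/dw/core.c: every name below (toy_desc, toy_tx_status(),
toy_complete()) is hypothetical. The assumption it illustrates is that a
status query should be a read-only operation, so a client calling it
while holding its own lock can never re-enter its completion callback.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in drivers/dma/dw/core.c. */
enum toy_status { TOY_IN_PROGRESS, TOY_COMPLETE };

struct toy_desc {
	size_t total_len;		/* bytes requested for the transfer */
	size_t done_len;		/* bytes moved by the "hardware" so far */
	void (*callback)(void *arg);	/* client completion callback */
	void *callback_arg;
	int completed;			/* set once the callback has been run */
};

/* Status getter: reports the residue and never runs the callback. */
static enum toy_status toy_tx_status(const struct toy_desc *d, size_t *residue)
{
	assert(d->done_len <= d->total_len);
	*residue = d->total_len - d->done_len;
	return d->completed ? TOY_COMPLETE : TOY_IN_PROGRESS;
}

/* Completion path (think IRQ/tasklet context): the only place callbacks run. */
static void toy_complete(struct toy_desc *d)
{
	if (d->completed || d->done_len < d->total_len)
		return;
	d->completed = 1;
	if (d->callback)
		d->callback(d->callback_arg);
}

/* Example callback: counts how many times it has been invoked. */
static void toy_count_cb(void *arg)
{
	++*(int *)arg;
}
```

In this model a caller may poll toy_tx_status() as often as it likes;
the callback counter only changes once toy_complete() is run from the
separate completion path.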

Comments

Andy Shevchenko Sept. 16, 2024, 1:01 p.m. UTC | #1
On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> The main goal of the series is to make the DW DMAC driver work better
> with the serial 8250 device driver implementation. In particular, it was
> discovered that a random system freeze (caused by a deadlock) and an
> occasional "BUG: XFER bit set, but channel not idle" error printed to
> the log occur when the DW APB UART interface is used in conjunction
> with the DW DMA controller. I guess the problem can show up on any 8250
> device using the DW DMAC to execute the Tx/Rx transfers, though. This
> short series contains two patches fixing these bugs. Please see the
> respective patch logs for details.
> 
> Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> Changelog RFC:
> - Add a new patch:
>   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
>   fixing the "XFER bit set, but channel not idle" error.
> - Instead of just dropping the dwc_scan_descriptors() method invocation
>   calculate the residue in the Tx-status getter.

FWIW, this series does not regress on Intel Merrifield (SPI case),
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

P.S.
However, it might need additional tests on the DW UART based platforms.
Cc'ed Hans just in case (it might be that he can add this to his repo for
testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
operations).

Serge Semin Sept. 20, 2024, 9:33 a.m. UTC | #2
Hi Andy

On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > The main goal of the series is to make the DW DMAC driver work better
> > with the serial 8250 device driver implementation. In particular, it was
> > discovered that a random system freeze (caused by a deadlock) and an
> > occasional "BUG: XFER bit set, but channel not idle" error printed to
> > the log occur when the DW APB UART interface is used in conjunction
> > with the DW DMA controller. I guess the problem can show up on any 8250
> > device using the DW DMAC to execute the Tx/Rx transfers, though. This
> > short series contains two patches fixing these bugs. Please see the
> > respective patch logs for details.
> > 
> > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > Changelog RFC:
> > - Add a new patch:
> >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> >   fixing the "XFER bit set, but channel not idle" error.
> > - Instead of just dropping the dwc_scan_descriptors() method invocation
> >   calculate the residue in the Tx-status getter.
> 

> FWIW, this series does not regress on Intel Merrifield (SPI case),
> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 

Great! Thanks.

> P.S.
> However, it might need additional tests on the DW UART based platforms.
> Cc'ed Hans just in case (it might be that he can add this to his repo for
> testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
> operations).

It's not enough though. The DW UART controller must be connected to
the DW DMAC handshaking interface on the platform, and the kernel must
be properly set up for that too. Only then would the test be done on
a proper target. Do the Bay Trail and Cherry Trail chips support such
a HW setup? If so, the additional test would be very welcome.

Some time ago you said that you seemed to have met a similar issue on
older machines:
https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
If it's still possible, could you please perform at least a smoke
test on those devices?

On my device this series and the previous one
https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
fixed all the critical issues for the DW UART + DW DMAC buddies:
1. Data suddenly disappearing at the tail of the transfers (previous
patch set).
2. Random system freeze (this patch set).

There is another problem, caused by the coherent memory IO being too
slow on my device. Because of that the data is copied too slowly in the
__dma_rx_complete()->tty_insert_flip_string() call. As a result, fast
incoming traffic overflows the DW UART inbound FIFO. But that can be
worked around by decreasing the Rx DMA-buffer size. (Some more generic
fixes are possible, but they haven't proven to be as effective as the
buffer size reduction.)
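
The buffer-size trade-off above can be put into rough numbers with a
back-of-the-envelope model. This is a sketch only, and it rests on
assumptions that are not stated in this thread: that the inbound FIFO is
not drained while the Rx DMA-buffer is being copied, that the copy cost
is a flat per-byte figure, and that the baud rate, FIFO depth, and
per-byte cost used in the usage note are purely hypothetical values. All
function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Time on the wire for one UART frame, in nanoseconds (rounded down). */
static uint64_t frame_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
	assert(baud > 0);
	return (uint64_t)bits_per_frame * 1000000000ull / baud;
}

/* How long the FIFO can absorb back-to-back frames before it is full. */
static uint64_t fifo_headroom_ns(uint32_t fifo_depth, uint32_t baud,
				 uint32_t bits_per_frame)
{
	return (uint64_t)fifo_depth * frame_time_ns(baud, bits_per_frame);
}

/* Time to copy the Rx DMA-buffer at an assumed flat per-byte cost. */
static uint64_t copy_time_ns(uint32_t buf_bytes, uint64_t ns_per_byte)
{
	return (uint64_t)buf_bytes * ns_per_byte;
}

/* Non-zero if, under this model, the copy outlasts the FIFO headroom. */
static int rx_fifo_overflows(uint32_t buf_bytes, uint64_t ns_per_byte,
			     uint32_t fifo_depth, uint32_t baud,
			     uint32_t bits_per_frame)
{
	return copy_time_ns(buf_bytes, ns_per_byte) >
	       fifo_headroom_ns(fifo_depth, baud, bits_per_frame);
}
```

With hypothetical figures of 3000000 baud, 10 bits per frame, a 64-entry
FIFO, and 100 ns per copied byte, a 4096-byte buffer takes 409600 ns to
copy against 213312 ns of FIFO headroom, so the model predicts an
overflow, while a 1024-byte buffer (102400 ns) stays within the headroom.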

-Serge(y)
