spi: atmel: Prevent false timeouts on long transfers

Message ID 20230616141225.2790073-1-miquel.raynal@bootlin.com
State Superseded
Series spi: atmel: Prevent false timeouts on long transfers

Commit Message

Miquel Raynal June 16, 2023, 2:12 p.m. UTC
A slow SPI bus clocks at ~20MHz, which means it would transfer about
2500 bytes per millisecond (~2.5MB/s) with a single data line. Big
transfers, like when dealing with flashes, can easily reach a few MiB.
The current DMA timeout is set to 1 second, which means any working
transfer of about 4MiB will always be cancelled.

With the above derivations, on a slow bus, we can assume every byte will
take at most 0.4us. Said otherwise, we could add 4ms to the 1-second
timeout delay every 10kiB. On a 4MiB transfer, it would bring the
timeout delay up to 2.6s, which still seems rather acceptable for a
timeout.

The consequence of this is that long transfers are now allowed to run,
so the user needs a way to interrupt them. We can hence switch to the
_interruptible variant of wait_for_completion. This requires a little
more code to also handle the interrupted case, but it looks acceptable
overall.

While at it, we drop the useless, noisy and redundant WARN_ON() call.

Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
---
 drivers/spi/spi-atmel.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

Comments

Miquel Raynal June 16, 2023, 4:15 p.m. UTC | #1
Hi Mark,

broonie@kernel.org wrote on Fri, 16 Jun 2023 15:20:27 +0100:

> On Fri, Jun 16, 2023 at 04:12:25PM +0200, Miquel Raynal wrote:
> 
> > -#define SPI_DMA_TIMEOUT		(msecs_to_jiffies(1000))
> > +#define SPI_DMA_MIN_TIMEOUT	(msecs_to_jiffies(1000))
> > +#define SPI_DMA_TIMEOUT_PER_10K	(msecs_to_jiffies(4))  
> 
> Given that we know the bus speed can't we just calculate this like other
> drivers do (we should probably add a helper TBH)?

I agree we should probably have some kind of easy-to-use helper to
derive a decent timeout value. How do sound the heuristics
proposed here to you ? That would be:

	timeout = 1s + 4ms/10k

Thanks,
Miquèl
Mark Brown June 16, 2023, 4:43 p.m. UTC | #2
On Fri, Jun 16, 2023 at 06:15:35PM +0200, Miquel Raynal wrote:
> broonie@kernel.org wrote on Fri, 16 Jun 2023 15:20:27 +0100:

> > On Fri, Jun 16, 2023 at 04:12:25PM +0200, Miquel Raynal wrote:

> > > -#define SPI_DMA_TIMEOUT		(msecs_to_jiffies(1000))
> > > +#define SPI_DMA_MIN_TIMEOUT	(msecs_to_jiffies(1000))
> > > +#define SPI_DMA_TIMEOUT_PER_10K	(msecs_to_jiffies(4))  

> > Given that we know the bus speed can't we just calculate this like other
> > drivers do (we should probably add a helper TBH)?

> I agree we should probably have some kind of easy-to-use helper to
> derive a decent timeout value. How do sound the heuristics
> proposed here to you ? That would be:

> 	timeout = 1s + 4ms/10k

Like I say we should know the transfer speed so we can do better than
4ms/10k - we know how long it takes to clock out each byte, we can just
multiply that by the size of the transfer then add some fudge factor for
setup/teardown overhead.  1s feels pretty generous too.  The sun6i
driver for example does 

   max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U)

and just doubles the length based timeout with a minimum of 100ms which
seems reasonable.
Miquel Raynal June 16, 2023, 4:59 p.m. UTC | #3
Hi Mark,

broonie@kernel.org wrote on Fri, 16 Jun 2023 17:43:06 +0100:

> On Fri, Jun 16, 2023 at 06:15:35PM +0200, Miquel Raynal wrote:
> > broonie@kernel.org wrote on Fri, 16 Jun 2023 15:20:27 +0100:  
> 
> > > On Fri, Jun 16, 2023 at 04:12:25PM +0200, Miquel Raynal wrote:  
> 
> > > > -#define SPI_DMA_TIMEOUT		(msecs_to_jiffies(1000))
> > > > +#define SPI_DMA_MIN_TIMEOUT	(msecs_to_jiffies(1000))
> > > > +#define SPI_DMA_TIMEOUT_PER_10K	(msecs_to_jiffies(4))    
> 
> > > Given that we know the bus speed can't we just calculate this like other
> > > drivers do (we should probably add a helper TBH)?  
> 
> > I agree we should probably have some kind of easy-to-use helper to
> > derive a decent timeout value. How do sound the heuristics
> > proposed here to you ? That would be:  
> 
> > 	timeout = 1s + 4ms/10k  
> 
> Like I say we should know the transfer speed so we can do better than
> 4ms/10k - we know how long it takes to clock out each byte, we can just
> multiply that by the size of the transfer then add some fudge factor for
> setup/teardown overhead.  1s feels pretty generous too.  The sun6i
> driver for example does 
> 
>    max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U)
> 
> and just doubles the length based timeout with a minimum of 100ms which
> seems reasonable.

I already had issues with ~0.1s timeouts on NAND controllers, just
because the machine was heavily loaded. I believe we should avoid too
small timeouts, it does not make sense and make things worse under load.

I'll have a look.

Thanks,
Miquèl
Mark Brown June 16, 2023, 5:43 p.m. UTC | #4
On Fri, Jun 16, 2023 at 06:59:06PM +0200, Miquel Raynal wrote:
> broonie@kernel.org wrote on Fri, 16 Jun 2023 17:43:06 +0100:
> > On Fri, Jun 16, 2023 at 06:15:35PM +0200, Miquel Raynal wrote:
> > > broonie@kernel.org wrote on Fri, 16 Jun 2023 15:20:27 +0100:  

> > Like I say we should know the transfer speed so we can do better than
> > 4ms/10k - we know how long it takes to clock out each byte, we can just
> > multiply that by the size of the transfer then add some fudge factor for
> > setup/teardown overhead.  1s feels pretty generous too.  The sun6i
> > driver for example does 

> >    max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U)

> > and just doubles the length based timeout with a minimum of 100ms which
> > seems reasonable.

> I already had issues with ~0.1s timeouts on NAND controllers, just
> because the machine was heavily loaded. I believe we should avoid too
> small timeouts, it does not make sense and make things worse under load.

Well, we can raise that minimum if it's causing issues - 500ms say?  1s
does feel a bit extreme for short transfers (and note that we'll use
more than 100ms for long enough transfers).
Miquel Raynal June 16, 2023, 5:53 p.m. UTC | #5
Hi Mark,

broonie@kernel.org wrote on Fri, 16 Jun 2023 18:43:51 +0100:

> On Fri, Jun 16, 2023 at 06:59:06PM +0200, Miquel Raynal wrote:
> > broonie@kernel.org wrote on Fri, 16 Jun 2023 17:43:06 +0100:  
> > > On Fri, Jun 16, 2023 at 06:15:35PM +0200, Miquel Raynal wrote:  
> > > > broonie@kernel.org wrote on Fri, 16 Jun 2023 15:20:27 +0100:    
> 
> > > Like I say we should know the transfer speed so we can do better than
> > > 4ms/10k - we know how long it takes to clock out each byte, we can just
> > > multiply that by the size of the transfer then add some fudge factor for
> > > setup/teardown overhead.  1s feels pretty generous too.  The sun6i
> > > driver for example does   
> 
> > >    max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U)  
> 
> > > and just doubles the length based timeout with a minimum of 100ms which
> > > seems reasonable.  
> 
> > I already had issues with ~0.1s timeouts on NAND controllers, just
> > because the machine was heavily loaded. I believe we should avoid too
> > small timeouts, it does not make sense and make things worse under load.  
> 
> Well, we can raise that minimum if it's causing issues - 500ms say?  1s
> does feel a bit extreme for short transfers (and note that we'll use
> more than 100ms for long enough transfers).

Sounds reasonable. I believe it's worth the try.

Cheers,
Miquèl
Patch

diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c
index c4f22d50dba5..00f269f955ef 100644
--- a/drivers/spi/spi-atmel.c
+++ b/drivers/spi/spi-atmel.c
@@ -233,7 +233,8 @@ 
  */
 #define DMA_MIN_BYTES	16
 
-#define SPI_DMA_TIMEOUT		(msecs_to_jiffies(1000))
+#define SPI_DMA_MIN_TIMEOUT	(msecs_to_jiffies(1000))
+#define SPI_DMA_TIMEOUT_PER_10K	(msecs_to_jiffies(4))
 
 #define AUTOSUSPEND_TIMEOUT	2000
 
@@ -1280,6 +1281,7 @@  static int atmel_spi_one_transfer(struct spi_master *master,
 	int			timeout;
 	int			ret;
 	unsigned long		dma_timeout;
+	long			ret_timeout;
 
 	as = spi_master_get_devdata(master);
 
@@ -1308,6 +1310,11 @@  static int atmel_spi_one_transfer(struct spi_master *master,
 	as->current_remaining_bytes = xfer->len;
 	while (as->current_remaining_bytes) {
 		reinit_completion(&as->xfer_completion);
+		/* If transfer is bigger than 10kiB, enlarge the timeout */
+		dma_timeout = SPI_DMA_MIN_TIMEOUT;
+		if (as->current_remaining_bytes > 0x2800)
+			dma_timeout += (as->current_remaining_bytes / 0x2800) *
+				SPI_DMA_TIMEOUT_PER_10K;
 
 		if (as->use_pdc) {
 			atmel_spi_lock(as);
@@ -1333,11 +1340,12 @@  static int atmel_spi_one_transfer(struct spi_master *master,
 			atmel_spi_unlock(as);
 		}
 
-		dma_timeout = wait_for_completion_timeout(&as->xfer_completion,
-							  SPI_DMA_TIMEOUT);
-		if (WARN_ON(dma_timeout == 0)) {
-			dev_err(&spi->dev, "spi transfer timeout\n");
-			as->done_status = -EIO;
+		ret_timeout = wait_for_completion_interruptible_timeout(&as->xfer_completion,
+									dma_timeout);
+		if (ret_timeout <= 0) {
+			dev_err(&spi->dev, "spi transfer %s\n",
+				!ret_timeout ? "timeout" : "canceled");
+			as->done_status = ret_timeout < 0 ? ret_timeout : -EIO;
 		}
 
 		if (as->done_status)