mbox series

[v3,0/3] spi: Add DMA mode support to spi-qcom-qspi

Message ID 1681481153-24036-1-git-send-email-quic_vnivarth@quicinc.com
Headers show
Series spi: Add DMA mode support to spi-qcom-qspi | expand

Message

Vijaya Krishna Nivarthi April 14, 2023, 2:05 p.m. UTC
There are large number of QSPI irqs that fire during boot/init and later
on every suspend/resume.
This could be made faster by doing DMA instead of PIO.
Below is comparison for number of interrupts raised in 2 acenarios...
Boot up and stabilise
Suspend/Resume

Sequence   PIO    DMA
=======================
Boot-up    69088  19284
S/R        5066   3430

Though we have not made measurements for speed, power we expect
the performance to be better with DMA mode and no regressions were
encountered in testing.

Vijaya Krishna Nivarthi (3):
  spi: dt-bindings: qcom,spi-qcom-qspi: Add iommus
  arm64: dts: qcom: sc7280: Add stream-id of qspi to iommus
  spi: spi-qcom-qspi: Add DMA mode support
---
v2 -> v3:
- Modified commit messages
- Made a change to driver based on re-review

v1 -> v2:
- Added documentation file to the series
- Made changes to driver based on HPG re-review
---
 .../bindings/spi/qcom,spi-qcom-qspi.yaml           |   3 +
 arch/arm64/boot/dts/qcom/sc7280.dtsi               |   1 +
 drivers/spi/spi-qcom-qspi.c                        | 434 +++++++++++++++++++--
 3 files changed, 407 insertions(+), 31 deletions(-)

Comments

Krzysztof Kozlowski April 15, 2023, 8:53 a.m. UTC | #1
On 14/04/2023 16:05, Vijaya Krishna Nivarthi wrote:
> Add iommus binding for DMA mode support
> 
> Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
> ---

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

Best regards,
Krzysztof
Mark Brown April 17, 2023, 12:12 p.m. UTC | #2
On Fri, Apr 14, 2023 at 03:05:58PM -0700, Doug Anderson wrote:

> Having alignment requirements like this doesn't seem like it should be
> that unusual, though, and that's why it feels like the logic belongs
> in the SPI core. In fact, it seems like this is _supposed_ to be
> handled in the SPI core, but it isn't? In "spi.h" I see
> "dma_alignment" that claims to be exactly what you need. As far as I
> can tell, though, the core doesn't use this? ...so I'm kinda confused.
> As far as I can tell this doesn't do anything and thus anyone setting
> it today is broken?

SPI consumers should only be providing dmaable buffers.
Doug Anderson April 17, 2023, 2:07 p.m. UTC | #3
Hi,

On Mon, Apr 17, 2023 at 5:12 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Apr 14, 2023 at 03:05:58PM -0700, Doug Anderson wrote:
>
> > Having alignment requirements like this doesn't seem like it should be
> > that unusual, though, and that's why it feels like the logic belongs
> > in the SPI core. In fact, it seems like this is _supposed_ to be
> > handled in the SPI core, but it isn't? In "spi.h" I see
> > "dma_alignment" that claims to be exactly what you need. As far as I
> > can tell, though, the core doesn't use this? ...so I'm kinda confused.
> > As far as I can tell this doesn't do anything and thus anyone setting
> > it today is broken?
>
> SPI consumers should only be providing dmaable buffers.

Ah, I think I see.

1. In "struct spi_transfer" the @tx_buf and @rx_buf are documented to
have "dma-safe memory".

2. On ARM64 anyway, I see "ARCH_DMA_MINALIGN" is 128.

So there is no reason to do any special rules to force alignment to
32-bytes because that's already guaranteed. Presumably that means you
can drop a whole pile of code and things will still work fine.

-Doug
Vijaya Krishna Nivarthi April 17, 2023, 3:57 p.m. UTC | #4
Hi,


On 4/17/2023 7:37 PM, Doug Anderson wrote:
> Hi,
>
> On Mon, Apr 17, 2023 at 5:12 AM Mark Brown <broonie@kernel.org> wrote:
>> On Fri, Apr 14, 2023 at 03:05:58PM -0700, Doug Anderson wrote:
>>
>>> Having alignment requirements like this doesn't seem like it should be
>>> that unusual, though, and that's why it feels like the logic belongs
>>> in the SPI core. In fact, it seems like this is _supposed_ to be
>>> handled in the SPI core, but it isn't? In "spi.h" I see
>>> "dma_alignment" that claims to be exactly what you need. As far as I
>>> can tell, though, the core doesn't use this? ...so I'm kinda confused.
>>> As far as I can tell this doesn't do anything and thus anyone setting
>>> it today is broken?
>> SPI consumers should only be providing dmaable buffers.
> Ah, I think I see.
>
> 1. In "struct spi_transfer" the @tx_buf and @rx_buf are documented to
> have "dma-safe memory".
>
> 2. On ARM64 anyway, I see "ARCH_DMA_MINALIGN" is 128.
>
> So there is no reason to do any special rules to force alignment to
> 32-bytes because that's already guaranteed. Presumably that means you
> can drop a whole pile of code and things will still work fine.
>
> -Doug


Thank you very much Mark and Doug for review and inputs.


spi_map_buf is taking into consideration max_dma_len (in spi.h) when it 
is set.

For example if set to 1024 instead of 65536(the actual max_dma_len of 
HW), 4 entries are created in the sg_table for a buffer of size 4096.

However, Like Doug pointed, dma_alignment is not being used by core.

Some drivers seem to be setting both of these in probe() but 
dma_alignment is probably not taking effect.

Is it up to the SPI consumers to read this and ensure they are providing 
dmaable buffers of required alignment?


The dma_addresses coming from core are aligned for larger sized buffers 
but for small ones like 1 and 3 bytes they are not aligned.

Hence if the code handing the alignment part is not present, smaller 
transfers fail.

For example, Below debug patch does

a) all transfers in DMA

b) prints DMA addresses and lengths

====== patch start ======

diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
index 60c4f554..8d24022 100644
--- a/drivers/spi/spi-qcom-qspi.c
+++ b/drivers/spi/spi-qcom-qspi.c
@@ -438,8 +438,14 @@ static int qcom_qspi_setup_dma_desc(struct 
qcom_qspi *ctrl,
                 return -EINVAL;
         }

-       for (i = 0; i < sgt->nents; i++)
+       for (i = 0; i < sgt->nents; i++) {
+               dma_addr_t temp_addr;
                 sg_total_len += sg_dma_len(sgt->sgl + i);
+
+               temp_addr = sg_dma_address(sgt->sgl + i);
+               dev_err_ratelimited(ctrl->dev, "%s pilli-20230417 i-%d, 
nents-%d, len-%d, dma_address-%pad\n",
+                               __func__, i, sgt->nents, 
sg_dma_len(sgt->sgl + i), &temp_addr);
+       }
         if (sg_total_len != xfer->len) {
                 dev_err(ctrl->dev, "Data lengths mismatch\n");
                 return -EINVAL;
@@ -517,6 +523,7 @@ static void qcom_qspi_dma_xfer(struct qcom_qspi *ctrl)
  static bool qcom_qspi_can_dma(struct spi_controller *ctlr,
                          struct spi_device *slv, struct spi_transfer *xfer)
  {
+       return true;
         return xfer->len > QSPI_MAX_BYTES_FIFO ? true : false;
  }

====== patch end =======

and below is sample outcome...

[   23.620397] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-1, dma_address-0x00000000fff3e000
[   23.638392] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-3, dma_address-0x00000000fff3f001
[   23.650238] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-1, dma_address-0x00000000fff40004
[   23.662039] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-8, dma_address-0x00000000fff44080
[   23.673965] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-1, dma_address-0x00000000fff44000
[   23.685749] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-3, dma_address-0x00000000fff40001
[   23.697630] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-1, dma_address-0x00000000fff3f004
[   23.709552] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-8, dma_address-0x00000000fff3e600
[   23.721460] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-1, dma_address-0x00000000fff3e000
[   23.733257] qcom_qspi 88dc000.spi: qcom_qspi_setup_dma_desc 
pilli-20230417 i-0, nents-1, len-3, dma_address-0x00000000fff3f001

If we use the dma_address from sg table as is, transfers fail.

I have not checked the spi-nor driver, but is it the consumer driver's 
job to ensure required alignment in all cases?

Can you Please comment?

Thank you,

-Vijay/
Mark Brown April 17, 2023, 4:39 p.m. UTC | #5
On Mon, Apr 17, 2023 at 09:27:16PM +0530, Vijaya Krishna Nivarthi wrote:

> However, Like Doug pointed, dma_alignment is not being used by core.

The core will use kmalloc() for any new buffers it allocates, this is
guaranteed to satisfy DMA constraints.

> Is it up to the SPI consumers to read this and ensure they are providing
> dmaable buffers of required alignment?

If they're doing anything fun for allocation.  Or they can just use
kmalloc() themselves.

> The dma_addresses coming from core are aligned for larger sized buffers but
> for small ones like 1 and 3 bytes they are not aligned.

In theory even buffers that small should be DMA safe, in practice that
rarely matters since it's vanishingly rare that it's sensible to DMA
such tiny buffers rather than PIOing them so drivers will tend to never
actually try to do so and I'd expect bugs.  It is likely worth checking
that DMA makes sense for this hardware.

> I have not checked the spi-nor driver, but is it the consumer driver's job
> to ensure required alignment in all cases?

Yes.