mbox series

[RFC,0/2] Add support for stacked and parallel memories

Message ID 20241026075347.580858-1-amit.kumar-mahapatra@amd.com
Headers show
Series Add support for stacked and parallel memories | expand

Message

Mahapatra, Amit Kumar Oct. 26, 2024, 7:53 a.m. UTC
Hello Everyone,

Following an email discussion with Miquel regarding the binding changes
and overall architecture for implementing support for stacked and parallel
memories, I’m sharing this RFC to initiate a discussion on the proposed
updates to current bindings and to finalize the implementation
architecture.

Before diving into the main topic, here is some background on stacked and
parallel memories.

The AMD QSPI controller supports two advanced connection modes(Stacked and
Parallel) which allow the controller to treat two different flashes as one
storage.

Stacked:
Flashes share the same SPI bus, but different CS line, controller driver
asserts the CS of the flash to which it needs to communicate. Stacked mode
is a software abstraction rather than a controller feature or capability.
At any given time, the controller communicates with one of the two
connected flash devices, as determined by the requested address and data
length. If an operation starts on one flash and ends on the other, the
core needs to split it into two separate operations and adjust the data
length accordingly.

Parallel(Multi-CS):
Both the flashes have their separate SPI bus, CS of both the flashes will
be asserted/de-asserted at the same time. In this mode data will be split
across both the flashes by enabling the STRIPE setting in the controller.
Parallel mode is a controller feature where if the STRIPE bit is set then
the controller internally handles the data split during data write to the
flashes and while reading data from the flash the controller internally
merges data from both the flashes before writing to the controller FIFO.
If STRIPE is not enabled, then same data will be sent to both the devices.
In parallel mode both the flashes should be identical.

For more information on the modes please feel free to go through the
controller flash interface below [1].

Mirochip QSPI controller[2] also supports "Dual Parallel 8-bit IO mode",
but they call it "Twin Quad Mode".

Initially in [3] [4] [5] Miquel had tried to extend MTD-CONCAT driver to
support Stacked mode, but the bindings were not accepted. So, the
MTD-CONCAT approach was dropped and the DT bindings that got accepted
[6] [7] [8] that describes the two flash devices as being one. SPI core
changes to support the above bindings were added [9]. While adding the
support in SPI-NOR  Tudor provided additional feedback, leading to a
discussion on updating the current stacked and parallel DT bindings.

Proposed Solution:
The solution has two parts:

1. Update MTD-CONCAT
   Update MTD-CONCAT to create virtual concatinated mtd devices as defined
   in the device tree.

2. Add a New Layer
   Add a new layer between the SPI-NOR and MTD layers to support stacked
   and parallel configurations. This new layer will be part of spi-nor,
   located in mtd/spi-nor/, can be included/excluded via Kconfig, will be
   maintained by AMD and will:

   - During probing, store information from all connected flashes in
     stacked or parallel mode and present them as a single device to the
	 MTD layer.
   - Register callbacks and manage MTD device registration within the new
     layer instead of spi-nor/core.c.
   - Make minimal changes in spi-nor/core.c, as stacked and parallel
     handling will be managed by the new layer on top of SPI-NOR.
   - Handle odd byte count requests from the MTD layer during flash
     operations in parallel mode.

[1] https://docs.amd.com/r/en-US/am011-versal-acap-trm/QSPI-Flash-Device-Interface
[2] https://ww1.microchip.com/downloads/aemDocuments/documents/MPU32/ProductDocuments/DataSheets/SAMA7G5-Series-Data-Sheet-DS60001765.pdf
[3] https://lore.kernel.org/all/20191113171505.26128-4-miquel.raynal@bootlin.com/
[4] https://lore.kernel.org/all/20191127105522.31445-5-miquel.raynal@bootlin.com/
[5]https://lore.kernel.org/all/20211112152411.818321-1-miquel.raynal@bootlin.com/
[6] https://github.com/torvalds/linux/commit/f89504300e94524d5d5846ff8b728592ac72cec4
[7] https://github.com/torvalds/linux/commit/eba5368503b4291db7819512600fa014ea17c5a8
[8] https://github.com/torvalds/linux/commit/e2edd1b64f1c79e8abda365149ed62a2a9a494b4
[9]https://github.com/torvalds/linux/commit/4d8ff6b0991d5e86b17b235fc46ec62e9195cb9b

Thanks,
Amit

Amit Kumar Mahapatra (2):
  dt-bindings: mtd: Add bindings for describing concatinated MTD devices
  dt-bindings: spi: Update stacked and parallel bindings

 .../mtd/partitions/fixed-partitions.yaml      | 18 +++++++++++++++
 .../bindings/mtd/partitions/partitions.yaml   |  6 +++++
 .../bindings/spi/spi-controller.yaml          | 23 +++++++++++++++++--
 .../bindings/spi/spi-peripheral-props.yaml    |  9 +++-----
 4 files changed, 48 insertions(+), 8 deletions(-)

Comments

Mahapatra, Amit Kumar Oct. 30, 2024, 12:09 p.m. UTC | #1
Hello,

> > Stacked:
> > Flashes share the same SPI bus, but different CS line, controller
> > driver asserts the CS of the flash to which it needs to communicate.
> > Stacked mode is a software abstraction rather than a controller feature or
> capability.
> > At any given time, the controller communicates with one of the two
> > connected flash devices, as determined by the requested address and
> > data length. If an operation starts on one flash and ends on the
> > other, the core needs to split it into two separate operations and
> > adjust the data length accordingly.
> >
> > Parallel(Multi-CS):
> > Both the flashes have their separate SPI bus, CS of both the flashes
> > will be asserted/de-asserted at the same time. In this mode data will
> > be split across both the flashes by enabling the STRIPE setting in the controller.
> > Parallel mode is a controller feature where if the STRIPE bit is set
> > then the controller internally handles the data split during data
> > write to the flashes and while reading data from the flash the
> > controller internally merges data from both the flashes before writing to the
> controller FIFO.
> > If STRIPE is not enabled, then same data will be sent to both the devices.
> > In parallel mode both the flashes should be identical.
> 
> Interesting.
> 
> What's the practical use case ? Some kind of RAID directly on raw flashes ? Could it

In a parallel configuration, the user can double the flash capacity 
and data throughput.

Users disable STRIPE when they need to access the flash device 
registers for reading or writing.
 
> help for protecting from a broken boot flash ?

No, because each read/write operation communicates with both flash 
devices simultaneously, meaning the image will be distributed across 
both flashes and read from both during boot.

Regards,
Amit

> 
> 
> --mtx
> 
> --
> ---
> Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert werden !
> Für eine vertrauliche Kommunikation senden Sie bitte ihren GPG/PGP-Schlüssel zu.
> ---
> Enrico Weigelt, metux IT consult
> Free software and Linux embedded engineering info@metux.net -- +49-151-
> 27565287
Pratyush Yadav Nov. 25, 2024, 3:40 p.m. UTC | #2
On Sat, Oct 26 2024, Amit Kumar Mahapatra wrote:

Hi Amit,

I've been meaning to look into this proposal for some time now, but one
thing or another kept coming up and I never got around to it. Well, I'll
try to get some of my thoughts out with this reply. I still haven't
looked very deeply into the past discussions, so apologies if I bring up
something that was already discussed.

> Hello Everyone,
>
> Following an email discussion with Miquel regarding the binding changes
> and overall architecture for implementing support for stacked and parallel
> memories, I’m sharing this RFC to initiate a discussion on the proposed
> updates to current bindings and to finalize the implementation
> architecture.
>
> Before diving into the main topic, here is some background on stacked and
> parallel memories.
>
> The AMD QSPI controller supports two advanced connection modes(Stacked and
> Parallel) which allow the controller to treat two different flashes as one
> storage.
>
> Stacked:
> Flashes share the same SPI bus, but different CS line, controller driver
> asserts the CS of the flash to which it needs to communicate. Stacked mode
> is a software abstraction rather than a controller feature or capability.
> At any given time, the controller communicates with one of the two
> connected flash devices, as determined by the requested address and data
> length. If an operation starts on one flash and ends on the other, the
> core needs to split it into two separate operations and adjust the data
> length accordingly.
>
> Parallel(Multi-CS):
> Both the flashes have their separate SPI bus, CS of both the flashes will
> be asserted/de-asserted at the same time. In this mode data will be split
> across both the flashes by enabling the STRIPE setting in the controller.
> Parallel mode is a controller feature where if the STRIPE bit is set then
> the controller internally handles the data split during data write to the
> flashes and while reading data from the flash the controller internally
> merges data from both the flashes before writing to the controller FIFO.
> If STRIPE is not enabled, then same data will be sent to both the devices.
> In parallel mode both the flashes should be identical.
>
> For more information on the modes please feel free to go through the
> controller flash interface below [1].
>
> Mirochip QSPI controller[2] also supports "Dual Parallel 8-bit IO mode",
> but they call it "Twin Quad Mode".
>
> Initially in [3] [4] [5] Miquel had tried to extend MTD-CONCAT driver to
> support Stacked mode, but the bindings were not accepted. So, the
> MTD-CONCAT approach was dropped and the DT bindings that got accepted
> [6] [7] [8] that describes the two flash devices as being one. SPI core
> changes to support the above bindings were added [9]. While adding the
> support in SPI-NOR  Tudor provided additional feedback, leading to a
> discussion on updating the current stacked and parallel DT bindings.
>
> Proposed Solution:
> The solution has two parts:
>
> 1. Update MTD-CONCAT
>    Update MTD-CONCAT to create virtual concatinated mtd devices as defined
>    in the device tree.

>From a very quick look, it seems mtdconcat should already do most of
what you want with "stacked mode". The tricky bits might be devicetree
design, but from the software perspective, I think mtdconcat makes
perfect sense. You leave all the complexity to the SPI NOR layer since
it already handles them, and just use the higher-level MTD layer to
virtually concatenate devices. Adding a new layer between MTD and SPI
NOR makes little sense because mtdconcat is already that layer. Another
benefit of this would be you can just as easily concatenate any kinds of
flashes you want; they don't have to be the same.

I think this is a much simpler problem to solve compared to parallel
mode. Once you figure out the devicetree stuff, and I think the
mtdconcat changes should be simple and not too controversial. So I think
you should split the parallel and stacked support into two independent
series. This way you make progress without having to wait for
discussions around parallel mode support to settle.

>
> 2. Add a New Layer
>    Add a new layer between the SPI-NOR and MTD layers to support stacked
>    and parallel configurations. This new layer will be part of spi-nor,
>    located in mtd/spi-nor/, can be included/excluded via Kconfig, will be
>    maintained by AMD and will:
>
>    - During probing, store information from all connected flashes in
>      stacked or parallel mode and present them as a single device to the
> 	 MTD layer.

As I mentioned above, I do not think you should do stacked flashes this
way.

>    - Register callbacks and manage MTD device registration within the new
>      layer instead of spi-nor/core.c.
>    - Make minimal changes in spi-nor/core.c, as stacked and parallel
>      handling will be managed by the new layer on top of SPI-NOR.
>    - Handle odd byte count requests from the MTD layer during flash
>      operations in parallel mode.

You'd also probably have to add support in SPI MEM to signal the
controller to use parallel mode. You don't want to use parallel mode all
the time since you'd need to do "normal" operations as well such as
reading/writing status registers, reading flash ID, SFDP, etc. That is
yet another "cost" parallel mode support has -- it adds another thing to
SPI MEM (I'm not saying the cost isn't necessarily worth it -- just
pointing it out).

>From the SPI NOR side, this layer would have to make sure both flashes
get configured with the exact same settings. That would need SPI NOR to
export how it configured a flash, ideally in a stable, well-defined
format. It would also have to ask SPI NOR to construct the SPI MEM ops
for it, and then edit them to set the "parallel mode" bit. Perhaps the
dirmap op templates can come in handy here.

It's hard to say much more without seeing the code. It would be
interesting to see how you can manage to get this layer work without too
much intrusion in the core.

[...]