mbox series

[v14,00/15] phy: Add support for Lynx 10G SerDes

Message ID 20230413160607.4128315-1-sean.anderson@seco.com
Headers show
Series phy: Add support for Lynx 10G SerDes | expand

Message

Sean Anderson April 13, 2023, 4:05 p.m. UTC
This adds support for the Lynx 10G SerDes found on the QorIQ T-series
and Layerscape series. Due to limited time and hardware, only support
for the LS1046ARDB and LS1088ARDB is added in this initial series.

This series is ready for review by the phy maintainers. I have addressed
all known feedback and there are no outstanding issues.

There are several stand-alone commits in this series. Please feel free
to pick them as appropriate. In particular, commits 1, 3, 4, 12, 13, and
14 are all good candidates for picking.

Major reconfiguration of baud rate (e.g. 1G->10G) does not work. From my
testing, SerDes register settings appear identical. The issue appears to
be between the PCS and the MAC. The link itself comes up at both ends,
and a mac loopback succeeds. However, a PCS loopback results in dropped
packets. Perhaps there is some undocumented register in the PCS?

I suspect this driver is around 95% complete, but I don't have the
documentation to make it work completely. At the very least it is useful
for two cases:

- Although this is untested, it should support 2.5G SGMII as well as
  1000BASE-KX. The latter needs MAC and PCS support, but the former
  should work out of the box.
- It allows for clock configurations not supported by the RCW. This is
  very useful if you want to use e.g. SRDS_PRTCL_S1=0x3333 and =0x1133
  on the same board. This is because the former setting will use PLL1
  as the 1G reference, but the latter will use PLL1 as the 10G
  reference. Because we can reconfigure the PLLs, it is possible to
  always use PLL1 as the 1G reference.

Changes in v14:
- Fix incorrect $id
- Add note about (lack of) use of FIELD_GET/PREP

Changes in v13:
- Fix references to brcm,bcm63xx-gpio.yaml (neeĢ brcm,bcm6345-gpio)
- Split interrupt changes off from serdes support
- Split off SFP addition from serdes support

Changes in v12:
- Put compatible first
- Keep gpio-controller to one line
- Add little-endian property
- Alphabetize compatibles
- Remove some comments
- Remove some examples with insufficient novelty

Changes in v11:
- Keep empty (or almost-empty) properties on a single line
- Don't use | unnecessarily
- Use gpio as the node name for examples
- Rename brcm,bcm6345-gpio.yaml to brcm,bcm63xx-gpio.yaml

Changes in v10:
- Convert gpio-mmio to yaml
- Add compatible for QIXIS
- Remove unnecessary inclusion of clk.h
- Don't gate clocks in compatibility mode
- Fix debugging print with incorrect error variable
- Move serdes bindings to SoC dtsi
- Add support for all (ethernet) serdes modes
- Refer to "nodes" instead of "bindings"
- Move compatible/reg first

Changes in v9:
- Add fsl,unused-lanes-reserved to allow for a gradual transition
  between firmware and Linux control of the SerDes
- Change phy-type back to fsl,type, as I was getting the error
    '#phy-cells' is a dependency of 'phy-type'
- Convert some u32s to unsigned long to match arguments
- Switch from round_rate to determine_rate
- Drop explicit reference to reference clock
- Use .parent_names when requesting parents
- Use devm_clk_hw_get_clk to pass clocks back to serdes
- Fix indentation
- Split off clock "driver" into its own patch to allow for better
  review.
- Add ability to defer lane initialization to phy_init. This allows
  for easier transitioning between firmware-managed serdes and Linux-
  managed serdes, as the consumer (such as dpaa2, which knows what the
  firmware is doing) has the last say on who gets control.
- Fix name of phy mode node
- Add fsl,unused-lanes-reserved to allow a gradual transition, depending
  on the mac link type.
- Remove unused clocks
- Fix some phy mode node names

Changes in v8:
- Remove unused variable from lynx_ls_mode_init
- Rename serdes phy handles to use _A, _B, etc. instead of _0, _1, etc.
  This should help remind readers that the numbering corresponds to the
  physical layout of the registers, and not the lane (pin) number.
- Prevent PCSs from probing as phys
- Rename serdes phy handles like the LS1046A
- Add SFP slot binding
- Fix incorrect lane ordering (it's backwards on the LS1088A just like it is in
  the LS1046A).
- Fix duplicated lane 2 (it should have been lane 3).
- Fix incorrectly-documented value for XFI1.
- Remove interrupt for aquantia phy. It never fired for whatever reason,
  preventing the link from coming up.
- Add GPIOs for QIXIS FPGA.
- Enable MAC1 PCS
- Remove si5341 binding

Changes in v7:
- Use double quotes everywhere in yaml
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
  anything.

Changes in v6:
- Bump PHY_TYPE_2500BASEX to 13, since PHY_TYPE_USXGMII was added in the
  meantime
- fsl,type -> phy-type
- frequence -> frequency
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
  arches.
- Depend on COMMON_CLK and either layerscape/ppc
- XGI.9 -> XFI.9

Changes in v5:
- Update commit description
- Dual id header
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
  series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Add 2500BASE-X and 10GBASE-R phy types
- Use subnodes to describe lane configuration, instead of describing
  PCCRs. This is the same style used by phy-cadence-sierra et al.
- Add ids for Lynx 10g PLLs
- Rework all debug statements to remove use of __func__. Additional
  information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
  Trying to modify out parent's rate in set_rate will deadlock.
- Explicitly perform a stop/reset sequence in set_rate. This way we
  always ensure that the PLL is properly stopped.
- Set the power-down bit when disabling the PLL. We can do this now that
  enable/disable aren't abused during the set rate sequence.
- Fix typos in QSGMII_OFFSET and XFI_OFFSET
- Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
  reflect its function (adding post-cursor equalization).
- Use of_clk_hw_onecell_get instead of a custom function.
- Return struct clks from lynx_clks_init instead of embedding lynx_clk
  in lynx_priv.
- Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
  primarily in the layout and offset of the PCCRs. This will help bring a
  cleaner abstraction layer. The caps have been removed, since this handles the
  only current usage.
- Convert to use new binding format. As a result of this, we no longer need to
  have protocols for PCIe or SATA. Additionally, modes now live in lynx_group
  instead of lynx_priv.
- Remove teq from lynx_proto_params, since it can be determined from
  preq_ratio/postq_ratio.
- Fix an early return from lynx_set_mode not releasing serdes->lock.
- Rename lynx_priv.conf to .cfg, since I kept mistyping it.

Changes in v3:
- Manually expand yaml references
- Add mode configuration to device tree
- Rename remaining references to QorIQ SerDes to Lynx 10G
- Fix PLL enable sequence by waiting for our reset request to be cleared
  before continuing. Do the same for the lock, even though it isn't as
  critical. Because we will delay for 1.5ms on average, use prepare
  instead of enable so we can sleep.
- Document the status of each protocol
- Fix offset of several bitfields in RECR0
- Take into account PLLRST_B, SDRST_B, and SDEN when considering whether
  a PLL is "enabled."
- Only power off unused lanes.
- Split mode lane mask into first/last lane (like group)
- Read modes from device tree
- Use caps to determine whether KX/KR are supported
- Move modes to lynx_priv
- Ensure that the protocol controller is not already in-use when we try
  to configure a new mode. This should only occur if the device tree is
  misconfigured (e.g. when QSGMII is selected on two lanes but there is
  only one QSGMII controller).
- Split PLL drivers off into their own file
- Add clock for "ext_dly" instead of writing the bit directly (and
  racing with any clock code).
- Use kasprintf instead of open-coding the snprintf dance
- Support 1000BASE-KX in lynx_lookup_proto. This still requires PCS
  support, so nothing is truly "enabled" yet.
- Describe modes in device tree
- ls1088a: Add serdes bindings

Changes in v2:
- Rename to fsl,lynx-10g.yaml
- Refer to the device in the documentation, rather than the binding
- Move compatible first
- Document phy cells in the description
- Allow a value of 1 for phy-cells. This allows for compatibility with
  the similar (but according to Ioana Ciornei different enough) lynx-28g
  binding.
- Remove minItems
- Use list for clock-names
- Fix example binding having too many cells in regs
- Add #clock-cells. This will allow using assigned-clocks* to configure
  the PLLs.
- Document the structure of the compatible strings
- Rename driver to Lynx 10G (etc.)
- Fix not clearing group->pll after disabling it
- Support 1 and 2 phy-cells
- Power off lanes during probe
- Clear SGMIIaCR1_PCS_EN during probe
- Rename LYNX_PROTO_UNKNOWN to LYNX_PROTO_NONE
- Handle 1000BASE-KX in lynx_proto_mode_prep
- Use one phy cell for SerDes1, since no lanes can be grouped
- Disable SerDes by default to prevent breaking boards inadvertently.

Sean Anderson (15):
  dt-bindings: phy: Add 2500BASE-X and 10GBASE-R
  dt-bindings: phy: Add Lynx 10G phy binding
  dt-bindings: Convert gpio-mmio to yaml
  dt-bindings: gpio-mmio: Add compatible for QIXIS
  dt-bindings: clock: Add ids for Lynx 10g PLLs
  clk: Add Lynx 10G SerDes PLL driver
  phy: fsl: Add Lynx 10G SerDes driver
  phy: lynx10g: Enable by default on Layerscape
  arm64: dts: ls1046a: Add serdes nodes
  arm64: dts: ls1046ardb: Add serdes descriptions
  arm64: dts: ls1088a: Add serdes nodes
  arm64: dts: ls1088a: Prevent PCSs from probing as phys
  arm64: dts: ls1088ardb: Remove aquantia interrupt
  arm64: dts: ls1088ardb: Add SFP cage
  arm64: dts: ls1088ardb: Add serdes descriptions

 ...m6345-gpio.yaml => brcm,bcm63xx-gpio.yaml} |   18 +-
 .../devicetree/bindings/gpio/gpio-mmio.yaml   |  123 ++
 .../bindings/gpio/ni,169445-nand-gpio.txt     |   38 -
 .../devicetree/bindings/gpio/wd,mbl-gpio.txt  |   38 -
 .../mfd/brcm,bcm6318-gpio-sysctl.yaml         |    4 +-
 .../mfd/brcm,bcm63268-gpio-sysctl.yaml        |    4 +-
 .../mfd/brcm,bcm6328-gpio-sysctl.yaml         |    4 +-
 .../mfd/brcm,bcm6358-gpio-sysctl.yaml         |    4 +-
 .../mfd/brcm,bcm6362-gpio-sysctl.yaml         |    4 +-
 .../mfd/brcm,bcm6368-gpio-sysctl.yaml         |    4 +-
 .../devicetree/bindings/phy/fsl,lynx-10g.yaml |  248 ++++
 Documentation/driver-api/phy/index.rst        |    1 +
 Documentation/driver-api/phy/lynx_10g.rst     |   58 +
 MAINTAINERS                                   |    9 +
 .../boot/dts/freescale/fsl-ls1046a-rdb.dts    |   26 +
 .../arm64/boot/dts/freescale/fsl-ls1046a.dtsi |  111 ++
 .../boot/dts/freescale/fsl-ls1088a-rdb.dts    |   82 +-
 .../arm64/boot/dts/freescale/fsl-ls1088a.dtsi |  156 ++-
 drivers/clk/Makefile                          |    1 +
 drivers/clk/clk-fsl-lynx-10g.c                |  510 +++++++
 drivers/phy/freescale/Kconfig                 |   23 +
 drivers/phy/freescale/Makefile                |    1 +
 drivers/phy/freescale/phy-fsl-lynx-10g.c      | 1224 +++++++++++++++++
 include/dt-bindings/clock/fsl,lynx-10g.h      |   14 +
 include/dt-bindings/phy/phy.h                 |    2 +
 include/linux/phy/lynx-10g.h                  |   16 +
 26 files changed, 2607 insertions(+), 116 deletions(-)
 rename Documentation/devicetree/bindings/gpio/{brcm,bcm6345-gpio.yaml => brcm,bcm63xx-gpio.yaml} (75%)
 create mode 100644 Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
 delete mode 100644 Documentation/devicetree/bindings/gpio/ni,169445-nand-gpio.txt
 delete mode 100644 Documentation/devicetree/bindings/gpio/wd,mbl-gpio.txt
 create mode 100644 Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
 create mode 100644 Documentation/driver-api/phy/lynx_10g.rst
 create mode 100644 drivers/clk/clk-fsl-lynx-10g.c
 create mode 100644 drivers/phy/freescale/phy-fsl-lynx-10g.c
 create mode 100644 include/dt-bindings/clock/fsl,lynx-10g.h
 create mode 100644 include/linux/phy/lynx-10g.h

Comments

Vladimir Oltean April 25, 2023, 7:50 p.m. UTC | #1
Hello,

On Thu, 13 Apr 2023 12:05:52 -0400, Sean Anderson wrote:
> This series is ready for review by the phy maintainers. I have addressed
> all known feedback and there are no outstanding issues.
> 
> Major reconfiguration of baud rate (e.g. 1G->10G) does not work.
> 
> There are several stand-alone commits in this series. Please feel free
> to pick them as appropriate. In particular, commits 1, 3, 4, 12, 13, and
> 14 are all good candidates for picking.
> 
> - Although this is untested, it should support 2.5G SGMII as well as
>   1000BASE-KX. The latter needs MAC and PCS support, but the former
>   should work out of the box.
> - It allows for clock configurations not supported by the RCW. This is
>   very useful if you want to use e.g. SRDS_PRTCL_S1=0x3333 and =0x1133
>   on the same board. This is because the former setting will use PLL1
>   as the 1G reference, but the latter will use PLL1 as the 10G
>   reference. Because we can reconfigure the PLLs, it is possible to
>   always use PLL1 as the 1G reference.

I am an engineer working for NXP and I have access to the majority of
hardware that includes the Lynx 10G SERDES, as well as to block guides
that are not visible to customers, and to people from the hardware
design and validation teams.

I have an interest in adding a driver for this SERDES to support dynamic
Ethernet protocol reconfiguration, and perhaps the internal PHY for
copper backplane modes (1000Base-KX, 10GBase-KR) and its link training
procedure - although the latter will need more work.

I would like to thank you for starting the effort, but please, stop
submitting code for modes that are untested, and do not submit features
that do not work. If you do not have the time to fix up the loose ends
for this patch submission now, you won't have the time to debug
regressions on boards you might not even have access to, which worked
fine previously due to the static RCW/PBL configuration.

I have downloaded your patches, and although I have objections to some
of them already, I will be in the process of evaluating, testing,
changing them, for the coming weeks, perhaps even more. Please consider
this a NACK for the current patch set due to the SERDES related material,
although the unrelated patches (like "dt-bindings: Convert gpio-mmio to
yaml") can and should have been submitted separately, so they can be
analyzed by their respective maintainers based on their own merit and
not as part of an overall problematic set.
Sean Anderson April 25, 2023, 8:22 p.m. UTC | #2
On 4/25/23 15:50, Vladimir Oltean wrote:
> Hello,
> 
> On Thu, 13 Apr 2023 12:05:52 -0400, Sean Anderson wrote:
>> This series is ready for review by the phy maintainers. I have addressed
>> all known feedback and there are no outstanding issues.
>> 
>> Major reconfiguration of baud rate (e.g. 1G->10G) does not work.
>> 
>> There are several stand-alone commits in this series. Please feel free
>> to pick them as appropriate. In particular, commits 1, 3, 4, 12, 13, and
>> 14 are all good candidates for picking.
>> 
>> - Although this is untested, it should support 2.5G SGMII as well as
>>   1000BASE-KX. The latter needs MAC and PCS support, but the former
>>   should work out of the box.
>> - It allows for clock configurations not supported by the RCW. This is
>>   very useful if you want to use e.g. SRDS_PRTCL_S1=0x3333 and =0x1133
>>   on the same board. This is because the former setting will use PLL1
>>   as the 1G reference, but the latter will use PLL1 as the 10G
>>   reference. Because we can reconfigure the PLLs, it is possible to
>>   always use PLL1 as the 1G reference.
> 
> I am an engineer working for NXP and I have access to the majority of
> hardware that includes the Lynx 10G SERDES, as well as to block guides
> that are not visible to customers, and to people from the hardware
> design and validation teams.
> 
> I have an interest in adding a driver for this SERDES to support dynamic
> Ethernet protocol reconfiguration, and perhaps the internal PHY for
> copper backplane modes (1000Base-KX, 10GBase-KR) and its link training
> procedure - although the latter will need more work.
> 
> I would like to thank you for starting the effort, but please, stop
> submitting code for modes that are untested, and do not submit features
> that do not work. 

The features which do not work (major protocol changes) are disabled :)

If it would cause this series to be immediately merged, I would remove
KX/KR and 2.5G which are the only untested link modes.

That said, PCS support is necessary for these modes, so it is not even
possible to select them.

> If you do not have the time to fix up the loose ends
> for this patch submission now

I have time to fix up any loose ends preventing this series from being
applied. However, I am not very sympathetic to larger requests, since
there has been extensive time to supply feedback already.

> , you won't have the time to debug
> regressions on boards you might not even have access to, which worked
> fine previously due to the static RCW/PBL configuration.

I have gotten no substantive feedback on this driver. I have been
working on this series since June last year, and no one has commented on
the core algotithms thus far. My capacity for making large changes has
decreased because the project funding the development has ended. I
appreciate that you are taking interest now, but frankly I think it is
rather past time...

> I have downloaded your patches, and although I have objections to some
> of them already, I will be in the process of evaluating, testing,
> changing them, for the coming weeks, perhaps even more. Please consider
> this a NACK for the current patch set due to the SERDES related material,
> although the unrelated patches (like "dt-bindings: Convert gpio-mmio to
> yaml") can and should have been submitted separately, so they can be
> analyzed by their respective maintainers based on their own merit and
> not as part of an overall problematic set.

This patchset has been ready to merge for several revisions now. I do
not consider it problematic. However, I do consider the (nonexistant)
review process for this subsystem extremely problematic.

--Sean
Vladimir Oltean April 26, 2023, 10:51 a.m. UTC | #3
On Tue, Apr 25, 2023 at 04:22:32PM -0400, Sean Anderson wrote:
> The features which do not work (major protocol changes) are disabled :)
> 
> If it would cause this series to be immediately merged, I would remove
> KX/KR and 2.5G which are the only untested link modes.
> 
> That said, PCS support is necessary for these modes, so it is not even
> possible to select them.
> 
> > If you do not have the time to fix up the loose ends
> > for this patch submission now
> 
> I have time to fix up any loose ends preventing this series from being
> applied. However, I am not very sympathetic to larger requests, since
> there has been extensive time to supply feedback already.
> 
> > , you won't have the time to debug
> > regressions on boards you might not even have access to, which worked
> > fine previously due to the static RCW/PBL configuration.
> 
> I have gotten no substantive feedback on this driver. I have been
> working on this series since June last year, and no one has commented on
> the core algotithms thus far. My capacity for making large changes has
> decreased because the project funding the development has ended. I
> appreciate that you are taking interest now, but frankly I think it is
> rather past time...
> 
> > I have downloaded your patches, and although I have objections to some
> > of them already, I will be in the process of evaluating, testing,
> > changing them, for the coming weeks, perhaps even more. Please consider
> > this a NACK for the current patch set due to the SERDES related material,
> > although the unrelated patches (like "dt-bindings: Convert gpio-mmio to
> > yaml") can and should have been submitted separately, so they can be
> > analyzed by their respective maintainers based on their own merit and
> > not as part of an overall problematic set.
> 
> This patchset has been ready to merge for several revisions now. I do
> not consider it problematic. However, I do consider the (nonexistant)
> review process for this subsystem extremely problematic.

To be very clear, the "larger request" which you are unsympathetic to is
to wait. I didn't ask you to change anything.

I need to catch up with 14 rounds of patches from you and with the
discussions that took place on each version, and understand how you
responded to feedback like "don't remove PHY interrupts without finding
out why they don't work" and "doesn't changing PLL frequencies on the
fly affect other lanes that use those PLLs, like PCIe?". The cognitive
dissonance between this and you saying that the review process for this
subsystem is absent is mind boggling. You are sufficiently averse to
feedback that's not the feedback you want to hear, that it's hard to
find a common ground.

It's naive to expect that the silicon vendor will respond positively to
a change of such magnitude as this one, which was written using the
"works for me" work ethics, but which the silicon vendor will have to
work with, afterwards. The only reason I sent my previous email was to
announce you in advance that I have managed to carve out a sufficient
amount of time to explore the topic in detail, and to stop you from
pushing this forward in this state. I thought you would understand the
concept of engineers being unable to easily reserve large chunks of time
for a given project, after all, you brought this argument with your own
company...

Even if the SERDES and PLL drivers "work for you" in the current form,
I doubt the usefulness of a PLL driver if you have to disconnect the
SoC's reset request signal on the board to not be stuck in a reboot loop.
https://lore.kernel.org/linux-arm-kernel/d3163201-2012-6cf9-c798-916bab9c7f72@seco.com/
Even so, I have not said anything definitive, I have just requested time
to take your proposal at face value, and understand whether there is any
other alternative.

I would advise you to consider whether your follow-up emails on this
topic encourage a collaborative atmosphere.

I will come back when I have tested the driver in a reasonable amount of
setups.
Sean Anderson April 26, 2023, 2:50 p.m. UTC | #4
On 4/26/23 06:51, Vladimir Oltean wrote:
> On Tue, Apr 25, 2023 at 04:22:32PM -0400, Sean Anderson wrote:
>> The features which do not work (major protocol changes) are disabled :)
>>
>> If it would cause this series to be immediately merged, I would remove
>> KX/KR and 2.5G which are the only untested link modes.
>>
>> That said, PCS support is necessary for these modes, so it is not even
>> possible to select them.
>>
>>> If you do not have the time to fix up the loose ends
>>> for this patch submission now
>>
>> I have time to fix up any loose ends preventing this series from being
>> applied. However, I am not very sympathetic to larger requests, since
>> there has been extensive time to supply feedback already.
>>
>>> , you won't have the time to debug
>>> regressions on boards you might not even have access to, which worked
>>> fine previously due to the static RCW/PBL configuration.
>>
>> I have gotten no substantive feedback on this driver. I have been
>> working on this series since June last year, and no one has commented on
>> the core algotithms thus far. My capacity for making large changes has
>> decreased because the project funding the development has ended. I
>> appreciate that you are taking interest now, but frankly I think it is
>> rather past time...
>>
>>> I have downloaded your patches, and although I have objections to some
>>> of them already, I will be in the process of evaluating, testing,
>>> changing them, for the coming weeks, perhaps even more. Please consider
>>> this a NACK for the current patch set due to the SERDES related material,
>>> although the unrelated patches (like "dt-bindings: Convert gpio-mmio to
>>> yaml") can and should have been submitted separately, so they can be
>>> analyzed by their respective maintainers based on their own merit and
>>> not as part of an overall problematic set.
>>
>> This patchset has been ready to merge for several revisions now. I do
>> not consider it problematic. However, I do consider the (nonexistant)
>> review process for this subsystem extremely problematic.
> 
> To be very clear, the "larger request" which you are unsympathetic to is
> to wait. I didn't ask you to change anything.

The maintainers in this subsystem refuse to review any patches which have
any kind of feedback. I have been trying to get them to look at this series
for literal months. So saying things like "I don't like this series, have
a NACK" seriously delays the process of getting feedback...

> I need to catch up with 14 rounds of patches from you and with the
> discussions that took place on each version, and understand how you
> responded to feedback like "don't remove PHY interrupts without finding
> out why they don't work" 

All I can say is that

- It doesn't work on my board
- The traces are on the bottom of the PCB
- The signal goes through an FPGA which (unlike the LS1046ARDB) is closed-source
- The alternative is polling once a second (not terribly intensive)

I think it's very reasonable to make this change. Anyway, it's in a separate
patch so that it can be applied independently.

and "doesn't changing PLL frequencies on the
> fly affect other lanes that use those PLLs, like PCIe?".

This driver is not enable on any boards with PCIe on the same serdes. Therefore,
this is irrelevant for the initial submission.

> The cognitive
> dissonance between this and you saying that the review process for this
> subsystem is absent is mind boggling. You are sufficiently averse to
> feedback that's not the feedback you want to hear, that it's hard to
> find a common ground.

I'm opposed to nebulous 11th hour objections.

> It's naive to expect that the silicon vendor will respond positively to
> a change of such magnitude as this one, which was written using the
> "works for me" work ethics, but which the silicon vendor will have to
> work with, afterwards. The only reason I sent my previous email was to
> announce you in advance that I have managed to carve out a sufficient
> amount of time to explore the topic in detail, and to stop you from
> pushing this forward in this state. I thought you would understand the
> concept of engineers being unable to easily reserve large chunks of time
> for a given project, after all, you brought this argument with your own
> company...
> 
> Even if the SERDES and PLL drivers "work for you" in the current form,
> I doubt the usefulness of a PLL driver if you have to disconnect the
> SoC's reset request signal on the board to not be stuck in a reboot loop.

I would like to emphasize that this has *nothing to do with this driver*.
This behavior is part of the boot ROM (or something like it) and occurs before
any user code has ever executed. The problem of course is that certain RCWs
expect the reference clocks to be in certain (incompatible) configurations,
and will fail the boot without a lock. I think this is rather silly (since
you only need PLL lock when you actually want to use the serdes), but that's
how it is. And of course, this is only necessary because I was unable to get
major reconfiguration to work. In an ideal world, you could always boot with
the same RCW (with PLL config matching the board) and choose the major protocol
at runtime.

> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2flinux%2darm%2dkernel%2fd3163201%2d2012%2d6cf9%2dc798%2d916bab9c7f72%40seco.com%2f&umid=b7a538d4-355b-4a0a-bac8-26f0fa0a74c0&auth=d807158c60b7d2502abde8a2fc01f40662980862-e02727a0e4438c72b5e854cb20b9de5379470fd9
> Even so, I have not said anything definitive, I have just requested time
> to take your proposal at face value, and understand whether there is any
> other alternative.
> 
> I would advise you to consider whether your follow-up emails on this
> topic encourage a collaborative atmosphere.

Sorry, I was a bit standoffish, but frankly I don't think leading off with
"NACK, no feedback for you today" is particularly collaborative either.

--Sean
Vladimir Oltean April 29, 2023, 5:24 p.m. UTC | #5
On Wed, Apr 26, 2023 at 10:50:17AM -0400, Sean Anderson wrote:
> > I need to catch up with 14 rounds of patches from you and with the
> > discussions that took place on each version, and understand how you
> > responded to feedback like "don't remove PHY interrupts without finding
> > out why they don't work"
> 
> All I can say is that
> 
> - It doesn't work on my board
> - The traces are on the bottom of the PCB
> - The signal goes through an FPGA which (unlike the LS1046ARDB) is closed-source

I don't understand the distinction you are making here. Are the sources
for QIXIS bit streams public for any Layerscape board?

> - The alternative is polling once a second (not terribly intensive)

It makes a difference to performance (forwarded packets per second), believe it or not.

> 
> I think it's very reasonable to make this change. Anyway, it's in a separate
> patch so that it can be applied independently.

Perhaps better phrased: "discussed separately"...

> > Even if the SERDES and PLL drivers "work for you" in the current form,
> > I doubt the usefulness of a PLL driver if you have to disconnect the
> > SoC's reset request signal on the board to not be stuck in a reboot loop.
> 
> I would like to emphasize that this has *nothing to do with this driver*.
> This behavior is part of the boot ROM (or something like it) and occurs before
> any user code has ever executed. The problem of course is that certain RCWs
> expect the reference clocks to be in certain (incompatible) configurations,
> and will fail the boot without a lock. I think this is rather silly (since
> you only need PLL lock when you actually want to use the serdes), but that's
> how it is. And of course, this is only necessary because I was unable to get
> major reconfiguration to work. In an ideal world, you could always boot with
> the same RCW (with PLL config matching the board) and choose the major protocol
> at runtime.

Could you please tell me what are the reference clock frequencies that
your board provides at boot time to the 2 PLLs, and which SERDES
protocol out of those 2 (1133 and 3333) boots correctly (no RESET_REQ
hacks necessary) with those refclks? I will try to get a LS1046A-QDS
where I boot from the same refclk + SERDES protocol configuration as
you, and use PBI commands in the RCW to reconfigure the lanes (PLL
selection and protocol registers) for the other mode, while keeping the
FRATE_SEL of the PLLs unmodified.
Sean Anderson May 1, 2023, 3:03 p.m. UTC | #6
On 4/29/23 13:24, Vladimir Oltean wrote:
> On Wed, Apr 26, 2023 at 10:50:17AM -0400, Sean Anderson wrote:
>> > I need to catch up with 14 rounds of patches from you and with the
>> > discussions that took place on each version, and understand how you
>> > responded to feedback like "don't remove PHY interrupts without finding
>> > out why they don't work"
>> 
>> All I can say is that
>> 
>> - It doesn't work on my board
>> - The traces are on the bottom of the PCB
>> - The signal goes through an FPGA which (unlike the LS1046ARDB) is closed-source
> 
> I don't understand the distinction you are making here. Are the sources
> for QIXIS bit streams public for any Layerscape board?

Correct. The sources for the LS1046ARDB QIXIS are available for download.

>> - The alternative is polling once a second (not terribly intensive)
> 
> It makes a difference to performance (forwarded packets per second), believe it or not.

I don't. Please elaborate how link status latency from the phy affects performance.

>> 
>> I think it's very reasonable to make this change. Anyway, it's in a separate
>> patch so that it can be applied independently.
> 
> Perhaps better phrased: "discussed separately"...
> 
>> > Even if the SERDES and PLL drivers "work for you" in the current form,
>> > I doubt the usefulness of a PLL driver if you have to disconnect the
>> > SoC's reset request signal on the board to not be stuck in a reboot loop.
>> 
>> I would like to emphasize that this has *nothing to do with this driver*.
>> This behavior is part of the boot ROM (or something like it) and occurs before
>> any user code has ever executed. The problem of course is that certain RCWs
>> expect the reference clocks to be in certain (incompatible) configurations,
>> and will fail the boot without a lock. I think this is rather silly (since
>> you only need PLL lock when you actually want to use the serdes), but that's
>> how it is. And of course, this is only necessary because I was unable to get
>> major reconfiguration to work. In an ideal world, you could always boot with
>> the same RCW (with PLL config matching the board) and choose the major protocol
>> at runtime.
> 
> Could you please tell me what are the reference clock frequencies that
> your board provides at boot time to the 2 PLLs, and which SERDES
> protocol out of those 2 (1133 and 3333) boots correctly (no RESET_REQ
> hacks necessary) with those refclks? I will try to get a LS1046A-QDS
> where I boot from the same refclk + SERDES protocol configuration as
> you, and use PBI commands in the RCW to reconfigure the lanes (PLL
> selection and protocol registers) for the other mode, while keeping the
> FRATE_SEL of the PLLs unmodified.

 From table 31-1 in the RM, the PLL mapping for 1133 is 2211, and the
 PLL mapping for 3333 is 2222. As a consequence, for 1133, PLL 2 must be
 156.25 MHz and PLL 1 must be either 100 or 125 MHz. And for 3333, PLL 2
 must be either 100 or 125 MHz, and PLL 1 should be shut down (as it is
 unused). This conflict for PLL 2 means that the same reference clock
 configuration cannot work for both 1133 and 3333. In one of the
 configurations, SRDS_RST_RR will be set in RSTRQSR1. On our board,
 reference clock 1 is 156.25 MHz, and reference clock 2 is 125 MHz.
 Therefore, 3333 will fail to boot. Unfortunately, this reset request
 occurs before any user-configurable code has run (except the RCW), so
 it is not possible to fix this issue with e.g. PBI.

 --Sean
 not
Sean Anderson May 22, 2023, 2:42 p.m. UTC | #7
Hi Vladmir,

On 5/1/23 11:03, Sean Anderson wrote:
> On 4/29/23 13:24, Vladimir Oltean wrote:
>> On Wed, Apr 26, 2023 at 10:50:17AM -0400, Sean Anderson wrote:
>>> > I need to catch up with 14 rounds of patches from you and with the
>>> > discussions that took place on each version, and understand how you
>>> > responded to feedback like "don't remove PHY interrupts without finding
>>> > out why they don't work"
>>> 
>>> All I can say is that
>>> 
>>> - It doesn't work on my board
>>> - The traces are on the bottom of the PCB
>>> - The signal goes through an FPGA which (unlike the LS1046ARDB) is closed-source
>> 
>> I don't understand the distinction you are making here. Are the sources
>> for QIXIS bit streams public for any Layerscape board?
> 
> Correct. The sources for the LS1046ARDB QIXIS are available for download.
> 
>>> - The alternative is polling once a second (not terribly intensive)
>> 
>> It makes a difference to performance (forwarded packets per second), believe it or not.
> 
> I don't. Please elaborate how link status latency from the phy affects performance.
> 
>>> 
>>> I think it's very reasonable to make this change. Anyway, it's in a separate
>>> patch so that it can be applied independently.
>> 
>> Perhaps better phrased: "discussed separately"...
>> 
>>> > Even if the SERDES and PLL drivers "work for you" in the current form,
>>> > I doubt the usefulness of a PLL driver if you have to disconnect the
>>> > SoC's reset request signal on the board to not be stuck in a reboot loop.
>>> 
>>> I would like to emphasize that this has *nothing to do with this driver*.
>>> This behavior is part of the boot ROM (or something like it) and occurs before
>>> any user code has ever executed. The problem of course is that certain RCWs
>>> expect the reference clocks to be in certain (incompatible) configurations,
>>> and will fail the boot without a lock. I think this is rather silly (since
>>> you only need PLL lock when you actually want to use the serdes), but that's
>>> how it is. And of course, this is only necessary because I was unable to get
>>> major reconfiguration to work. In an ideal world, you could always boot with
>>> the same RCW (with PLL config matching the board) and choose the major protocol
>>> at runtime.
>> 
>> Could you please tell me what are the reference clock frequencies that
>> your board provides at boot time to the 2 PLLs, and which SERDES
>> protocol out of those 2 (1133 and 3333) boots correctly (no RESET_REQ
>> hacks necessary) with those refclks? I will try to get a LS1046A-QDS
>> where I boot from the same refclk + SERDES protocol configuration as
>> you, and use PBI commands in the RCW to reconfigure the lanes (PLL
>> selection and protocol registers) for the other mode, while keeping the
>> FRATE_SEL of the PLLs unmodified.
> 
>  From table 31-1 in the RM, the PLL mapping for 1133 is 2211, and the
>  PLL mapping for 3333 is 2222. As a consequence, for 1133, PLL 2 must be
>  156.25 MHz and PLL 1 must be either 100 or 125 MHz. And for 3333, PLL 2
>  must be either 100 or 125 MHz, and PLL 1 should be shut down (as it is
>  unused). This conflict for PLL 2 means that the same reference clock
>  configuration cannot work for both 1133 and 3333. In one of the
>  configurations, SRDS_RST_RR will be set in RSTRQSR1. On our board,
>  reference clock 1 is 156.25 MHz, and reference clock 2 is 125 MHz.
>  Therefore, 3333 will fail to boot. Unfortunately, this reset request
>  occurs before any user-configurable code has run (except the RCW), so
>  it is not possible to fix this issue with e.g. PBI.

Have you had a chance to review this driver?

--Sean
Vladimir Oltean May 22, 2023, 3 p.m. UTC | #8
On Mon, May 22, 2023 at 10:42:04AM -0400, Sean Anderson wrote:
> Have you had a chance to review this driver?

Partially / too little (and no, I don't have an answer yet). I am
debugging a SERDES protocol change procedure from XFI to SGMII.
Sean Anderson June 9, 2023, 7:19 p.m. UTC | #9
On 5/22/23 11:00, Vladimir Oltean wrote:
> On Mon, May 22, 2023 at 10:42:04AM -0400, Sean Anderson wrote:
>> Have you had a chance to review this driver?
> 
> Partially / too little (and no, I don't have an answer yet). I am
> debugging a SERDES protocol change procedure from XFI to SGMII.

I'd just like to reiterate that, like I said in the cover letter, I
believe this driver still has value even if it cannot yet perform
protocol switching.

Please send me your feedback, and I will try and incorporate it into the
next revision. Previously, you said you had major objections to the
contents of this series, but you still have not listed them.

--Sean
Vladimir Oltean June 10, 2023, 10:21 p.m. UTC | #10
Hello Sean,

On Fri, Jun 09, 2023 at 03:19:22PM -0400, Sean Anderson wrote:
> On 5/22/23 11:00, Vladimir Oltean wrote:
> > On Mon, May 22, 2023 at 10:42:04AM -0400, Sean Anderson wrote:
> >> Have you had a chance to review this driver?
> > 
> > Partially / too little (and no, I don't have an answer yet). I am
> > debugging a SERDES protocol change procedure from XFI to SGMII.
> 
> I'd just like to reiterate that, like I said in the cover letter, I
> believe this driver still has value even if it cannot yet perform
> protocol switching.
> 
> Please send me your feedback, and I will try and incorporate it into the
> next revision. Previously, you said you had major objections to the
> contents of this series, but you still have not listed them.

And if SERDES protocol switching was not physically possible, would this
patch set still have value?
Vladimir Oltean June 12, 2023, 4:33 p.m. UTC | #11
On Mon, Jun 12, 2023 at 10:35:21AM -0400, Sean Anderson wrote:
> > And if SERDES protocol switching was not physically possible, would this
> > patch set still have value?
> 
> Yes. To e.g. set up SGMII25 or to fix the clocking situation.

Let me analyze the reasons one by one.

The clocking situation only exists due to a hope that protocol changing
would be possible. If you were sure it wasn't possible, your board would
have had refclks set up for protocol 3333 via the supported PLL mapping 2222.
In essence, I consider this almost a non-argument, as it fixes a
self-inflicted problem.

Have you actually tested changing individual lanes from SGMII to SGMII 2.5?
I know it works on LS1028A, but I don't know if that's also true on LS1046A.
Your comments do say "LYNX_PROTO_SGMII25, /* Not tested */".

Assuming a start from SERDES protocol 3333 with PLL mapping 2222,
this protocol change implies powering on PLL 1 (reset state machine
wants it to be disabled) and moving those lanes from PLL 2 to PLL 1.

At first sight you might appear to have a point related to the fact that
PLL register writes are necessary, and thus this whole shebang is necessary.
But this can all be done using PBI commands, with the added benefit that
U-Boot can also use those SERDES networking ports, and not just Linux.
You can use the RCW+PBL specific to your board to inform the SoC that
your platform's refclk 1 is 156 MHz (something which the reset state
machine seems unable to learn, with protocol 0x3333). You don't have to
put that in the device tree. You don't have to push code to any open
source project to expose your platform specific details. Then, just like
in the case of the Lynx 28G driver on LX2160, the SERDES driver could
just treat the PLL configuration as read-only, which would greatly
simplify things and eliminate the need for a clk driver.

Here is an illustrative example (sorry, I don't have a board with the
right refclk on that PLL, to verify all the way):

Add this to ./serdes_10g.rcw in the root of the qoriq-rcw repository:

/*
 * Registers for the Lynx 10G SERDES block.
 *
 * Must be included by an SoC-specific header that defines the
 * SRDS_BASE value.
 */

#define PLLnRSTCTL(n)		(SRDS_BASE + (0x20 * (n)))
#define PLLnCR0(n)		(SRDS_BASE + (0x20 * (n)) + 0x0004)
#define  POFF(x)		(((x) << 31) & 0x80000000)
#define  REFCLK_SEL(x)		(((x) << 28) & 0x70000000)
#define  REFCLK_EN(x)		(((x) << 27) & 0x08000000)
#define  FRATE_SEL(x)		(((x) << 16) & 0x000f0000)
#define  DLYDIV_SEL(x)		((x) & 0x00000003)
#define PCCR8			(SRDS_BASE + 0x0220)
#define  SGMIIA_KX(x)		(((x) << 31) & 0x80000000)
#define  SGMIIA_CFG(x)		(((x) << 28) & 0x70000000)
#define  SGMIIB_KX(x)		(((x) << 27) & 0x08000000)
#define  SGMIIB_CFG(x)		(((x) << 24) & 0x07000000)
#define  SGMIIC_KX(x)		(((x) << 23) & 0x00800000)
#define  SGMIIC_CFG(x)		(((x) << 20) & 0x00700000)
#define  SGMIID_KX(x)		(((x) << 19) & 0x00080000)
#define  SGMIID_CFG(x)		(((x) << 16) & 0x00070000)
#define  SGMIIE_KX(x)		(((x) << 15) & 0x00008000)
#define  SGMIIE_CFG(x)		(((x) << 12) & 0x00007000)
#define  SGMIIF_KX(x)		(((x) << 11) & 0x00000800)
#define  SGMIIF_CFG(x)		(((x) << 8) & 0x00000700)
#define  SGMIIG_KX(x)		(((x) << 7) & 0x00000080)
#define  SGMIIG_CFG(x)		(((x) << 4) & 0x00000070)
#define  SGMIIH_KX(x)		(((x) << 3) & 0x00000008)
#define  SGMIIH_CFG(x)		((x) & 0x00000007)
#define PCCRB			(SRDS_BASE + 0x022c)
#define  XFIA_CFG(x)		(((x) << 28) & 0x70000000)
#define  XFIB_CFG(x)		(((x) << 24) & 0x07000000)
#define  XFIC_CFG(x)		(((x) << 20) & 0x00700000)
#define  XFID_CFG(x)		(((x) << 16) & 0x00070000)
#define  XFIE_CFG(x)		(((x) << 12) & 0x00007000)
#define  XFIF_CFG(x)		(((x) << 8) & 0x00000700)
#define  XFIG_CFG(x)		(((x) << 4) & 0x00000070)
#define  XFIH_CFG(x)		((x) & 0x00000007)
#define LNmGCR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0800)
#define  RPLL_LES(x)		(((x) << 31) & 0x80000000)
#define  RRAT_SEL(x)		(((x) << 28) & 0x30000000)
#define  TPLL_LES(x)		(((x) << 27) & 0x08000000)
#define  TRAT_SEL(x)		(((x) << 24) & 0x03000000)
#define  RRST_B(x)		(((x) << 22) & 0x00400000)
#define  TRST_B(x)		(((x) << 21) & 0x00200000)
#define  RX_PD(x)		(((x) << 20) & 0x00100000)
#define  TX_PD(x)		(((x) << 19) & 0x00080000)
#define  IF20BIT_EN(x)		(((x) << 18) & 0x00040000)
#define  FIRST_LANE(x)		(((x) << 16) & 0x00010000)
#define  GCR0_RSV		0x1000
#define  PROTS(x)		(((x) << 7) & 0x00000f80)
#define LNmGCR1(m)		(SRDS_BASE + (0x40 * (m)) + 0x0804)
#define  RDAT_INV(x)		(((x) << 31) & 0x80000000)
#define  TDAT_INV(x)		(((x) << 30) & 0x40000000)
#define  OPAD_CTL(x)		(((x) << 26) & 0x04000000)
#define  REIDL_TH(x)		(((x) << 20) & 0x00700000)
#define  REIDL_EX_SEL(x)	(((x) << 18) & 0x000C0000)
#define  REIDL_ET_SEL(x)	(((x) << 16) & 0x00030000)
#define  REIDL_EX_MSB(x)	(((x) << 15) & 0x00008000)
#define  REIDL_ET_MSB(x)	(((x) << 14) & 0x00004000)
#define  REQ_CTL_SNP(x)		(((x) << 13) & 0x00002000)
#define  REQ_CDR_SNP(x)		(((x) << 12) & 0x00001000)
#define  TRSTDIR(x)		(((x) << 7) & 0x00000080)
#define  REQ_BIN_SNP(x)		(((x) << 6) & 0x00000040)
#define  ISLEW_RCTL(x)		(((x) << 4) & 0x00000030)
#define  GCR1_RSV		0x8
#define  OSLEW_RCTL(x)		((x) & 0x3)
#define LNmRECR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0810)
#define  RXEQ_BST(x)		(((x) << 28) & 0x10000000)
#define  GK2OVD(x)		(((x) << 24) & 0x0f000000)
#define  GK3OVD(x)		(((x) << 16) & 0x000f0000)
#define  GK2OVD_EN(x)		(((x) << 15) & 0x00008000)
#define  GK3OVD_EN(x)		(((x) << 14) & 0x00004000)
#define  OSETOVD_EN(x)		(((x) << 13) & 0x00002000)
#define  BASE_WAND(x)		(((x) << 10) & 0x00000c00)
#define  OSETOVD(x)		((x) & 0x0000007F)
#define LNmTECR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0818)
#define  TEQ_TYPE(x)		(((x) << 28) & 0x30000000)
#define  SGN_PREQ(x)		(((x) << 26) & 0x04000000)
#define  RATIO_PREQ(x)		(((x) << 22) & 0x03C00000)
#define  SGN_POST1Q(x)		(((x) << 21) & 0x00200000)
#define  RATIO_PST1Q(x)		(((x) << 16) & 0x001F0000)
#define  ADPT_EQ(x)		(((x) << 8) & 0x00003F00)
#define  AMP_RED(x)		((x) & 0x0000003f)
#define LNmTTLCR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0820)
#define LNmTCSR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0830)
#define LNmTCSR1(m)		(SRDS_BASE + (0x40 * (m)) + 0x0834)
#define LNmTCSR2(m)		(SRDS_BASE + (0x40 * (m)) + 0x0838)
#define LNmTCSR3(m)		(SRDS_BASE + (0x40 * (m)) + 0x083c)
#define SGMIIaCR1(a)		(SRDS_BASE + (0x10 * (a)) + 0x1804)
#define  SGMII_MDEV_PORT(x)	(((x) << 27) & 0xf8000000)
#define  SGPCS_EN(x)		(((x) << 11) & 0x00000800)
#define XFIaCR1(a)		(SRDS_BASE + (0x10 * (a)) + 0x1984)
#define  XFI_MDEV_PORT(x)	(((x) << 27) & 0xf8000000)

Add this to ./ls1046ardb/serdes_pll1_power_on.rcw in the same repo
(and finish the writes):

/*
 * Assuming that the SoC starts from SERDES1 protocol 0x3333, power on PLL 1,
 * which is required by the reset state machine to be disabled.
 */

#define SRDS_BASE		0xea0000 /* SERDES 1 */
#include <../serdes_10g.rcw>

.pbi

write PLLnCR0(0), POFF(0) | REFCLK_SEL(2) | REFCLK_EN(0)
wait 2500

write PLLnRSTCTL(0), ...

.end

Add this at the end of your board RCW:

#include <../ls1046ardb/serdes_pll1_power_on.rcw>


In short, I believe the reasons you have cited do not justify the
complexity of the solution that you propose.
Sean Anderson June 12, 2023, 8:46 p.m. UTC | #12
On 6/12/23 12:33, Vladimir Oltean wrote:
> On Mon, Jun 12, 2023 at 10:35:21AM -0400, Sean Anderson wrote:
>> > And if SERDES protocol switching was not physically possible, would this
>> > patch set still have value?

More to the point, did you make any progress in this area?

>> Yes. To e.g. set up SGMII25 or to fix the clocking situation.
> 
> Let me analyze the reasons one by one.
> 
> The clocking situation only exists due to a hope that protocol changing
> would be possible. If you were sure it wasn't possible, your board would
> have had refclks set up for protocol 3333 via the supported PLL mapping 2222.
> In essence, I consider this almost a non-argument, as it fixes a
> self-inflicted problem.

2222 also does not work, as outlined above.

The clocking is incompatible, and must be fixed by software.

The only thing self-inflicted is NXP's conflicting design.

> Have you actually tested changing individual lanes from SGMII to SGMII 2.5?
> I know it works on LS1028A, but I don't know if that's also true on LS1046A.
> Your comments do say "LYNX_PROTO_SGMII25, /* Not tested */".

Not yet. 

This driver would also be a good place to add the KR link training with
NXP tried to upstream a few years ago.

> Assuming a start from SERDES protocol 3333 with PLL mapping 2222,
> this protocol change implies powering on PLL 1 (reset state machine
> wants it to be disabled) and moving those lanes from PLL 2 to PLL 1.
> 
> At first sight you might appear to have a point related to the fact that
> PLL register writes are necessary, and thus this whole shebang is necessary.
> But this can all be done using PBI commands, with the added benefit that
> U-Boot can also use those SERDES networking ports, and not just Linux.
> You can use the RCW+PBL specific to your board to inform the SoC that
> your platform's refclk 1 is 156 MHz (something which the reset state
> machine seems unable to learn, with protocol 0x3333). You don't have to
> put that in the device tree. You don't have to push code to any open
> source project to expose your platform specific details. Then, just like
> in the case of the Lynx 28G driver on LX2160, the SERDES driver could
> just treat the PLL configuration as read-only, which would greatly
> simplify things and eliminate the need for a clk driver.
> 
> Here is an illustrative example (sorry, I don't have a board with the
> right refclk on that PLL, to verify all the way):
> 
> ... snip ...

(which of course complicates the process of building the PBIs...)

> In short, I believe the reasons you have cited do not justify the
> complexity of the solution that you propose.

The work is done, and it is certainly more flexible than what you
propose. E.g. imagine a clock which needs to be configured before it has
the correct frequency. Such a thing is difficult to do in a PBI solution,
but is trivial in Linux.

--Sean
Vladimir Oltean June 13, 2023, 2:27 p.m. UTC | #13
On Mon, Jun 12, 2023 at 04:46:16PM -0400, Sean Anderson wrote:
> On 6/12/23 12:33, Vladimir Oltean wrote:
> > On Mon, Jun 12, 2023 at 10:35:21AM -0400, Sean Anderson wrote:
> >> > And if SERDES protocol switching was not physically possible, would this
> >> > patch set still have value?
> 
> More to the point, did you make any progress in this area?

Aha. It is two email exchanges later that you've thought of asking that.

No, I haven't made progress. Though I managed to get some extra pointers
and things to try out, and at this stage I am the limiting factor. I will
let you know once the investigation has reached a conclusion.

So many things depend on whether major SERDES protocol reconfiguration
is possible, that if I were you, my question would be just that, instead
of the equivalent of "Vladimir, when can my driver be merged ???".
I understand how incredibly frustrating it is that the responses are
coming in so slowly, and this is the reason why I am responding at all
to your pings even if I have essentially nothing new to say.

And it's not that you don't care whether protocol reconfiguration is
possible or not - you do, but at the same time you're overcome with this
strong attachment to your solution just because it is yours, that I
don't know how to deal with.

> 2222 also does not work, as outlined above.
> 
> The clocking is incompatible, and must be fixed by software.
> 
> The only thing self-inflicted is NXP's conflicting design.

The way things are supposed to work (*if* this works at all) is that the
reset state machine starts with a supported PLL / refclk configuration
that permits a certain subset of protocols, and the SERDES protocols can
be changed from the reset state, as desired, for the individual lanes.

What is self-inflicted is that the refclks on your board design are not
supportable by any reset state machine configuration, and wiring them
that way was a conscious decision. Did your company's board designers
receive the recommendation to disconnect RESET_REQ from NXP, and has NXP
said that the end result will be something that continues to be supportable?
I've searched the customer support database and the answer seems to be no.
In any case, if you have to disconnect RESET_REQ from the SoC to make
the driver in this form useful, then... yeah, no. You're obviously free
to do whatever you want with your products, but that's not how mainline
Linux works, it needs to be useful beyond you.

If protocol switching works, I will maintain that your board should have
wired the refclks to PLLs the other way around (which is supported by
the RCW), and then proceed to do protocol switching to reach the desired
configuration.

So you can see that many things change based on whether protocol reconfig
is possible or not. The patch set presented here looks exactly like the
product of someone infatuated with form over substance, and who misses
the overall picture. My belief is that if you shift your attitude from
"is the work done?" to "is the work generally useful?", the interactions
will improve by a lot.

> > Have you actually tested changing individual lanes from SGMII to SGMII 2.5?
> > I know it works on LS1028A, but I don't know if that's also true on LS1046A.
> > Your comments do say "LYNX_PROTO_SGMII25, /* Not tested */".
> 
> Not yet. 

Aha. So concretely, what makes you so confident with moving this forward?
You like the code to just be there, no?

It helps your board, I understand. But not everything has to stay in
mainline if it's not useful for others, and I've proposed an alternative
solution which I'm not sure how seriously you took.

> This driver would also be a good place to add the KR link training with
> NXP tried to upstream a few years ago.

Well, speaking as someone who is now also tasked with the copper backplane
support, believe me that I know that, and this is why I'm so desperate
with the logic you're trying to push forward. It's clear that we should
try to collaborate rather than try to push individualistic non-tested
non-solutions.

> > Assuming a start from SERDES protocol 3333 with PLL mapping 2222,
> > this protocol change implies powering on PLL 1 (reset state machine
> > wants it to be disabled) and moving those lanes from PLL 2 to PLL 1.
> > 
> > At first sight you might appear to have a point related to the fact that
> > PLL register writes are necessary, and thus this whole shebang is necessary.
> > But this can all be done using PBI commands, with the added benefit that
> > U-Boot can also use those SERDES networking ports, and not just Linux.
> > You can use the RCW+PBL specific to your board to inform the SoC that
> > your platform's refclk 1 is 156 MHz (something which the reset state
> > machine seems unable to learn, with protocol 0x3333). You don't have to
> > put that in the device tree. You don't have to push code to any open
> > source project to expose your platform specific details. Then, just like
> > in the case of the Lynx 28G driver on LX2160, the SERDES driver could
> > just treat the PLL configuration as read-only, which would greatly
> > simplify things and eliminate the need for a clk driver.
> > 
> > Here is an illustrative example (sorry, I don't have a board with the
> > right refclk on that PLL, to verify all the way):
> > 
> > ... snip ...
> 
> (which of course complicates the process of building the PBIs...)

Maybe this is the language barrier, but what are you trying to say here?

> > In short, I believe the reasons you have cited do not justify the
> > complexity of the solution that you propose.
> 
> The work is done,

:)

> and it is certainly more flexible than what you propose. E.g. imagine
> a clock which needs to be configured before it has the correct
> frequency. Such a thing is difficult to do in a PBI solution, but is
> trivial in Linux.

So "a clock which needs to be configured" means a programmable clock
generator that provides the refclk to the PLL, correct?

With QorIQ/Layerscape SoCs, I imagine that case would typically be done
by a board CPLD during the boot sequence, because if the PLL refclks
don't have the right frequency, the main SoC doesn't start. You've hacked
the RESET_REQ signal so you don't have that problem. But surprise -
others haven't - so it's not exactly a relevant argument in favor.

It's clear that the hardware has limitations that have impact upon the
way in which both boards and software is designed. Because you don't
know whether protocol reconfig is possible, you don't know *if* it is
possible, *under what circumstances* it will be possible. What if XFI
only works with 156 MHz provided to PLL 2, and not to PLL 1? You're
trying to push a non-final solution as a final solution.

The programmable refclk generator that is reconfigured at runtime is
another can of worms, which I'm not sure can be feasibly implemented on
this hardware given the fact that multiple lanes share one PLL, and
there are many more lanes (4 or 8) than PLLs (2). Again, I need to see
some practical functionality that this patch set provides.

Do you _actually_ have a programmable clock generator that the CPLD
can't program, or is this a "what if" situation?

More to the point of the "flexibility" that you claim. You haven't
addressed my point of networking over the SERDES ports in U-Boot.
You don't need that - I'm clear with that, otherwise we'd be having this
conversation on the U-Boot mailing lists - but what if others, wanting
to follow your solution, would? Would you see that as an opportunity to
write even more code?
Vladimir Oltean Aug. 10, 2023, 10:26 a.m. UTC | #14
Hi Sean,

On Tue, Jun 13, 2023 at 05:27:54PM +0300, Vladimir Oltean wrote:
> The way things are supposed to work (*if* this works at all) is that the
> reset state machine starts with a supported PLL / refclk configuration
> that permits a certain subset of protocols, and the SERDES protocols can
> be changed from the reset state, as desired, for the individual lanes.
> 
> What is self-inflicted is that the refclks on your board design are not
> supportable by any reset state machine configuration, and wiring them
> that way was a conscious decision. Did your company's board designers
> receive the recommendation to disconnect RESET_REQ from NXP, and has NXP
> said that the end result will be something that continues to be supportable?
> I've searched the customer support database and the answer seems to be no.
> In any case, if you have to disconnect RESET_REQ from the SoC to make
> the driver in this form useful, then... yeah, no. You're obviously free
> to do whatever you want with your products, but that's not how mainline
> Linux works, it needs to be useful beyond you.
> 
> If protocol switching works, I will maintain that your board should have
> wired the refclks to PLLs the other way around (which is supported by
> the RCW), and then proceed to do protocol switching to reach the desired
> configuration.

It was quite a journey to piece everything together.

There is one thing to be mentioned right from the start, and that is
that on some SoCs (including the LS1043A and LS1046A), the SerDes data
path is partially controlled by the RCW, and thus, dynamically performing
a major SerDes protocol reconfiguration requires a RCW override procedure
(undocumented in the SerDes reconfiguration steps, but all the info you
need is summarized below).

The DCFG block has a set of RCWSR0 - RCWSR15 read-only status registers
relative to DCFG_CCSR. What we don't document in the SoC RM, but I got
permission to say here, is that the DCFG block exposes a second set of
Expert Mode registers in the DCSR address space (base address 2000_0000h).
The DCFG_DCSR register region spans from offset 2_0000h to 2_05AC into
the DCSR base address. At the exact same offsets into DCFG_DCSR as
RCWSR0 - RCWSR15 are into DCFG_CCSR (aka 100h-13ch), one can find the
shadow RCWCR0 - RCWCR15 (Reset Control Word Control Register) registers
which are also writable. There is a one-to-one mapping between each
register (and field) in RCWSR and RCWCR, so I won't detail the
contents of the RCWCR registers, because we document RCWSR fully.

RCW override means modifying some of the RCWCR registers. In this
particular case, finalizing the major SerDes reconfiguration requires
overriding SRDS_PRTCL_S1 in RCWCR5 with the new per-lane settings, to mux
the correct PCS to the MAC. In the general case, random RCW overrides
that don't come from the hardware validation team are unsupported by
NXP, and you should expect the procedure to yield unpredictable results.

I don't know how much of the above steps is applicable to the other 10G Lynx
SoCs (LS1088, LS2088 etc). Not all SoCs may require RCW override, and
the RCW override procedure may not be the same. That is TBD, and we'd
appreciate if support for other SoCs than the LS1046 to be added no
earlier than when we have a validated SerDes reconfiguration procedure.

I believe this is enough information to permit the creation of a Linux
driver on the DCFG_DCSR registers which permits RCW override at runtime.
It seems this will be necessary at least on LS1046.

We should discuss who and when picks up your patches, removes the
unsupported, untested and unnecessary code and adds the RCW override
procedure. It can be you, it can also be someone from NXP. If it will be
you, please let me know, because there are concrete implementation
choices I want to leave comments on. There is also the previous
observation from Ioana that you should not delete PHY interrupts without
finding out why they don't work.

Only what was thoroughly tested and is based on a hardware validated
procedure should be submitted. This means only a 1G <-> 10G transition
on LS1046 for now.

Nonetheless, below is a functional example of how NXP would recommend
you to achieve the desired PLL mapping for any RCW-based SerDes protocol.
My testing platform was the LS1046A-QDS with PLL1 at 100 MHz and PLL2 at
156.25 MHz. I believe that this should eliminate the need for a clk
driver for the PLLs, and should make your Ethernet lanes usable much
earlier than Linux. That being said, our position at NXP is that you
don't need a clk driver for the PLLs, and I would like to see the clk
portion removed from future patch revisions.

This patch is on top of https://github.com/nxp-qoriq/rcw/tree/lf-6.1.22-2.0.0
Sean Anderson Aug. 10, 2023, 7:58 p.m. UTC | #15
Hi Vladimir,

On 8/10/23 06:26, Vladimir Oltean wrote:
> Hi Sean,
> 
> On Tue, Jun 13, 2023 at 05:27:54PM +0300, Vladimir Oltean wrote:
>> The way things are supposed to work (*if* this works at all) is that the
>> reset state machine starts with a supported PLL / refclk configuration
>> that permits a certain subset of protocols, and the SERDES protocols can
>> be changed from the reset state, as desired, for the individual lanes.
>> 
>> What is self-inflicted is that the refclks on your board design are not
>> supportable by any reset state machine configuration, and wiring them
>> that way was a conscious decision. Did your company's board designers
>> receive the recommendation to disconnect RESET_REQ from NXP, and has NXP
>> said that the end result will be something that continues to be supportable?
>> I've searched the customer support database and the answer seems to be no.
>> In any case, if you have to disconnect RESET_REQ from the SoC to make
>> the driver in this form useful, then... yeah, no. You're obviously free
>> to do whatever you want with your products, but that's not how mainline
>> Linux works, it needs to be useful beyond you.

As explained previously (and noted by yourself below) 1G and 10G RCWs
have mutally-incompatible clocking requirements. Now that you have
documented an alternate solution, it is possible to boot up with one RCW
and switch to another. But without that it was not possible to have one
board support both RCWs (without e.g. a microcontroller or FPGA to
configure the clock generator before releasing the processor reset). I
do not think that the silicon should assert the reset request line if
the serdes doesn't lock, but it does and it can't really be disabled.

As it happens, our board is set up so that the reference clocks are
configured for 10G by default. During this boot, reset request is never
requested. If we did not have to support software configuration of the
serdes speed (to support 1G SFPs) we would not have to disconnect reset
request.

That said, I have evaluated the various reasons that reset request can
be asserted, and I have determined that for our product they are not
necessary. The only major limitation is the lack of a watchdog, but that
was not a requirement for us. Because of this, using a GPIO for reset is
sufficient and neatly avoids the issue.

We did not see the need to contact NXP support because we internally
came up with a reliable solution. I suspect that they would not have
suggested the solution you present below, but if you think otherwise I
will try them in the future :)

I would appreciate if you are not so derisive in your comments. I do not
like reading your emails because they are very abrasive.

>> If protocol switching works, I will maintain that your board should have
>> wired the refclks to PLLs the other way around (which is supported by
>> the RCW), and then proceed to do protocol switching to reach the desired
>> configuration.
> 
> It was quite a journey to piece everything together.
> 
> There is one thing to be mentioned right from the start, and that is
> that on some SoCs (including the LS1043A and LS1046A), the SerDes data
> path is partially controlled by the RCW, and thus, dynamically performing
> a major SerDes protocol reconfiguration requires a RCW override procedure
> (undocumented in the SerDes reconfiguration steps, but all the info you
> need is summarized below).
> 
> The DCFG block has a set of RCWSR0 - RCWSR15 read-only status registers
> relative to DCFG_CCSR. What we don't document in the SoC RM, but I got
> permission to say here, is that the DCFG block exposes a second set of
> Expert Mode registers in the DCSR address space (base address 2000_0000h).
> The DCFG_DCSR register region spans from offset 2_0000h to 2_05AC into
> the DCSR base address. At the exact same offsets into DCFG_DCSR as
> RCWSR0 - RCWSR15 are into DCFG_CCSR (aka 100h-13ch), one can find the
> shadow RCWCR0 - RCWCR15 (Reset Control Word Control Register) registers
> which are also writable. There is a one-to-one mapping between each
> register (and field) in RCWSR and RCWCR, so I won't detail the
> contents of the RCWCR registers, because we document RCWSR fully.
>
> RCW override means modifying some of the RCWCR registers. In this
> particular case, finalizing the major SerDes reconfiguration requires
> overriding SRDS_PRTCL_S1 in RCWCR5 with the new per-lane settings, to mux
> the correct PCS to the MAC. In the general case, random RCW overrides
> that don't come from the hardware validation team are unsupported by
> NXP, and you should expect the procedure to yield unpredictable results.

Good to see these finally documented (in some form or other). Perhaps
these could also be used to enable e.g. a UART in the hard-coded RCW
without requiring e.g. JTAG...

> I don't know how much of the above steps is applicable to the other 10G Lynx
> SoCs (LS1088, LS2088 etc). Not all SoCs may require RCW override, and
> the RCW override procedure may not be the same. That is TBD, and we'd
> appreciate if support for other SoCs than the LS1046 to be added no
> earlier than when we have a validated SerDes reconfiguration procedure.
> 
> I believe this is enough information to permit the creation of a Linux
> driver on the DCFG_DCSR registers which permits RCW override at runtime.
> It seems this will be necessary at least on LS1046.
> 
> We should discuss who and when picks up your patches, removes the
> unsupported, untested and unnecessary code and adds the RCW override
> procedure.

Well, while I don't agree with your characterization (since this code
was indeed tested by me and was indeed necessary), I am glad to see that
you think there is a path forward.

> It can be you, it can also be someone from NXP. If it will be
> you, please let me know, because there are concrete implementation
> choices I want to leave comments on.

I can look into doing this. It will be in my free time, so it will
likely be a bit before I can update this series.

> There is also the previous observation from Ioana that you should not
> delete PHY interrupts without finding out why they don't work.

Well, if you have a better solution, please let me know. The interrupt
does not work in real hardware.

I was hampered in my efforts to determine the cause because the interrupt
passes through an FPGA to which I lack the HDL. So far, I have not seen
any argument against polling except that we do not understand the
problem yet. However, I have not seen any other analysis of the problem
either.

> Only what was thoroughly tested and is based on a hardware validated
> procedure should be submitted. This means only a 1G <-> 10G transition
> on LS1046 for now.
> 
> Nonetheless, below is a functional example of how NXP would recommend
> you to achieve the desired PLL mapping for any RCW-based SerDes protocol.
> My testing platform was the LS1046A-QDS with PLL1 at 100 MHz and PLL2 at
> 156.25 MHz. I believe that this should eliminate the need for a clk
> driver for the PLLs, and should make your Ethernet lanes usable much
> earlier than Linux. That being said, our position at NXP is that you
> don't need a clk driver for the PLLs, and I would like to see the clk
> portion removed from future patch revisions.

I have not had any issues with clocking. This is actually one of the
areas where the reference manual is sufficient to create a working
driver. Adding flexibility here is very useful, because we can solve
hardware problems in software. This can reduce e.g. board respins, and
allow for more interesting clocking solutions (such as allowing clock
generators which must be configured before use).

> This patch is on top of https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2fgithub.com%2fnxp%2dqoriq%2frcw%2ftree%2flf%2d6.1.22%2d2.0.0&umid=9173333f-cda0-48a2-a652-ccb5b892d77e&auth=d807158c60b7d2502abde8a2fc01f40662980862-d3ec43c9e78658dbb409cef29887d136eca21c6a
> 
> From 9f90d6805883f23a898f9d66826f89b7ba73afe3 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vladimir.oltean@nxp.com>
> Date: Thu, 25 May 2023 11:23:41 +0300
> Subject: [PATCH] LS1046A: implement a PBI-based SerDes protocol switching
>  mechanism
> 
> The LS1046A reset state machine is a bit limited in the SRDS_PRTCL_S1
> protocol combinations that it offers. For example, it offers protocol
> 0x3333 (4x SGMII) only with PLL mapping 2222, and that is good as long
> as dynamic protocol switching between 1G (SGMII) and 10G (XFI) isn't
> desired at runtime.
> 
> If that is taken into account as an additional constraint, then we need
> PLL2 (yes, specifically 2) to have a refclk of 156.25 MHz, and that is
> in conflict with the PLL mapping of the aforementioned 0x3333, because
> 1G (SGMII) can only work with a PLL refclk of 100 or 125 MHz, so that's
> what has to be fed into PLL 2.
> 
> Dynamic frequency switching of PLLs is hard and out of question.
> 
> It is desirable to be able to use PLL2 for the 156.25 MHz refclk
> frequency (for the 10G link modes), and PLL1 for the 100 MHz refclk
> frequency (for the 1G link modes). It turns out, this is possible, even
> if not 100% natively through the reset state machine built-in options.
> 
> The strategy is to pick a pair of SerDes refclk frequencies in the RCW
> that is correct for the given board (thus allowing the SerDes PLLs to
> lock, and the SoC to boot) as a first step. The SerDes protocol can be
> absolutely anything as long as the PLL frequencies are right, because as
> a second step, we'll be fixing up the SerDes lane registers for the
> desired final protocol, to appear as if we had that protocol natively.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  .../rcw_1600.rcw                              |  99 +++++++++++++++++
>  ls1046ardb/serdes_1133_to_3333.rcw            |  73 +++++++++++++
>  serdes_10g.rcw                                | 102 ++++++++++++++++++
>  3 files changed, 274 insertions(+)
>  create mode 100644 ls1046aqds/RR_FFSNPPPH_1133_5559_to_3333_5559/rcw_1600.rcw
>  create mode 100644 ls1046ardb/serdes_1133_to_3333.rcw
>  create mode 100644 serdes_10g.rcw
> 
> diff --git a/ls1046aqds/RR_FFSNPPPH_1133_5559_to_3333_5559/rcw_1600.rcw b/ls1046aqds/RR_FFSNPPPH_1133_5559_to_3333_5559/rcw_1600.rcw
> new file mode 100644
> index 000000000000..2bbc0163392d
> --- /dev/null
> +++ b/ls1046aqds/RR_FFSNPPPH_1133_5559_to_3333_5559/rcw_1600.rcw
> @@ -0,0 +1,99 @@
> +/*
> + * LS1046AQDS RCW for SerDes Protocol 0x3333_5559 derived from 0x1133_5559
> + *
> + * 42G configuration -- 2 RGMII + 4 SGMII (reconfigurable to XFI) + 3 PCIe + SATA
> + *
> + * Frequencies:
> + *
> + * Sys Clock: 100 MHz
> + * DDR_Refclock: 100 MHz
> + *
> + * Core		-- 1600 MHz (Mul 16)
> + * Platform	-- 600 MHz (Mul 6)
> + * DDR		-- 2100 MT/s (Mul 21)
> + * FMan		-- 700 MHz (CGA2 /2)
> + * XFI		-- 156.25 MHz (10.3125G)
> + * SGMII	-- 100 MHz (5G)
> + * PCIE		-- 100 MHz (5G)
> + * eSDHC	-- 1400 MHz (CGA2 /1)
> + *
> + * Hardware Accelerator Block Cluster Group A Mux Clock:
> + *   FMan        - HWA_CGA_M1_CLK_SEL = 6 - Async mode, CGA PLL 2 /2 is clock
> + *   eSDHC, QSPI - HWA_CGA_M2_CLK_SEL = 1 - Async mode, CGA PLL 2 /1 is clock
> + *
> + * Serdes Lanes vs Slot information
> + *  Serdes1 Lane 0 (D) - Starts as XFI9, switches to SGMII9
> + *  Serdes1 Lane 1 (C) - Starts as XFI10, switches to SGMII10
> + *  Serdes1 Lane 2 (B) - SGMII5, Slot 1
> + *  Serdes1 Lane 3 (A) - SGMII6, Slot 1
> + *
> + *  Serdes2 Lane 0 (A) - PCIe1 Gen2 x1, Slot 3
> + *  Serdes2 Lane 1 (B) - PCIe2 Gen2 x1, Slot 4
> + *  Serdes2 Lane 2 (C) - PCIe3 Gen2 x1, Slot 5
> + *  Serdes2 Lane 3 (D) - SATA
> + *
> + * PLL mapping: starts as 2211_2221, ends as 1111_2221
> + *
> + * Serdes 1:
> + *  PLL mapping: 1111
> + *  SRDS_PLL_REF_CLK_SEL_S1 : 0b'01
> + *    SerDes 1, PLL1[160] : 0 - 100MHz for SGMII and PCIe
> + *    SerDes 1, PLL2[161] : 1 - 156.25MHz for XFI
> + *  SRDS_PLL_PD_S1 : 0b'0
> + *    SerDes 1, PLL1 : 0 - not powered down
> + *    SerDes 1, PLL2 : 0 - not powered down
> + *  SRDS_DIV_PEX_S1 :
> + *    Only used for PEX, not used.
> + *
> + * Serdes 2:
> + *  PLL mapping: 2221
> + *  SRDS_PLL_REF_CLK_SEL_S2 : 0b'00
> + *    SerDes 2, PLL1[162] : 0 - 100MHz for SATA
> + *    SerDes 2, PLL2[163] : 0 - 100MHz for PCIe
> + *  SRDS_PLL_PD_S2 : 0b'00
> + *    SerDes 2, PLL1 : 0 - not powered down
> + *    SerDes 2, PLL2 : 0 - not powered down
> + *  SRDS_DIV_PEX_S2 : 0b'01
> + *    00 - train up to max rate of 8G
> + *    01 - train up to max rate of 5G
> + *    10 - train up to max rate of 2.5G
> + *
> + * DDR clock:
> + * DDR_REFCLK_SEL : 1 - DDRCLK pin provides the reference clock to the DDR PLL
> + *
> + */
> +
> +#include <../ls1046ardb/ls1046a.rcwi>
> +
> +SYS_PLL_RAT=6
> +MEM_PLL_RAT=21
> +CGA_PLL1_RAT=16
> +CGA_PLL2_RAT=14
> +SRDS_PRTCL_S1=4403
> +SRDS_PRTCL_S2=21849

I know it is not typical for NXP RCWs, but your rcw tool supports using
hex/binary prefixes. Thus, you could rewrite the above lines as

SRDS_PRTCL_S1=0x1133
SRDS_PRTCL_S2=0x5559

IMO this is much easier to read, since it matches the documentation.

> +SRDS_PLL_REF_CLK_SEL_S1=1
> +SRDS_PLL_REF_CLK_SEL_S2=0
> +SRDS_DIV_PEX_S1=1
> +SRDS_DIV_PEX_S2=1
> +DDR_FDBK_MULT=2
> +DDR_REFCLK_SEL=1
> +PBI_SRC=14

As another example, here you can do e.g.

PBI_SRC=0b1110

which better matches the documentation.

> +IFC_MODE=37
> +HWA_CGA_M1_CLK_SEL=6
> +DRAM_LAT=1
> +UART_BASE=7
> +IRQ_OUT=1
> +LVDD_VSEL=1
> +TVDD_VSEL=0
> +DVDD_VSEL=2
> +EVDD_VSEL=2
> +IIC2_EXT=1
> +SYSCLK_FREQ=600
> +HWA_CGA_M2_CLK_SEL=1
> +
> +#include <../ls1046ardb/cci_barrier_disable.rcw>
> +#include <../ls1046ardb/usb_phy_freq.rcw>
> +#include <../ls1046ardb/uboot_address.rcw>
> +#include <../ls1046ardb/serdes_sata.rcw>
> +#include <../ls1046ardb/a009531.rcw>
> +#include <../ls1046ardb/serdes_1133_to_3333.rcw>
> diff --git a/ls1046ardb/serdes_1133_to_3333.rcw b/ls1046ardb/serdes_1133_to_3333.rcw
> new file mode 100644
> index 000000000000..ffd548a73675
> --- /dev/null
> +++ b/ls1046ardb/serdes_1133_to_3333.rcw
> @@ -0,0 +1,73 @@
> +/*
> + * Change protocols on SerDes1 from 1133 to 3333, and their PLL mappings from
> + * 2211 to 1111. This is useful because, although the reset state machine has a
> + * native 0x3333 SerDes protocol option, the PLL mapping of that is 2222.
> + * This non-native option frees up PLL 2, and it can be provisioned e.g. with a
> + * 156.25 MHz for any lanes that might want to switch to XFI at runtime.
> + */
> +
> +#define SRDS_BASE		0xea0000 /* SerDes 1 relative to CCSR_BASE */
> +#include <../serdes_10g.rcw>
> +
> +/* For writing outside the CCSR space (in DCSR), an indirect access method is
> + * used. The SCFG_ALTCBAR register (field ALTCFG) holds the upper 24 bits of
> + * the 48-bit address, and the awrite PBL instruction gets the lower 24 bits of
> + * the address that is relative to that. Here we work with 32-bit addresses,
> + * so we only care about the upper 8 bits.
> + */
> +#define SCFG_ALTCBAR		0x570158
> +#define ALTCFG(x)		(((x) << 8) & 0xffffff00)
> +#define DCFG_DCSR_RCWCR5	0x20140110
> +#define RCWCR5_SRDS_PRTCL_S1(x)	(((x) << 16) & 0xffff0000)
> +#define RCWCR5_SRDS_PRTCL_S2(x)	((x) & 0xffff)
> +#define upper_8_bits(x)		(((x) & 0xff000000) >> 24)
> +#define lower_24_bits(x)	((x) & 0xffffff)
> +
> +#define GCR0_SGMII_FROM_PLL1	RPLL_LES(1) | RRAT_SEL(2) | \
> +				TPLL_LES(1) | TRAT_SEL(2) | \
> +				FIRST_LANE(1) | PROTS(1)
> +
> +.pbi
> +
> +write LNmGCR0(2), RRST_B(0) | TRST_B(0)
> +write LNmGCR0(3), RRST_B(0) | TRST_B(0)
> +
> +wait 50
> +
> +write LNmGCR0(2), GCR0_SGMII_FROM_PLL1
> +write LNmGCR0(3), GCR0_SGMII_FROM_PLL1
> +
> +write LNmGCR1(2), REIDL_TH(1) | REIDL_EX_SEL(3) | REIDL_ET_MSB(1) | \
> +		  ISLEW_RCTL(1) | OSLEW_RCTL(1)
> +write LNmGCR1(3), REIDL_TH(1) | REIDL_EX_SEL(3) | REIDL_ET_MSB(1) | \
> +		  ISLEW_RCTL(1) | OSLEW_RCTL(1)
> +
> +write LNmRECR0(2), GK2OVD_EN(1) | GK2OVD(15) | GK3OVD_EN(1) | GK3OVD(15)
> +write LNmRECR0(3), GK2OVD_EN(1) | GK2OVD(15) | GK3OVD_EN(1) | GK3OVD(15)
> +
> +write LNmTECR0(2), ADPT_EQ(48) | AMP_RED(6)
> +write LNmTECR0(3), ADPT_EQ(48) | AMP_RED(6)
> +
> +/* LS1046A requires RCW override to reconfigure the mux between
> + * the PCS and the MAC.
> + */
> +write SCFG_ALTCBAR, ALTCFG(upper_8_bits(DCFG_DCSR_RCWCR5))
> +flush
> +awrite lower_24_bits(DCFG_DCSR_RCWCR5), RCWCR5_SRDS_PRTCL_S1(0x3333) | \
> +					RCWCR5_SRDS_PRTCL_S2(0x5559)
> +
> +/* PCCRB: 0x21000000 -> 0x00000000 */
> +write PCCRB, XFIA_CFG(0) | XFIB_CFG(0)
> +
> +/* PCCR8: 0x11000000 -> 0x11110000 */
> +write PCCR8, SGMIIA_CFG(1) | SGMIIB_CFG(1) | SGMIIC_CFG(1) | SGMIID_CFG(1)
> +
> +write SGMIIaCR1(2), SGMII_MDEV_PORT(0) | SGPCS_EN(1)
> +write SGMIIaCR1(3), SGMII_MDEV_PORT(0) | SGPCS_EN(1)
> +
> +wait 120
> +
> +write LNmGCR0(2), GCR0_SGMII_FROM_PLL1 | RRST_B(1) | TRST_B(1)
> +write LNmGCR0(3), GCR0_SGMII_FROM_PLL1 | RRST_B(1) | TRST_B(1)
> +
> +.end
> diff --git a/serdes_10g.rcw b/serdes_10g.rcw
> new file mode 100644
> index 000000000000..714d53fde8af
> --- /dev/null
> +++ b/serdes_10g.rcw
> @@ -0,0 +1,102 @@
> +/*
> + * Registers for the Lynx 10G SerDes block.
> + *
> + * Must be included by an SoC-specific header that defines the
> + * SRDS_BASE value.
> + */
> +
> +#define PLLnRSTCTL(n)		(SRDS_BASE + (0x20 * (n)))
> +#define PLLnCR0(n)		(SRDS_BASE + (0x20 * (n)) + 0x0004)
> +#define  POFF(x)		(((x) << 31) & 0x80000000)
> +#define  REFCLK_SEL(x)		(((x) << 28) & 0x70000000)
> +#define  REFCLK_EN(x)		(((x) << 27) & 0x08000000)
> +#define  FRATE_SEL(x)		(((x) << 16) & 0x000f0000)
> +#define  DLYDIV_SEL(x)		((x) & 0x00000003)
> +#define PCCR8			(SRDS_BASE + 0x0220)
> +#define  SGMIIA_KX(x)		(((x) << 31) & 0x80000000)
> +#define  SGMIIA_CFG(x)		(((x) << 28) & 0x70000000)
> +#define  SGMIIB_KX(x)		(((x) << 27) & 0x08000000)
> +#define  SGMIIB_CFG(x)		(((x) << 24) & 0x07000000)
> +#define  SGMIIC_KX(x)		(((x) << 23) & 0x00800000)
> +#define  SGMIIC_CFG(x)		(((x) << 20) & 0x00700000)
> +#define  SGMIID_KX(x)		(((x) << 19) & 0x00080000)
> +#define  SGMIID_CFG(x)		(((x) << 16) & 0x00070000)
> +#define  SGMIIE_KX(x)		(((x) << 15) & 0x00008000)
> +#define  SGMIIE_CFG(x)		(((x) << 12) & 0x00007000)
> +#define  SGMIIF_KX(x)		(((x) << 11) & 0x00000800)
> +#define  SGMIIF_CFG(x)		(((x) << 8) & 0x00000700)
> +#define  SGMIIG_KX(x)		(((x) << 7) & 0x00000080)
> +#define  SGMIIG_CFG(x)		(((x) << 4) & 0x00000070)
> +#define  SGMIIH_KX(x)		(((x) << 3) & 0x00000008)
> +#define  SGMIIH_CFG(x)		((x) & 0x00000007)
> +#define PCCRB			(SRDS_BASE + 0x022c)
> +#define  XFIA_CFG(x)		(((x) << 28) & 0x70000000)
> +#define  XFIB_CFG(x)		(((x) << 24) & 0x07000000)
> +#define  XFIC_CFG(x)		(((x) << 20) & 0x00700000)
> +#define  XFID_CFG(x)		(((x) << 16) & 0x00070000)
> +#define  XFIE_CFG(x)		(((x) << 12) & 0x00007000)
> +#define  XFIF_CFG(x)		(((x) << 8) & 0x00000700)
> +#define  XFIG_CFG(x)		(((x) << 4) & 0x00000070)
> +#define  XFIH_CFG(x)		((x) & 0x00000007)
> +#define LNmGCR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0800)
> +#define  RPLL_LES(x)		(((x) << 31) & 0x80000000)
> +#define  RRAT_SEL(x)		(((x) << 28) & 0x30000000)
> +#define  TPLL_LES(x)		(((x) << 27) & 0x08000000)
> +#define  TRAT_SEL(x)		(((x) << 24) & 0x03000000)
> +#define  RRST_B(x)		(((x) << 22) & 0x00400000)
> +#define  TRST_B(x)		(((x) << 21) & 0x00200000)
> +#define  RX_PD(x)		(((x) << 20) & 0x00100000)
> +#define  TX_PD(x)		(((x) << 19) & 0x00080000)
> +#define  IF20BIT_EN(x)		(((x) << 18) & 0x00040000)
> +#define  FIRST_LANE(x)		(((x) << 16) & 0x00010000)
> +#define  GCR0_RSV		0x1000
> +#define  PROTS(x)		(((x) << 7) & 0x00000f80)
> +#define LNmGCR1(m)		(SRDS_BASE + (0x40 * (m)) + 0x0804)
> +#define  RDAT_INV(x)		(((x) << 31) & 0x80000000)
> +#define  TDAT_INV(x)		(((x) << 30) & 0x40000000)
> +#define  OPAD_CTL(x)		(((x) << 26) & 0x04000000)
> +#define  REIDL_TH(x)		(((x) << 20) & 0x00700000)
> +#define  REIDL_EX_SEL(x)	(((x) << 18) & 0x000C0000)
> +#define  REIDL_ET_SEL(x)	(((x) << 16) & 0x00030000)
> +#define  REIDL_EX_MSB(x)	(((x) << 15) & 0x00008000)
> +#define  REIDL_ET_MSB(x)	(((x) << 14) & 0x00004000)
> +#define  REQ_CTL_SNP(x)		(((x) << 13) & 0x00002000)
> +#define  REQ_CDR_SNP(x)		(((x) << 12) & 0x00001000)
> +#define  TRSTDIR(x)		(((x) << 7) & 0x00000080)
> +#define  REQ_BIN_SNP(x)		(((x) << 6) & 0x00000040)
> +#define  ISLEW_RCTL(x)		(((x) << 4) & 0x00000030)
> +#define  GCR1_RSV		0x8
> +#define  OSLEW_RCTL(x)		((x) & 0x3)
> +#define LNmRECR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0810)
> +#define  RXEQ_BST(x)		(((x) << 28) & 0x10000000)
> +#define  GK2OVD(x)		(((x) << 24) & 0x0f000000)
> +#define  GK3OVD(x)		(((x) << 16) & 0x000f0000)
> +#define  GK2OVD_EN(x)		(((x) << 15) & 0x00008000)
> +#define  GK3OVD_EN(x)		(((x) << 14) & 0x00004000)
> +#define  OSETOVD_EN(x)		(((x) << 13) & 0x00002000)
> +#define  BASE_WAND(x)		(((x) << 10) & 0x00000c00)
> +#define  OSETOVD(x)		((x) & 0x0000007F)
> +#define LNmTECR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0818)
> +#define  TEQ_TYPE(x)		(((x) << 28) & 0x30000000)
> +#define  SGN_PREQ(x)		(((x) << 26) & 0x04000000)
> +#define  RATIO_PREQ(x)		(((x) << 22) & 0x03C00000)
> +#define  SGN_POST1Q(x)		(((x) << 21) & 0x00200000)
> +#define  RATIO_PST1Q(x)		(((x) << 16) & 0x001F0000)
> +#define  ADPT_EQ(x)		(((x) << 8) & 0x00003F00)
> +#define  AMP_RED(x)		((x) & 0x0000003f)
> +#define LNmTTLCR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0820)
> +#define LNmTCSR0(m)		(SRDS_BASE + (0x40 * (m)) + 0x0830)
> +#define LNmTCSR1(m)		(SRDS_BASE + (0x40 * (m)) + 0x0834)
> +#define LNmTCSR2(m)		(SRDS_BASE + (0x40 * (m)) + 0x0838)
> +#define LNmTCSR3(m)		(SRDS_BASE + (0x40 * (m)) + 0x083c)
> +#define SGMIIaCR0(a)		(SRDS_BASE + (0x10 * (a)) + 0x1800)
> +#define  RST_SGM(x)		(((x) << 31) & 0x80000000)
> +#define  PD_SGM(x)		(((x) << 30) & 0x40000000)
> +#define SGMIIaCR1(a)		(SRDS_BASE + (0x10 * (a)) + 0x1804)
> +#define  SGMII_MDEV_PORT(x)	(((x) << 27) & 0xf8000000)
> +#define  SGPCS_EN(x)		(((x) << 11) & 0x00000800)
> +#define XFIaCR0(a)		(SRDS_BASE + (0x10 * (a)) + 0x1980)
> +#define  RST_XFI(x)		(((x) << 31) & 0x80000000)
> +#define  PD_XFI(x)		(((x) << 30) & 0x40000000)
> +#define XFIaCR1(a)		(SRDS_BASE + (0x10 * (a)) + 0x1984)
> +#define  XFI_MDEV_PORT(x)	(((x) << 27) & 0xf8000000)
> --
> 2.34.1
> 
>> > This driver would also be a good place to add the KR link training with
>> > NXP tried to upstream a few years ago.
>> 
>> Well, speaking as someone who is now also tasked with the copper backplane
>> support, believe me that I know that, and this is why I'm so desperate
>> with the logic you're trying to push forward. It's clear that we should
>> try to collaborate rather than try to push individualistic non-tested
>> non-solutions.
> 
> Speaking of this, I will send an RFC in the upcoming days which proposes
> a model for the copper backplane PHY support on LX2160A and its already
> existing Lynx 28G SerDes driver.
> 
> I will concern myself with porting that support on Lynx 10G only once
> the model I propose turns out to be acceptable for both the network PHY
> as well as the generic PHY maintainers.

Well, as far as I know this driver is only a few changes from being
acceptable to these maintainers as well :)

--Sean
Vladimir Oltean Aug. 11, 2023, 3:08 p.m. UTC | #16
Hi Sean,

On Tue, Jun 13, 2023 at 05:27:54PM +0300, Vladimir Oltean wrote:
> > > At first sight you might appear to have a point related to the fact that
> > > PLL register writes are necessary, and thus this whole shebang is necessary.
> > > But this can all be done using PBI commands, with the added benefit that
> > > U-Boot can also use those SERDES networking ports, and not just Linux.
> > > You can use the RCW+PBL specific to your board to inform the SoC that
> > > your platform's refclk 1 is 156 MHz (something which the reset state
> > > machine seems unable to learn, with protocol 0x3333). You don't have to
> > > put that in the device tree. You don't have to push code to any open
> > > source project to expose your platform specific details. Then, just like
> > > in the case of the Lynx 28G driver on LX2160, the SERDES driver could
> > > just treat the PLL configuration as read-only, which would greatly
> > > simplify things and eliminate the need for a clk driver.
> > > 
> > > Here is an illustrative example (sorry, I don't have a board with the
> > > right refclk on that PLL, to verify all the way):
> > > 
> > > ... snip ...
> > 
> > (which of course complicates the process of building the PBIs...)
> 
> Maybe this is the language barrier, but what are you trying to say here?

I said that I don't understand. Can you please clarify what you were
trying to transmit with this comment?
Vladimir Oltean Aug. 11, 2023, 4:12 p.m. UTC | #17
Hi Sean,

On Thu, Aug 10, 2023 at 03:58:36PM -0400, Sean Anderson wrote:
> As explained previously (and noted by yourself below) 1G and 10G RCWs
> have mutally-incompatible clocking requirements. Now that you have
> documented an alternate solution, it is possible to boot up with one RCW
> and switch to another. But without that it was not possible to have one
> board support both RCWs (without e.g. a microcontroller or FPGA to
> configure the clock generator before releasing the processor reset). I
> do not think that the silicon should assert the reset request line if
> the serdes doesn't lock, but it does and it can't really be disabled.
> 
> As it happens, our board is set up so that the reference clocks are
> configured for 10G by default. During this boot, reset request is never
> requested. If we did not have to support software configuration of the
> serdes speed (to support 1G SFPs) we would not have to disconnect reset
> request.
> 
> That said, I have evaluated the various reasons that reset request can
> be asserted, and I have determined that for our product they are not
> necessary. The only major limitation is the lack of a watchdog, but that
> was not a requirement for us. Because of this, using a GPIO for reset is
> sufficient and neatly avoids the issue.

I would like to pause here and highlight the existence of the so-called
XY problem: https://en.wikipedia.org/wiki/XY_problem

| The XY problem is a communication problem encountered in help desk,
| technical support, software engineering, or customer service situations
| where the question is about an end user's attempted solution (Y) rather
| than the root problem itself (X).

You admitted that you needed to solve problem X (software reconfiguration
of the SerDes speed between 1G and 10G), and that has led you to problem
Y (giving the PLLs some refclk frequencies which aren't supported at
power-on reset, and thus, are also not validated by NXP). You've presented
to the Linux mailing lists a driver which solves problem Y, but not X.

I gave you a solution to problem X which doesn't even trigger problem Y.

Furthermore, I gave you a solution to problem Y which is much simpler
than yours. On the 12th of June, in Message-ID: 20230612163353.dwouatvqbuo6h4ea@skbuf,
I explained that if you absolutely insist to use the unsupported PLL
refclks, you can use PBL commands to change the PLL settings (so that
they lock) at power-on reset time. The advantage is that both U-Boot and
Linux will work without having to make any modification to the PLLs,
just treat them as read-only.

As it happens, at NXP we also want to solve problem X in a generic way
(aka we need a procedure that works for all customers, and not just for
your board), and we want to do so in a way that the hardware validation
team agrees with. Thus, the SoC needs to accept the PLL refclks at
power-on reset time. The solution we came up with is the one presented
to you yesterday. It makes your PLL clk driver unnecessary. No one will
come after you if you keep it in your Linux tree and use it, but it is
unnecessary.

> > Nonetheless, below is a functional example of how NXP would recommend
> > you to achieve the desired PLL mapping for any RCW-based SerDes protocol.
> > My testing platform was the LS1046A-QDS with PLL1 at 100 MHz and PLL2 at
> > 156.25 MHz. I believe that this should eliminate the need for a clk
> > driver for the PLLs, and should make your Ethernet lanes usable much
> > earlier than Linux. That being said, our position at NXP is that you
> > don't need a clk driver for the PLLs, and I would like to see the clk
> > portion removed from future patch revisions.
> 
> I have not had any issues with clocking. This is actually one of the
> areas where the reference manual is sufficient to create a working
> driver. Adding flexibility here is very useful, because we can solve
> hardware problems in software. This can reduce e.g. board respins, and
> allow for more interesting clocking solutions (such as allowing clock
> generators which must be configured before use).

I am waiting for someone to come up with a real life use case that
justifies those "more interesting clocking solutions".

Please correct me if you think I am wrong, but as things stand, the
SerDes PLL clk driver is now a solution waiting for a problem. It can
wait for that problem out of tree.

> > There is also the previous observation from Ioana that you should not
> > delete PHY interrupts without finding out why they don't work.
> 
> Well, if you have a better solution, please let me know. The interrupt
> does not work in real hardware.
> 
> I was hampered in my efforts to determine the cause because the interrupt
> passes through an FPGA to which I lack the HDL. So far, I have not seen
> any argument against polling except that we do not understand the
> problem yet. However, I have not seen any other analysis of the problem
> either.

Out of respect for the topic at hand, ask for help in a separate thread,
open an NXP support ticket - do something to split it from the SerDes
work which it is unrelated with. You were told that part of the reason
for the NACK is the fact that you are grouping unrelated things together,
and you did nothing about it.

> > +SRDS_PRTCL_S1=4403
> > +SRDS_PRTCL_S2=21849
> 
> I know it is not typical for NXP RCWs, but your rcw tool supports using
> hex/binary prefixes. Thus, you could rewrite the above lines as
> 
> SRDS_PRTCL_S1=0x1133
> SRDS_PRTCL_S2=0x5559
> 
> IMO this is much easier to read, since it matches the documentation.

Ok, thanks.
Vladimir Oltean Aug. 21, 2023, 12:49 p.m. UTC | #18
Hi Sean,

On Fri, Aug 11, 2023 at 07:36:37PM +0300, Vladimir Oltean wrote:
> Let me explain that approach, because your mention of "swapping out the
> bootloaders" makes it appear as if you are not visualising what I am
> proposing.
> 
> The Lynx SerDes family has 2 PLLs, and more lanes (4 or 8). Each lane
> uses one PLL or the other, to derive its protocol frequency. Through the
> RCW, you provision the 2 PLL frequencies that may be used by the lanes
> at runtime.
> 
> The Lynx 28G SerDes driver reads the PLL frequencies in
> lynx_28g_pll_read_configuration(), and determines the interface modes
> supportable by each PLL (this is used by phylink). But it never changes
> those PLL frequencies, since that operation is practically impossible in
> the general sense (PLLs are shared by multiple lanes, so changing a PLL
> frequency disrupts all lanes that use it).

Is my high-level feedback clear and actionable to you? I am suggesting
to keep the look and feel the same between the lynx-10g and lynx-28g
drivers, and to not use "fsl,type" protocols listed in the device tree
as the immutable source of information for populating mode->protos, but
instead the current PLL frequency configuration. So this implies that I
am requesting that the dt-bindings should not contain a listing of the
supported protocols.
Sean Anderson Aug. 21, 2023, 5:45 p.m. UTC | #19
On 8/21/23 08:49, Vladimir Oltean wrote:
> Hi Sean,
> 
> On Fri, Aug 11, 2023 at 07:36:37PM +0300, Vladimir Oltean wrote:
>> Let me explain that approach, because your mention of "swapping out the
>> bootloaders" makes it appear as if you are not visualising what I am
>> proposing.
>> 
>> The Lynx SerDes family has 2 PLLs, and more lanes (4 or 8). Each lane
>> uses one PLL or the other, to derive its protocol frequency. Through the
>> RCW, you provision the 2 PLL frequencies that may be used by the lanes
>> at runtime.
>> 
>> The Lynx 28G SerDes driver reads the PLL frequencies in
>> lynx_28g_pll_read_configuration(), and determines the interface modes
>> supportable by each PLL (this is used by phylink). But it never changes
>> those PLL frequencies, since that operation is practically impossible in
>> the general sense (PLLs are shared by multiple lanes, so changing a PLL
>> frequency disrupts all lanes that use it).
> 
> Is my high-level feedback clear and actionable to you? I am suggesting
> to keep the look and feel the same between the lynx-10g and lynx-28g
> drivers, and to not use "fsl,type" protocols listed in the device tree
> as the immutable source of information for populating mode->protos, but
> instead the current PLL frequency configuration. So this implies that I
> am requesting that the dt-bindings should not contain a listing of the
> supported protocols.

Well, we have two pieces of information we need

- What values do we need to program in the PCCRs to select a particular
  mode? This includes whether to e.g. set the KX bits.
- Implied by the above, what protocols are supported on which lanes?
  This is not strictly necessary, but will certainly solve a lot of
  headscratching.

This information varies between different socs, and different serdes on
the same socs. We can't really look at the RCW or the clocks and figure
out what we need to program. So what are our options?

- We can have a separate compatible for each serdes on each SoC (e.g.
  "fsl,lynx-10g-a"). This was rejected by the devicetree maintainers.
- We can have one compatible for each SoC, and determine the serdes
  based on the address. I would like to avoid this...
- We can stick all the details which vary between serdes/socs into the
  device tree. This is very flexible, since supporting new SoCs is
  mostly a matter of adding a new compatible and writing a new
  devicetree. On the other hand, if you have a bug in your devicetree,
  it's not easy to fix it in the kernel.
- Just don't support protocol switching. The 28G driver does this, which
  is why it only has one compatible. However, supporting protocol
  switching is a core goal of this driver, so dropping support is not an
  option.

I'm open to any other suggestions you have, but this was the best way I
could figure out to get this information to the driver in a way which
would be acceptable to the devicetree folks.

--Sean
Ioana Ciornei Aug. 21, 2023, 6:13 p.m. UTC | #20
On Mon, Aug 21, 2023 at 01:45:44PM -0400, Sean Anderson wrote:
> Well, we have two pieces of information we need
> 
> - What values do we need to program in the PCCRs to select a particular
>   mode? This includes whether to e.g. set the KX bits.
> - Implied by the above, what protocols are supported on which lanes?
>   This is not strictly necessary, but will certainly solve a lot of
>   headscratching.
> 
> This information varies between different socs, and different serdes on
> the same socs. We can't really look at the RCW or the clocks and figure
> out what we need to program. So what are our options?
> 
> - We can have a separate compatible for each serdes on each SoC (e.g.
>   "fsl,lynx-10g-a"). This was rejected by the devicetree maintainers.
> - We can have one compatible for each SoC, and determine the serdes
>   based on the address. I would like to avoid this...

To me this really seems like a straightforward approach.

> - We can stick all the details which vary between serdes/socs into the
>   device tree. This is very flexible, since supporting new SoCs is
>   mostly a matter of adding a new compatible and writing a new
>   devicetree. On the other hand, if you have a bug in your devicetree,
>   it's not easy to fix it in the kernel.
> - Just don't support protocol switching. The 28G driver does this, which
>   is why it only has one compatible. However, supporting protocol
>   switching is a core goal of this driver, so dropping support is not an
>   option.
> 

The Lynx 28G SerDes driver does support protocol switching.
How did you arrive at the opposite conclusion?

The initial commit on the driver is even part of a patch set named
"dpaa2-mac: add support for changing the protocol at runtime". In
upstream it only supports the 1G <-> 10G transition but I have some
patches on the way to also support 25G.

Ioana
Vladimir Oltean Aug. 21, 2023, 6:20 p.m. UTC | #21
On Mon, Aug 21, 2023 at 09:13:49PM +0300, Ioana Ciornei wrote:
> > - We can have one compatible for each SoC, and determine the serdes
> >   based on the address. I would like to avoid this...
> 
> To me this really seems like a straightforward approach.

+1
Sean Anderson Aug. 21, 2023, 6:46 p.m. UTC | #22
On 8/21/23 14:13, Ioana Ciornei wrote:
> On Mon, Aug 21, 2023 at 01:45:44PM -0400, Sean Anderson wrote:
>> Well, we have two pieces of information we need
>> 
>> - What values do we need to program in the PCCRs to select a particular
>>   mode? This includes whether to e.g. set the KX bits.
>> - Implied by the above, what protocols are supported on which lanes?
>>   This is not strictly necessary, but will certainly solve a lot of
>>   headscratching.
>> 
>> This information varies between different socs, and different serdes on
>> the same socs. We can't really look at the RCW or the clocks and figure
>> out what we need to program. So what are our options?
>> 
>> - We can have a separate compatible for each serdes on each SoC (e.g.
>>   "fsl,lynx-10g-a"). This was rejected by the devicetree maintainers.
>> - We can have one compatible for each SoC, and determine the serdes
>>   based on the address. I would like to avoid this...
> 
> To me this really seems like a straightforward approach.

Indeed it would be straightforward, but what's the point of having a
devicetree in the first place then? We could just go back to being a
(non-dt) platform device.

>> - We can stick all the details which vary between serdes/socs into the
>>   device tree. This is very flexible, since supporting new SoCs is
>>   mostly a matter of adding a new compatible and writing a new
>>   devicetree. On the other hand, if you have a bug in your devicetree,
>>   it's not easy to fix it in the kernel.
>> - Just don't support protocol switching. The 28G driver does this, which
>>   is why it only has one compatible. However, supporting protocol
>>   switching is a core goal of this driver, so dropping support is not an
>>   option.
>> 
> 
> The Lynx 28G SerDes driver does support protocol switching.
> How did you arrive at the opposite conclusion?

Sorry, it's been a while and I just did a quick look-over, and noticed
there was no configuration for different SoCs.

After further review, it seems the reason 28g can get away without this
is because there's a one-to-one mapping between protocol controllers and
lanes. Unfortunately, that regularity is not present for 10g.

--Sean

> The initial commit on the driver is even part of a patch set named
> "dpaa2-mac: add support for changing the protocol at runtime". In
> upstream it only supports the 1G <-> 10G transition but I have some
> patches on the way to also support 25G.
> 
> Ioana
Vladimir Oltean Aug. 21, 2023, 7:58 p.m. UTC | #23
On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
> After further review, it seems the reason 28g can get away without this
> is because there's a one-to-one mapping between protocol controllers and
> lanes. Unfortunately, that regularity is not present for 10g.
> 
> --Sean

There are some things I saw in your phy-fsl-lynx-10g.c driver and device
tree bindings that I don't understand (the concept of lane groups), and
I'm not sure if they're related with what you're saying here, so if you
could elaborate a bit more (ideally with an example) on the one-to-one
mapping and the specific problems it causes, it would be great.

I may be off with my understanding of the regularity you are talking about,
but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
100G, assuming that's what you are talking about. I haven't started yet
working on those for the mtip_backplane driver, but I'm not currently
seeing a problem with the architecture where a phy_device represents a
single lane that's part of a multi-lane port, and not an entire group.

In my imagination, there are 2 cases:
- all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
  phandles, and iterates over them with a "for" loop)
- each of the 4 lanes is managed by the respective backplane AN/LT core,
  and thus, there's one phandle to each lane

I sketched some dt-bindings for the second case here, so I guess it must
be the first scenario that's somehow problematic?
https://patchwork.kernel.org/project/netdevbpf/patch/20230817150644.3605105-9-vladimir.oltean@nxp.com/
Sean Anderson Aug. 21, 2023, 9:06 p.m. UTC | #24
On 8/21/23 15:58, Vladimir Oltean wrote:
> On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
>> After further review, it seems the reason 28g can get away without this
>> is because there's a one-to-one mapping between protocol controllers and
>> lanes. Unfortunately, that regularity is not present for 10g.
>> 
>> --Sean
> 
> There are some things I saw in your phy-fsl-lynx-10g.c driver and device
> tree bindings that I don't understand (the concept of lane groups)

Each lane group corresponds to a phy device (struct phy). For protocols
like PCIe or XAUI which can use multiple lanes, this lets the driver
coordinate configuring all lanes at once in the correct order.

> and
> I'm not sure if they're related with what you're saying here, so if you
> could elaborate a bit more (ideally with an example) on the one-to-one
> mapping and the specific problems it causes, it would be great.

For e.g. the LS2088A, SerDes1 lanes H-A use SG1-8 and XFI1-8. SerDes2
lanes A-H use SG9-16 and XFI9-16. Each lane has its own controller, and
the mapping is 1-to-1. In the PCCRs, each controller uses the same
value, and is mapped in a regular way. So you can go directly from the
lane number to the right value to mask in the PCCR, with a very simple
translation scheme.

For e.g. the LS1046A, SerDes1 lane D uses XFI.9 (aka XFIA) and lane C
uses XFI.10 (aka XFIB). This is the opposite of how SerDes1 lanes A-D
use SGMII.9, .10, .5, and .6 (aka SGMIIA-D).

For e.g. the T4240, SerDes1 lanes A-H use sg1.5, .6, .10, .9, .1, .2,
.3, .4 (aka SGMII E, F, H, G, A, B, C, D).

For e.g. the B4860, SerDes lanes A uses sgmii1 or sgmii5 and B uses
sgmii2 or sgmii6. The MAC selected is determined based on the value
programmed into PCCR2.

While I appreciate that your hardware engineers did a better job for
28g, many 10g serdes arbitrarily map lanes to protocol controllers.
I think the mapping is too irregular to tame, and it is better to say
"if you want this configuration, program this value".

> I may be off with my understanding of the regularity you are talking about,
> but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
> 100G, assuming that's what you are talking about. I haven't started yet
> working on those for the mtip_backplane driver, but I'm not currently
> seeing a problem with the architecture where a phy_device represents a
> single lane that's part of a multi-lane port, and not an entire group.

Resetting one lane in a group will reset the rest, which could confuse
the driver. Additionally, treating the lanes as one phy lets us set the
reset direction and first lane bits correctly.

> In my imagination, there are 2 cases:
> - all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
>   phandles, and iterates over them with a "for" loop)
> - each of the 4 lanes is managed by the respective backplane AN/LT core,
>   and thus, there's one phandle to each lane

By doing the grouping in the driver, we also simplify the consumer
implementation. The MAC can always use a single phy, without worrying
about the actual number of lanes. This matches the hardware, since the
MAC is going to talk XGMII (or whatever) to the protocol controller
anyway.

I think it will be a lot easier to add multi-lane support with this
method because it gives the driver more information about what's going
on. The driver can control the whole configuration/reset process and the
timing.

> I sketched some dt-bindings for the second case here, so I guess it must
> be the first scenario that's somehow problematic?
> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2fpatchwork.kernel.org%2fproject%2fnetdevbpf%2fpatch%2f20230817150644.3605105%2d9%2dvladimir.oltean%40nxp.com%2f&umid=9e644233-009e-4197-a266-5d9a85eb1148&auth=d807158c60b7d2502abde8a2fc01f40662980862-cc1d5330d84af8fa40745b165a44849db50f8a67

Yes, no issues with the second case.

--Sean
Vladimir Oltean Aug. 21, 2023, 10:48 p.m. UTC | #25
On Mon, Aug 21, 2023 at 05:06:46PM -0400, Sean Anderson wrote:
> On 8/21/23 15:58, Vladimir Oltean wrote:
> > On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
> >> After further review, it seems the reason 28g can get away without this
> >> is because there's a one-to-one mapping between protocol controllers and
> >> lanes. Unfortunately, that regularity is not present for 10g.
> >> 
> >> --Sean
> > 
> > There are some things I saw in your phy-fsl-lynx-10g.c driver and device
> > tree bindings that I don't understand (the concept of lane groups)
> 
> Each lane group corresponds to a phy device (struct phy). For protocols
> like PCIe or XAUI which can use multiple lanes, this lets the driver
> coordinate configuring all lanes at once in the correct order.

For PCIe I admit that I don't know. I don't even know if there's any
valid (current or future) use case for having the PCIe controller have a
phandle to the SerDes. Everything else below in my reply assumes Ethernet.

> > and
> > I'm not sure if they're related with what you're saying here, so if you
> > could elaborate a bit more (ideally with an example) on the one-to-one
> > mapping and the specific problems it causes, it would be great.
> 
> For e.g. the LS2088A, SerDes1 lanes H-A use SG1-8 and XFI1-8. SerDes2
> lanes A-H use SG9-16 and XFI9-16. Each lane has its own controller, and
> the mapping is 1-to-1. In the PCCRs, each controller uses the same
> value, and is mapped in a regular way. So you can go directly from the
> lane number to the right value to mask in the PCCR, with a very simple
> translation scheme.
> 
> For e.g. the LS1046A, SerDes1 lane D uses XFI.9 (aka XFIA) and lane C
> uses XFI.10 (aka XFIB). This is the opposite of how SerDes1 lanes A-D
> use SGMII.9, .10, .5, and .6 (aka SGMIIA-D).
> 
> For e.g. the T4240, SerDes1 lanes A-H use sg1.5, .6, .10, .9, .1, .2,
> .3, .4 (aka SGMII E, F, H, G, A, B, C, D).
> 
> For e.g. the B4860, SerDes lanes A uses sgmii1 or sgmii5 and B uses
> sgmii2 or sgmii6. The MAC selected is determined based on the value
> programmed into PCCR2.

It's so exceedingly unlikely that B4860 will gain support for anything
new, that it's not even worth talking about it, or even considering it
in the design of a new driver. Just forget about it.

Let's concentrate on the Layerscapes, and on the T series to the extent
that we're not going out of our way to support them with a fairly simple
design.

In the Lynx 10G block guide that I have, PCCR2 is a register that does
something completely different from Ethernet. I'm not sure if B4860 has
a Lynx 10G block and not something else.

> While I appreciate that your hardware engineers did a better job for
> 28g, many 10g serdes arbitrarily map lanes to protocol controllers.
> I think the mapping is too irregular to tame, and it is better to say
> "if you want this configuration, program this value".

Ok, but that's a lateral argument (or I'm not understanding the connection).

Some maintainers (Mark Brown for sure, from my personal experience) prefer
that expert-level knowledge of hardware should be hardcoded into the kernel
driver based on known stuff such as the SoC-specific compatible string.
I certainly share the same view.

With your case, I think that argument is even more pertinent, because
IIUC, the lane group protocols won't be put in the SoC .dtsi (so as to
be written only once), but in the board device trees (since the
available protocols invariably depend upon the board provisioning).
So, non-expert board device tree writers will have to know what's with
the PCCR stuff. Pretty brain-intensive.

> > I may be off with my understanding of the regularity you are talking about,
> > but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
> > 100G, assuming that's what you are talking about. I haven't started yet
> > working on those for the mtip_backplane driver, but I'm not currently
> > seeing a problem with the architecture where a phy_device represents a
> > single lane that's part of a multi-lane port, and not an entire group.
> 
> Resetting one lane in a group will reset the rest, which could confuse
> the driver. Additionally, treating the lanes as one phy lets us set the
> reset direction and first lane bits correctly.

Yeah, in theory that is probably correct, but in practice can't we hide
our head in the sand and say that the "phys" phandles have to have the
lanes in the same order as LNmGCR0[1STLANE] expects them (which is also
the "natural" order as the SoC RM describes it)? With a "for" loop
implementation in the MAC, that would work just fine 100% of the time.
Doing more intricate massaging of the "phys" in the consumer, and you're
just asking for trouble. My 2 cents.

Sure, it's the same kind of ask of a board device tree writer as "hey,
please give me a good PCCR value", but I honestly think that the head
scratching will be much less severe.

> > In my imagination, there are 2 cases:
> > - all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
> >   phandles, and iterates over them with a "for" loop)
> > - each of the 4 lanes is managed by the respective backplane AN/LT core,
> >   and thus, there's one phandle to each lane
> 
> By doing the grouping in the driver, we also simplify the consumer
> implementation. The MAC can always use a single phy, without worrying
> about the actual number of lanes. This matches the hardware, since the
> MAC is going to talk XGMII (or whatever) to the protocol controller
> anyway.

XGMII is the link between the MAC and the PCS, but the "phys" phandle to
the SerDes gives insight to the MAC driver way beyond the PCS layer.
That kinda invalidates the idea that "you don't need to worry about the
actual number of lanes". When you're a MAC driver with an XLAUI link and
you need insight into the PMA/PMD layer, you'd better not be surprised
about the fact that there are 4 lanes, at the very least?

> I think it will be a lot easier to add multi-lane support with this
> method because it gives the driver more information about what's going
> on. The driver can control the whole configuration/reset process and the
> timing.

Also, I'm thinking that if you support multi-lane (which dpaa2-mac currently
doesn't, even in LSDK), you can't avoid multiple "phys" phandles in dpaa2-mac,
and a "for" loop, eventually, anyway. That's because if your lanes have protocol
retimers on them, those are going to be modeled as generic PHYs too. And those
will not be bundled in these "groups", because they might be one chip per lane.

The retimer thing isn't theoretical, but, due to reasons independent of NXP,
we lack the ability to provide an upstream user of the "lane retimer as
generic PHY" functionality in dpaa2-mac. So it stays downstream for now.
https://github.com/nxp-qoriq/linux/commit/627c5f626a13657f46f68b90882f329310e0e22f

So, if you're thinking of easing the work of the consumer side, I'm not
sure that the gains will be that high.

Otherwise, I will repeat the idea that lynx-10g and lynx-28g should be
treated in unison, because some drivers (dpaa2-mac, mtip_backplane) will
have to interface with both, and I don't really believe that major deviations
in software architecture between the 2 SerDes drivers are justifiable in
any way (multi-protocol handled differently, for example).
Sean Anderson Aug. 21, 2023, 11:39 p.m. UTC | #26
On 8/21/23 18:48, Vladimir Oltean wrote:
> On Mon, Aug 21, 2023 at 05:06:46PM -0400, Sean Anderson wrote:
>> On 8/21/23 15:58, Vladimir Oltean wrote:
>> > On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
>> >> After further review, it seems the reason 28g can get away without this
>> >> is because there's a one-to-one mapping between protocol controllers and
>> >> lanes. Unfortunately, that regularity is not present for 10g.
>> >> 
>> >> --Sean
>> > 
>> > There are some things I saw in your phy-fsl-lynx-10g.c driver and device
>> > tree bindings that I don't understand (the concept of lane groups)
>> 
>> Each lane group corresponds to a phy device (struct phy). For protocols
>> like PCIe or XAUI which can use multiple lanes, this lets the driver
>> coordinate configuring all lanes at once in the correct order.
> 
> For PCIe I admit that I don't know. I don't even know if there's any
> valid (current or future) use case for having the PCIe controller have a
> phandle to the SerDes. Everything else below in my reply assumes Ethernet.

One use case could be selecting PCIe/SATA as appropriate for an M.2
card. Another use case could be allocating lanes to different slots
based on card presence (e.g. 1 card use x4, 2+ cards use x1). I recently
bought a motherboard whose manual says

| PCI_E4 [normally x4] will run x1 speed when installing devices in PCI_E2/
| PCI_E3/ PCI_E5 slot [all x1].
which implies exactly this sort of design.

>> > and
>> > I'm not sure if they're related with what you're saying here, so if you
>> > could elaborate a bit more (ideally with an example) on the one-to-one
>> > mapping and the specific problems it causes, it would be great.
>> 
>> For e.g. the LS2088A, SerDes1 lanes H-A use SG1-8 and XFI1-8. SerDes2
>> lanes A-H use SG9-16 and XFI9-16. Each lane has its own controller, and
>> the mapping is 1-to-1. In the PCCRs, each controller uses the same
>> value, and is mapped in a regular way. So you can go directly from the
>> lane number to the right value to mask in the PCCR, with a very simple
>> translation scheme.
>> 
>> For e.g. the LS1046A, SerDes1 lane D uses XFI.9 (aka XFIA) and lane C
>> uses XFI.10 (aka XFIB). This is the opposite of how SerDes1 lanes A-D
>> use SGMII.9, .10, .5, and .6 (aka SGMIIA-D).
>> 
>> For e.g. the T4240, SerDes1 lanes A-H use sg1.5, .6, .10, .9, .1, .2,
>> .3, .4 (aka SGMII E, F, H, G, A, B, C, D).
>> 
>> For e.g. the B4860, SerDes lanes A uses sgmii1 or sgmii5 and B uses
>> sgmii2 or sgmii6. The MAC selected is determined based on the value
>> programmed into PCCR2.
> 
> It's so exceedingly unlikely that B4860 will gain support for anything
> new, that it's not even worth talking about it, or even considering it
> in the design of a new driver. Just forget about it.
> 
> Let's concentrate on the Layerscapes, and on the T series to the extent
> that we're not going out of our way to support them with a fairly simple
> design.

Well, these are just easy ways to show the oddities in the PCCRs. I
could have made the same points with the various Layerscapes.

> In the Lynx 10G block guide that I have, PCCR2 is a register that does
> something completely different from Ethernet. I'm not sure if B4860 has
> a Lynx 10G block and not something else.

T-series SoCs use a different PCCR layout than Layerscape, despite using
the same registers in the rest of the SerDes. The LS1021 also uses this
scheme, so it's not just T-series (but it's close enough). This is why I
had an option for different PCCR configuration functions in earlier
versions of this series, so if someone wanted they could easily add
T-series support.

>> While I appreciate that your hardware engineers did a better job for
>> 28g, many 10g serdes arbitrarily map lanes to protocol controllers.
>> I think the mapping is too irregular to tame, and it is better to say
>> "if you want this configuration, program this value".
> 
> Ok, but that's a lateral argument (or I'm not understanding the connection).

To restate, there's no systemic organization to the PCCRs. The driver
configuration should reflect this and allow arbitrary mappings.

> Some maintainers (Mark Brown for sure, from my personal experience) prefer
> that expert-level knowledge of hardware should be hardcoded into the kernel
> driver based on known stuff such as the SoC-specific compatible string.
> I certainly share the same view.
> 
> With your case, I think that argument is even more pertinent, because
> IIUC, the lane group protocols won't be put in the SoC .dtsi (so as to
> be written only once), but in the board device trees (since the
> available protocols invariably depend upon the board provisioning).
> So, non-expert board device tree writers will have to know what's with
> the PCCR stuff. Pretty brain-intensive.

The idea is to stick all the PCCR stuff in the SoC devicetree, and the
board-level devicetrees just set up the clocks and pick which MACs use
which phys. Have a look at patches 10 and 15 of this series for some
typical board configurations. I think it's pretty simple, since all the
complexity is in the SoC dt.

That said, I originally had the driver set up so that everything was
based on the compatible, but I had to change that because it was nacked
by the devicetree people.

>> > I may be off with my understanding of the regularity you are talking about,
>> > but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
>> > 100G, assuming that's what you are talking about. I haven't started yet
>> > working on those for the mtip_backplane driver, but I'm not currently
>> > seeing a problem with the architecture where a phy_device represents a
>> > single lane that's part of a multi-lane port, and not an entire group.
>> 
>> Resetting one lane in a group will reset the rest, which could confuse
>> the driver. Additionally, treating the lanes as one phy lets us set the
>> reset direction and first lane bits correctly.
> 
> Yeah, in theory that is probably correct, but in practice can't we hide
> our head in the sand and say that the "phys" phandles have to have the
> lanes in the same order as LNmGCR0[1STLANE] expects them (which is also
> the "natural" order as the SoC RM describes it)? With a "for" loop
> implementation in the MAC, that would work just fine 100% of the time.
> Doing more intricate massaging of the "phys" in the consumer, and you're
> just asking for trouble. My 2 cents.

Yeah, but from the perspective of the driver, if we have one phy per
lane we don't get any indication from the subsystem that we are doing a
multi-lane configuration. So we need to stick this sort of thing into
the phy handle. But IMO we can do all this in the driver and the board
integrator never has to worry about it. 

> Sure, it's the same kind of ask of a board device tree writer as "hey,
> please give me a good PCCR value", but I honestly think that the head
> scratching will be much less severe.
> 
>> > In my imagination, there are 2 cases:
>> > - all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
>> >   phandles, and iterates over them with a "for" loop)
>> > - each of the 4 lanes is managed by the respective backplane AN/LT core,
>> >   and thus, there's one phandle to each lane
>> 
>> By doing the grouping in the driver, we also simplify the consumer
>> implementation. The MAC can always use a single phy, without worrying
>> about the actual number of lanes. This matches the hardware, since the
>> MAC is going to talk XGMII (or whatever) to the protocol controller
>> anyway.
> 
> XGMII is the link between the MAC and the PCS, but the "phys" phandle to
> the SerDes gives insight to the MAC driver way beyond the PCS layer.
> That kinda invalidates the idea that "you don't need to worry about the
> actual number of lanes". When you're a MAC driver with an XLAUI link and
> you need insight into the PMA/PMD layer, you'd better not be surprised
> about the fact that there are 4 lanes, at the very least?

Well, this is why they are condensed into a "lane group". From the MAC's
perspective the whole thing is opaque, and gets handled by the phy
subsystem.

>> I think it will be a lot easier to add multi-lane support with this
>> method because it gives the driver more information about what's going
>> on. The driver can control the whole configuration/reset process and the
>> timing.
> 
> Also, I'm thinking that if you support multi-lane (which dpaa2-mac currently
> doesn't, even in LSDK), you can't avoid multiple "phys" phandles in dpaa2-mac,
> and a "for" loop, eventually, anyway. That's because if your lanes have protocol
> retimers on them, those are going to be modeled as generic PHYs too. And those
> will not be bundled in these "groups", because they might be one chip per lane.
> 
> The retimer thing isn't theoretical, but, due to reasons independent of NXP,
> we lack the ability to provide an upstream user of the "lane retimer as
> generic PHY" functionality in dpaa2-mac. So it stays downstream for now.
> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2fgithub.com%2fnxp%2dqoriq%2flinux%2fcommit%2f627c5f626a13657f46f68b90882f329310e0e22f&umid=54bd2b00-07e7-4f55-8e2a-ed7b7cf7d142&auth=d807158c60b7d2502abde8a2fc01f40662980862-a6780f8e227e850b7785d5afbae1abed18ded9d9
> 
> So, if you're thinking of easing the work of the consumer side, I'm not
> sure that the gains will be that high.

Well, FWIW those serdes don't need good coordination. That said, I
think it might be interesting to have some subsystem-level support for
this, much like clock groups.

> Otherwise, I will repeat the idea that lynx-10g and lynx-28g should be
> treated in unison, because some drivers (dpaa2-mac, mtip_backplane) will
> have to interface with both, and I don't really believe that major deviations
> in software architecture between the 2 SerDes drivers are justifiable in
> any way (multi-protocol handled differently, for example).

Well, I think we should take the opportunity to think about the hardware
which exists and how we plan to model it. IMO grouping lanes into a
single phy simplifies both the phy driver and the mac driver.

--Sean
Vladimir Oltean Aug. 21, 2023, 11:59 p.m. UTC | #27
On Mon, Aug 21, 2023 at 07:39:15PM -0400, Sean Anderson wrote:
> Well, I think we should take the opportunity to think about the hardware
> which exists and how we plan to model it. IMO grouping lanes into a
> single phy simplifies both the phy driver and the mac driver.

Ok, but ungrouped for backplane and grouped for !backplane? For the KR
link modes, parallel link training, with separate consumers per lanes in
a group, will be needed per lane.
Ioana Ciornei Aug. 22, 2023, 2:55 p.m. UTC | #28
On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
> On 8/21/23 14:13, Ioana Ciornei wrote:
> > On Mon, Aug 21, 2023 at 01:45:44PM -0400, Sean Anderson wrote:
> >> Well, we have two pieces of information we need
> >> 
> >> - What values do we need to program in the PCCRs to select a particular
> >>   mode? This includes whether to e.g. set the KX bits.
> >> - Implied by the above, what protocols are supported on which lanes?
> >>   This is not strictly necessary, but will certainly solve a lot of
> >>   headscratching.
> >> 
> >> This information varies between different socs, and different serdes on
> >> the same socs. We can't really look at the RCW or the clocks and figure
> >> out what we need to program. So what are our options?
> >> 
> >> - We can have a separate compatible for each serdes on each SoC (e.g.
> >>   "fsl,lynx-10g-a"). This was rejected by the devicetree maintainers.

I previously took this statement at face value and didn't further
investigate. After a bit of digging through the first versions of this
patch set it's evident that you left out a big piece of information.

The devicetree maintainers have indeed rejected compatible strings of
the "fsl,<soc-name>-serdes-<instance>" form but they also suggested to
move the numbering to a property instead:

https://lore.kernel.org/all/db9d9455-37af-1616-8f7f-3d752e7930f1@linaro.org/

But instead of doing that, you chose to move all the different details
that vary between SerDes blocks/SoCs from the driver to the DTS. I don't
see that this was done in response to explicit feedback.

> >> - We can have one compatible for each SoC, and determine the serdes
> >>   based on the address. I would like to avoid this...
> > 
> > To me this really seems like a straightforward approach.
> 
> Indeed it would be straightforward, but what's the point of having a
> devicetree in the first place then? We could just go back to being a
> (non-dt) platform device.
> 

I am confused why you are now so adamant to have these details into the
DTS. Your first approach was to put them into the driver, not the DTS:

https://lore.kernel.org/netdev/20220628221404.1444200-5-sean.anderson@seco.com/

And this approach could still work now and get accepted by the device
tree maintainers. The only change that would be needed is to add a
property like "fsl,serdes-block-id = <1>".

> >> - We can stick all the details which vary between serdes/socs into the
> >>   device tree. This is very flexible, since supporting new SoCs is
> >>   mostly a matter of adding a new compatible and writing a new
> >>   devicetree. On the other hand, if you have a bug in your devicetree,
> >>   it's not easy to fix it in the kernel.
> >> - Just don't support protocol switching. The 28G driver does this, which
> >>   is why it only has one compatible. However, supporting protocol
> >>   switching is a core goal of this driver, so dropping support is not an
> >>   option.
> >> 

Ioana
Sean Anderson Aug. 24, 2023, 8:54 p.m. UTC | #29
On 8/22/23 10:55, Ioana Ciornei wrote:
> On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
>> On 8/21/23 14:13, Ioana Ciornei wrote:
>> > On Mon, Aug 21, 2023 at 01:45:44PM -0400, Sean Anderson wrote:
>> >> Well, we have two pieces of information we need
>> >> 
>> >> - What values do we need to program in the PCCRs to select a particular
>> >>   mode? This includes whether to e.g. set the KX bits.
>> >> - Implied by the above, what protocols are supported on which lanes?
>> >>   This is not strictly necessary, but will certainly solve a lot of
>> >>   headscratching.
>> >> 
>> >> This information varies between different socs, and different serdes on
>> >> the same socs. We can't really look at the RCW or the clocks and figure
>> >> out what we need to program. So what are our options?
>> >> 
>> >> - We can have a separate compatible for each serdes on each SoC (e.g.
>> >>   "fsl,lynx-10g-a"). This was rejected by the devicetree maintainers.
> 
> I previously took this statement at face value and didn't further
> investigate. After a bit of digging through the first versions of this
> patch set it's evident that you left out a big piece of information.
> 
> The devicetree maintainers have indeed rejected compatible strings of
> the "fsl,<soc-name>-serdes-<instance>" form but they also suggested to
> move the numbering to a property instead:
> 
> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2fall%2fdb9d9455%2d37af%2d1616%2d8f7f%2d3d752e7930f1%40linaro.org%2f&umid=2d629417-3b95-49e4-8cdb-34737cc93582&auth=d807158c60b7d2502abde8a2fc01f40662980862-895c2dfe1c33719569d44ae2b51e21f626f39d39
> 
> But instead of doing that, you chose to move all the different details
> that vary between SerDes blocks/SoCs from the driver to the DTS. I don't
> see that this was done in response to explicit feedback.
> 
>> >> - We can have one compatible for each SoC, and determine the serdes
>> >>   based on the address. I would like to avoid this...
>> > 
>> > To me this really seems like a straightforward approach.
>> 
>> Indeed it would be straightforward, but what's the point of having a
>> devicetree in the first place then? We could just go back to being a
>> (non-dt) platform device.
>> 
> 
> I am confused why you are now so adamant to have these details into the
> DTS. Your first approach was to put them into the driver, not the DTS:
> 
> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2fnetdev%2f20220628221404.1444200%2d5%2dsean.anderson%40seco.com%2f&umid=2d629417-3b95-49e4-8cdb-34737cc93582&auth=d807158c60b7d2502abde8a2fc01f40662980862-64ea8fa45172282f676b7463a5401e8a7c5bdcbf
> 
> And this approach could still work now and get accepted by the device
> tree maintainers. The only change that would be needed is to add a
> property like "fsl,serdes-block-id = <1>".

https://lore.kernel.org/linux-phy/1c2bbc12-0aa5-6d2a-c701-577ce70f7502@linaro.org/

Despite what he says in your link, I explicitly proposed doing exactly
that and he rejected it. I suspect that despite accusing me of
"twisting" the conversation, he did not clearly remember that
exchange...

That said, maybe we could do something like 

serdes: phy@1ea0000 {
	compatible = "fsl,ls1046a-serdes";
	reg = <0x1ea0000 0x1000>, <0x1eb0000 0x1000>;
};

--Sean
Sean Anderson Aug. 24, 2023, 10:09 p.m. UTC | #30
On 8/21/23 19:59, Vladimir Oltean wrote:
> On Mon, Aug 21, 2023 at 07:39:15PM -0400, Sean Anderson wrote:
>> Well, I think we should take the opportunity to think about the hardware
>> which exists and how we plan to model it. IMO grouping lanes into a
>> single phy simplifies both the phy driver and the mac driver.
> 
> Ok, but ungrouped for backplane and grouped for !backplane? For the KR
> link modes, parallel link training, with separate consumers per lanes in
> a group, will be needed per lane.

Hm, this is the sort of thing I hadn't considered since separate link
training isn't necessary for lynx 10g. But couldn't this be done by
adding a "lane" parameter to phy_configure_opts_xgkr?

Although, I am not sure how the driver is supposed to figure out what
coefficients to use. c73 implies that the training frame should be sent
on each lane. So I expected that there would be four copies of the
link coefficient registers. However, when reading the LX2160ARM, I only
saw one set of registers (e.g. 26.6.3.3). So is link training done
serially? I didn't see anything like a "lane select" field.

--Sean
Vladimir Oltean Aug. 25, 2023, 2:43 p.m. UTC | #31
On Thu, Aug 24, 2023 at 06:09:52PM -0400, Sean Anderson wrote:
> On 8/21/23 19:59, Vladimir Oltean wrote:
> > On Mon, Aug 21, 2023 at 07:39:15PM -0400, Sean Anderson wrote:
> >> Well, I think we should take the opportunity to think about the hardware
> >> which exists and how we plan to model it. IMO grouping lanes into a
> >> single phy simplifies both the phy driver and the mac driver.
> > 
> > Ok, but ungrouped for backplane and grouped for !backplane? For the KR
> > link modes, parallel link training, with separate consumers per lanes in
> > a group, will be needed per lane.
> 
> Hm, this is the sort of thing I hadn't considered since separate link
> training isn't necessary for lynx 10g. But couldn't this be done by
> adding a "lane" parameter to phy_configure_opts_xgkr?
> 
> Although, I am not sure how the driver is supposed to figure out what
> coefficients to use. c73 implies that the training frame should be sent
> on each lane. So I expected that there would be four copies of the
> link coefficient registers. However, when reading the LX2160ARM, I only
> saw one set of registers (e.g. 26.6.3.3). So is link training done
> serially? I didn't see anything like a "lane select" field.
> 
> --Sean

There is one AN/LT block replicated for each lane, even for multi-lane
backplane protocols. The primary (first) AN/LT block handles C73 autoneg
+ C73 link training on that lane, and the secondary AN/LT blocks handle
just link training on their respective lanes.

In other words, each AN/LT block needs to interact with just its lane
(SerDes PHY). A "lane" parameter could be added to phy_configure_opts_xgkr
to work around the "grouped lanes" design, but it would complicate the
consumer implementation, since the AN/LT block does not otherwise need
to know what is the index of the SerDes lane it is attached to (so it
would need something like an extra device tree property).

C72 link training is independent on each lane, has independent AN/LT
block MDIO registers, independent SerDes lane registers, and independent
training frame exchanges. There is no "lane select" field.

You can see the "LX2160A lanes A, B, C, D with SerDes 1 protocol 19:
dpmac2 uses 40GBase-KR4" example in my backplane dt-bindings patch,
which shows how on dpmac2's internal MDIO bus, there are AN/LT devices
at MDIO addresses 7, 6, 5 and 4, one for each lane.

I know that Lynx 10G doesn't do multi-lane backplane, but I wouldn't
want Lynx 10G and Lynx 28G to have different designs when it comes to
their handling of multi-lane. A single design that works for both would
be great.
Vladimir Oltean Sept. 13, 2023, 10:02 p.m. UTC | #32
Hi Sean,

On Thu, Aug 10, 2023 at 03:58:36PM -0400, Sean Anderson wrote:
> I can look into doing this. It will be in my free time, so it will
> likely be a bit before I can update this series.

I was expecting you'd ask some clarification questions about the RCW
override procedure that I've informally described over email, so I guess
you haven't spent any more time on this.

I'm letting you know that very soon, I will have to start my work on
porting the backplane driver posted here:
https://patchwork.kernel.org/project/netdevbpf/cover/20230817150644.3605105-1-vladimir.oltean@nxp.com/
to the Lynx 10G SoCs. And for that, I need a SerDes driver as a base :)

I was wondering how inclined are you to respond positively to the
feedback that the lynx-10g driver should have a look and feel as close
as possible to lynx-28g, given that they're very similar.

Because internally within NXP, we do have a version of the lynx-10g
driver which is contemporary with lynx-28g from mainline, but we didn't
publish it because protocol changes didn't work (for the same reason
that they don't work with your driver). With that driver, you can think
of the feedback about the similar look and feel as being "implicitly applied"
(being written by the same author), so I'm starting to consider more and
more seriously the option of basing my work on that instead of your v14
(on which I'd need to spend extra time to modify the dt-bindings with PCCRs,
concept of lane groups, concept of PLL CCF driver, etc).

What are your thoughts?