Message ID | 20200601203041.58631-1-kurt@intricatesoftware.com |
---|---|
State | New |
Headers | show |
Series | rockchip: Add delay after link-training | expand |
On Tue, Jun 2, 2020 at 2:00 AM Kurt Miller <kurt at intricatesoftware.com> wrote: > > On at least the RockPro64, many cards will trip a > synchronous abort when first accessing PCIe config space > during bus scanning. A delay after link training allows > some of these cards to function. Can you check does the SoC has external PCIe pwr-pin GPIO? I did see unstable SSD behavior on rock960 but fixed with this. https://github.com/radxa/u-boot/blob/stable-4.4-rockpi4/board/rockchip/evb_rk3399/evb-rk3399.c#L168
On Tue, 2020-06-02 at 02:16 +0530, Jagan Teki wrote: > On Tue, Jun 2, 2020 at 2:00 AM Kurt Miller <kurt at intricatesoftware.com> wrote: > > > > > > On at least the RockPro64, many cards will trip a > > synchronous abort when first accessing PCIe config space > > during bus scanning. A delay after link training allows > > some of these cards to function. > Can you check does the SoC has external PCIe pwr-pin GPIO? > > I did see unstable SSD behavior on rock960 but fixed with this. > https://github.com/radxa/u-boot/blob/stable-4.4-rockpi4/board/rockchip/evb_rk3399/evb-rk3399.c#L168 The schematic has: GPIO1_D0/TCPD_VBUS_SOURCE2_d ---L26---->>PCIE_PWR and?arch/arm/dts/rk3399-rockpro64.dtsi has: &pinctrl { ? ? ? ? pcie { ? ? ? ? ? ? ? ? pcie_pwr_en: pcie-pwr-en { ????????????????????????rockchip,pins = <1 RK_PD0 RK_FUNC_GPIO &pcfg_pull_none>; ????????????????}; ????????}; }; Does that answer your question? I'm rather new at this so I may need more guidance if I miss understood your question. Thanks, -Kurt
Hi Kurt, On 2020/6/2 ??4:30, Kurt Miller wrote: > On at least the RockPro64, many cards will trip a > synchronous abort when first accessing PCIe config space > during bus scanning. A delay after link training allows > some of these cards to function. > > Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> > --- > On the RockPro64, some pci cards trip a synchronous abort when scanning the > pci bus. For example with HighPoint Rocket Raid 640L which is based on > Marvell 88SE9230 I see this: > > => pci > "Synchronous Abort" handler, esr 0x96000210 > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > elr: 00000000f4568034 lr : 00000000f4567fd0 > x0 : 0000000000100000 x1 : 00000000f8000000 > x2 : 0000000000000000 x3 : 0000000000100000 > x4 : 00000000f2559290 x5 : 0000000000000000 > x6 : 0000000000000001 x7 : 00000000f2559860 > x8 : 0000000000000030 x9 : 0000000000000008 > x10: 0000000000000010 x11: 00000000f251fd1c > x12: 0000000000001421 x13: 0000000000001468 > x14: 00000000f251fd4c x15: 00000000ffffffff > x16: 0000000000060001 x17: 000000000000001f > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > x20: 0000000000000001 x21: 0000000000000000 > x22: 0000000000010000 x23: 00000000f45d4000 > x24: 0000000000000000 x25: 00000000f45bc000 > x26: 0000000000000000 x27: 0000000000000000 > x28: 00000000f2541440 x29: 00000000f251fc20 > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > Resetting CPU ... > > Adding a delay after link training works-around the problem. I added this > delay to the OpenBSD rkpcie driver as well: > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield SAS9211-4i > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > The delay work-around was originally discovered by nuumio: > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee > > drivers/pci/pcie_rockchip.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > index 0edc2464a8..51cfbf6b18 100644 > --- a/drivers/pci/pcie_rockchip.c > +++ b/drivers/pci/pcie_rockchip.c > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice *dev) > goto err_power_off_phy; > } > > + /* > + * XXX: On at least the RockPro64, many cards will trip a > + * synchronous abort when first accessing PCIe config space > + * during bus scanning. A delay after link training allows > + * some of these cards to function. > + */ > + mdelay(2000); I don't understand what kind of delay for init needs 2 Seconds, root cause will preferred. Thanks, - Kever > + > /* Initialize Root Complex registers. */ > writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + PCIE_LM_VENDOR_ID); > writel(PCI_CLASS_BRIDGE_PCI << 16,
? 2020/6/2 9:59, Kever Yang ??: > Hi Kurt, > > On 2020/6/2 ??4:30, Kurt Miller wrote: >> On at least the RockPro64, many cards will trip a >> synchronous abort when first accessing PCIe config space >> during bus scanning. A delay after link training allows >> some of these cards to function. >> >> Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> >> --- >> On the RockPro64, some pci cards trip a synchronous abort when >> scanning the >> pci bus. For example with HighPoint Rocket Raid 640L which is based on >> Marvell 88SE9230 I see this: >> >> => pci >> "Synchronous Abort" handler, esr 0x96000210 >> elr: 000000000022d034 lr : 000000000022cfd0 (reloc) >> elr: 00000000f4568034 lr : 00000000f4567fd0 >> x0 : 0000000000100000 x1 : 00000000f8000000 >> x2 : 0000000000000000 x3 : 0000000000100000 >> x4 : 00000000f2559290 x5 : 0000000000000000 >> x6 : 0000000000000001 x7 : 00000000f2559860 >> x8 : 0000000000000030 x9 : 0000000000000008 >> x10: 0000000000000010 x11: 00000000f251fd1c >> x12: 0000000000001421 x13: 0000000000001468 >> x14: 00000000f251fd4c x15: 00000000ffffffff >> x16: 0000000000060001 x17: 000000000000001f >> x18: 00000000f2532dc0 x19: 00000000f251fcd0 >> x20: 0000000000000001 x21: 0000000000000000 >> x22: 0000000000010000 x23: 00000000f45d4000 >> x24: 0000000000000000 x25: 00000000f45bc000 >> x26: 0000000000000000 x27: 0000000000000000 >> x28: 00000000f2541440 x29: 00000000f251fc20 >> >> Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) >> Resetting CPU ... >> >> Adding a delay after link training works-around the problem. I added this >> delay to the OpenBSD rkpcie driver as well: >> >> https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 >> >> >> HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield >> SAS9211-4i >> needs a 1 second delay, so I arbitrarily decided on 2 seconds. >> >> The delay work-around was originally discovered by nuumio: >> https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee >> >> >> ? drivers/pci/pcie_rockchip.c | 8 ++++++++ >> ? 1 file changed, 8 insertions(+) >> >> diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c >> index 0edc2464a8..51cfbf6b18 100644 >> --- a/drivers/pci/pcie_rockchip.c >> +++ b/drivers/pci/pcie_rockchip.c >> @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice >> *dev) >> ????????? goto err_power_off_phy; >> ????? } >> +??? /* >> +???? * XXX: On at least the RockPro64, many cards will trip a >> +???? * synchronous abort when first accessing PCIe config space >> +???? * during bus scanning. A delay after link training allows >> +???? * some of these cards to function. >> +???? */ >> +??? mdelay(2000); > > I don't understand what kind of delay for init needs 2 Seconds, root > cause will preferred. Strictly speaking, how long should we need for this had been provided, see this: https://patchwork.kernel.org/patch/11561977/ If you need more delay, I highly suspect you should check if the power supply is stable before enabling training. > > > Thanks, > > - Kever > >> + >> ????? /* Initialize Root Complex registers. */ >> ????? writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + >> PCIE_LM_VENDOR_ID); >> ????? writel(PCI_CLASS_BRIDGE_PCI << 16, > > >
On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: > > ? 2020/6/2 9:59, Kever Yang ??: > > > > Hi Kurt, > > > > On 2020/6/2 ??4:30, Kurt Miller wrote: > > > > > > On at least the RockPro64, many cards will trip a > > > synchronous abort when first accessing PCIe config space > > > during bus scanning. A delay after link training allows > > > some of these cards to function. > > > > > > Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> > > > --- > > > On the RockPro64, some pci cards trip a synchronous abort when? > > > scanning the > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on > > > Marvell 88SE9230 I see this: > > > > > > => pci > > > "Synchronous Abort" handler, esr 0x96000210 > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > > > elr: 00000000f4568034 lr : 00000000f4567fd0 > > > x0 : 0000000000100000 x1 : 00000000f8000000 > > > x2 : 0000000000000000 x3 : 0000000000100000 > > > x4 : 00000000f2559290 x5 : 0000000000000000 > > > x6 : 0000000000000001 x7 : 00000000f2559860 > > > x8 : 0000000000000030 x9 : 0000000000000008 > > > x10: 0000000000000010 x11: 00000000f251fd1c > > > x12: 0000000000001421 x13: 0000000000001468 > > > x14: 00000000f251fd4c x15: 00000000ffffffff > > > x16: 0000000000060001 x17: 000000000000001f > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > > > x20: 0000000000000001 x21: 0000000000000000 > > > x22: 0000000000010000 x23: 00000000f45d4000 > > > x24: 0000000000000000 x25: 00000000f45bc000 > > > x26: 0000000000000000 x27: 0000000000000000 > > > x28: 00000000f2541440 x29: 00000000f251fc20 > > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > > > Resetting CPU ... > > > > > > Adding a delay after link training works-around the problem. I added this > > > delay to the OpenBSD rkpcie driver as well: > > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87? > > > > > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield? > > > SAS9211-4i > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > > > > > The delay work-around was originally discovered by nuumio: > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee? > > > > > > > > > ? drivers/pci/pcie_rockchip.c | 8 ++++++++ > > > ? 1 file changed, 8 insertions(+) > > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > > > index 0edc2464a8..51cfbf6b18 100644 > > > --- a/drivers/pci/pcie_rockchip.c > > > +++ b/drivers/pci/pcie_rockchip.c > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice? > > > *dev) > > > ????????? goto err_power_off_phy; > > > ????? } > > > +??? /* > > > +???? * XXX: On at least the RockPro64, many cards will trip a > > > +???? * synchronous abort when first accessing PCIe config space > > > +???? * during bus scanning. A delay after link training allows > > > +???? * some of these cards to function. > > > +???? */ > > > +??? mdelay(2000); > > I don't understand what kind of delay for init needs 2 Seconds, root? > > cause will preferred. Hi Kever, While working on this issue for the OpenBSD PCIe driver I was not able to determine the root cause. I tested the following adapters: ROCKPro64 2 Port SATA StarTech PEXSAT32 2 Port SATA Samsung 970 Evo NVMe w/m.2 adapter IO CREST SI-PEX40148 2 Port SATA IO CREST SI-PEX40057 4 port? HighPoint Rocket Raid 640L Crossfield SAS9211-4i Del PERC H200 Dell PERC 6/i Intel Gigabit VT Quad Port Server All of the above adapters successfully link trained, however three of them would panic upon the first read of the PCI config space.?nuumio's work-around of placing a delay after link-training allows these cards to work. HighPoint Rocket Raid 640L ~1.75 sec delay Crossfield SAS9211-4i ~1 sec delay Dell PERC H200 ~1 sec delay In attempt to determine if there was a way to detect how long to wait, I compared every status and debug register documented in the rk3399 TRM part 2 for the PCI controller. I compared the values pre-delay and post-delay. I was not able to find a value that would indicate it was safe to proceed. > Strictly speaking, how long should we need for this had been provided, > see this: > > https://patchwork.kernel.org/patch/11561977/ > > If you need more delay, I??highly suspect you should check if > the power supply is stable before enabling training. Hi Shawn, Thank you for pointing out the need for the 100ms delay before beginning link-training. I believe this is to correct link training failures, while the delay after link training is to work-around post link training reading of PCI config space on certain cards. In my testing I discussed above, I suspected that power was a likely cause. The HighPoint Rocket Raid 640L is a good test card because it documents its 3.3v power at 0.7 watts: https://highpoint-tech.com/USA_new/series_r600-specifications.htm How can I check if the power supply is stable? Regards, -Kurt > > > > > > > > Thanks, > > > > - Kever > > > > > > > > + > > > ????? /* Initialize Root Complex registers. */ > > > ????? writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base +? > > > PCIE_LM_VENDOR_ID); > > > ????? writel(PCI_CLASS_BRIDGE_PCI << 16, > > > > >
On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <kurt at intricatesoftware.com> wrote: > > On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: > > > > ? 2020/6/2 9:59, Kever Yang ??: > > > > > > Hi Kurt, > > > > > > On 2020/6/2 ??4:30, Kurt Miller wrote: > > > > > > > > On at least the RockPro64, many cards will trip a > > > > synchronous abort when first accessing PCIe config space > > > > during bus scanning. A delay after link training allows > > > > some of these cards to function. > > > > > > > > Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> > > > > --- > > > > On the RockPro64, some pci cards trip a synchronous abort when > > > > scanning the > > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on > > > > Marvell 88SE9230 I see this: > > > > > > > > => pci > > > > "Synchronous Abort" handler, esr 0x96000210 > > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > > > > elr: 00000000f4568034 lr : 00000000f4567fd0 > > > > x0 : 0000000000100000 x1 : 00000000f8000000 > > > > x2 : 0000000000000000 x3 : 0000000000100000 > > > > x4 : 00000000f2559290 x5 : 0000000000000000 > > > > x6 : 0000000000000001 x7 : 00000000f2559860 > > > > x8 : 0000000000000030 x9 : 0000000000000008 > > > > x10: 0000000000000010 x11: 00000000f251fd1c > > > > x12: 0000000000001421 x13: 0000000000001468 > > > > x14: 00000000f251fd4c x15: 00000000ffffffff > > > > x16: 0000000000060001 x17: 000000000000001f > > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > > > > x20: 0000000000000001 x21: 0000000000000000 > > > > x22: 0000000000010000 x23: 00000000f45d4000 > > > > x24: 0000000000000000 x25: 00000000f45bc000 > > > > x26: 0000000000000000 x27: 0000000000000000 > > > > x28: 00000000f2541440 x29: 00000000f251fc20 > > > > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > > > > Resetting CPU ... > > > > > > > > Adding a delay after link training works-around the problem. I added this > > > > delay to the OpenBSD rkpcie driver as well: > > > > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 > > > > > > > > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield > > > > SAS9211-4i > > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > > > > > > > The delay work-around was originally discovered by nuumio: > > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee > > > > > > > > > > > > drivers/pci/pcie_rockchip.c | 8 ++++++++ > > > > 1 file changed, 8 insertions(+) > > > > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > > > > index 0edc2464a8..51cfbf6b18 100644 > > > > --- a/drivers/pci/pcie_rockchip.c > > > > +++ b/drivers/pci/pcie_rockchip.c > > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice > > > > *dev) > > > > goto err_power_off_phy; > > > > } > > > > + /* > > > > + * XXX: On at least the RockPro64, many cards will trip a > > > > + * synchronous abort when first accessing PCIe config space > > > > + * during bus scanning. A delay after link training allows > > > > + * some of these cards to function. > > > > + */ > > > > + mdelay(2000); > > > I don't understand what kind of delay for init needs 2 Seconds, root > > > cause will preferred. > > Hi Kever, > > While working on this issue for the OpenBSD PCIe driver I was not > able to determine the root cause. I tested the following adapters: > > ROCKPro64 2 Port SATA > StarTech PEXSAT32 2 Port SATA > Samsung 970 Evo NVMe w/m.2 adapter > IO CREST SI-PEX40148 2 Port SATA > IO CREST SI-PEX40057 4 port > HighPoint Rocket Raid 640L > Crossfield SAS9211-4i > Del PERC H200 > Dell PERC 6/i > Intel Gigabit VT Quad Port Server > > All of the above adapters successfully link trained, however > three of them would panic upon the first read of the PCI config > space. nuumio's work-around of placing a delay after link-training > allows these cards to work. > > HighPoint Rocket Raid 640L ~1.75 sec delay > Crossfield SAS9211-4i ~1 sec delay > Dell PERC H200 ~1 sec delay > > In attempt to determine if there was a way to detect how long > to wait, I compared every status and debug register documented > in the rk3399 TRM part 2 for the PCI controller. I compared the > values pre-delay and post-delay. I was not able to find a value > that would indicate it was safe to proceed. > > > Strictly speaking, how long should we need for this had been provided, > > see this: > > > > https://patchwork.kernel.org/patch/11561977/ > > > > If you need more delay, I highly suspect you should check if > > the power supply is stable before enabling training. > > Hi Shawn, > > Thank you for pointing out the need for the 100ms delay before > beginning link-training. I believe this is to correct link > training failures, while the delay after link training is > to work-around post link training reading of PCI config > space on certain cards. There are similar issues on the Linux driver with various cards randomly throwing an abort. If it is power, a 2 second wait to allow cards to stabilize after power on might be wise for the Linux driver as well. I imagine some cards take longer to complete power on reset than others, and attempting to read them immediately after powering them up could be the issue here. > > In my testing I discussed above, I suspected that power > was a likely cause. The HighPoint Rocket Raid 640L is > a good test card because it documents its 3.3v power at > 0.7 watts: > > https://highpoint-tech.com/USA_new/series_r600-specifications.htm > > How can I check if the power supply is stable? > > Regards, > -Kurt > > > > > > > > > > > > > Thanks, > > > > > > - Kever > > > > > > > > > > > + > > > > /* Initialize Root Complex registers. */ > > > > writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + > > > > PCIE_LM_VENDOR_ID); > > > > writel(PCI_CLASS_BRIDGE_PCI << 16, > > > > > > > > >
Hi Kurt, On 2020/6/4 ??5:17, Peter Geis wrote: > On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <kurt at intricatesoftware.com> wrote: >> On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: >>> ? 2020/6/2 9:59, Kever Yang ??: >>>> Hi Kurt, >>>> >>>> On 2020/6/2 ??4:30, Kurt Miller wrote: >>>>> On at least the RockPro64, many cards will trip a >>>>> synchronous abort when first accessing PCIe config space >>>>> during bus scanning. A delay after link training allows >>>>> some of these cards to function. >>>>> >>>>> Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> >>>>> --- >>>>> On the RockPro64, some pci cards trip a synchronous abort when >>>>> scanning the >>>>> pci bus. For example with HighPoint Rocket Raid 640L which is based on >>>>> Marvell 88SE9230 I see this: >>>>> >>>>> => pci >>>>> "Synchronous Abort" handler, esr 0x96000210 >>>>> elr: 000000000022d034 lr : 000000000022cfd0 (reloc) >>>>> elr: 00000000f4568034 lr : 00000000f4567fd0 >>>>> x0 : 0000000000100000 x1 : 00000000f8000000 >>>>> x2 : 0000000000000000 x3 : 0000000000100000 >>>>> x4 : 00000000f2559290 x5 : 0000000000000000 >>>>> x6 : 0000000000000001 x7 : 00000000f2559860 >>>>> x8 : 0000000000000030 x9 : 0000000000000008 >>>>> x10: 0000000000000010 x11: 00000000f251fd1c >>>>> x12: 0000000000001421 x13: 0000000000001468 >>>>> x14: 00000000f251fd4c x15: 00000000ffffffff >>>>> x16: 0000000000060001 x17: 000000000000001f >>>>> x18: 00000000f2532dc0 x19: 00000000f251fcd0 >>>>> x20: 0000000000000001 x21: 0000000000000000 >>>>> x22: 0000000000010000 x23: 00000000f45d4000 >>>>> x24: 0000000000000000 x25: 00000000f45bc000 >>>>> x26: 0000000000000000 x27: 0000000000000000 >>>>> x28: 00000000f2541440 x29: 00000000f251fc20 >>>>> >>>>> Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) >>>>> Resetting CPU ... >>>>> >>>>> Adding a delay after link training works-around the problem. I added this >>>>> delay to the OpenBSD rkpcie driver as well: >>>>> >>>>> https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 >>>>> >>>>> >>>>> HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield >>>>> SAS9211-4i >>>>> needs a 1 second delay, so I arbitrarily decided on 2 seconds. >>>>> >>>>> The delay work-around was originally discovered by nuumio: >>>>> https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee >>>>> >>>>> >>>>> drivers/pci/pcie_rockchip.c | 8 ++++++++ >>>>> 1 file changed, 8 insertions(+) >>>>> >>>>> diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c >>>>> index 0edc2464a8..51cfbf6b18 100644 >>>>> --- a/drivers/pci/pcie_rockchip.c >>>>> +++ b/drivers/pci/pcie_rockchip.c >>>>> @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice >>>>> *dev) >>>>> goto err_power_off_phy; >>>>> } >>>>> + /* >>>>> + * XXX: On at least the RockPro64, many cards will trip a >>>>> + * synchronous abort when first accessing PCIe config space >>>>> + * during bus scanning. A delay after link training allows >>>>> + * some of these cards to function. >>>>> + */ >>>>> + mdelay(2000); >>>> I don't understand what kind of delay for init needs 2 Seconds, root >>>> cause will preferred. >> Hi Kever, >> >> While working on this issue for the OpenBSD PCIe driver I was not >> able to determine the root cause. I tested the following adapters: >> >> ROCKPro64 2 Port SATA >> StarTech PEXSAT32 2 Port SATA >> Samsung 970 Evo NVMe w/m.2 adapter >> IO CREST SI-PEX40148 2 Port SATA >> IO CREST SI-PEX40057 4 port >> HighPoint Rocket Raid 640L >> Crossfield SAS9211-4i >> Del PERC H200 >> Dell PERC 6/i >> Intel Gigabit VT Quad Port Server >> >> All of the above adapters successfully link trained, however >> three of them would panic upon the first read of the PCI config >> space. nuumio's work-around of placing a delay after link-training >> allows these cards to work. >> >> HighPoint Rocket Raid 640L ~1.75 sec delay >> Crossfield SAS9211-4i ~1 sec delay >> Dell PERC H200 ~1 sec delay >> >> In attempt to determine if there was a way to detect how long >> to wait, I compared every status and debug register documented >> in the rk3399 TRM part 2 for the PCI controller. I compared the >> values pre-delay and post-delay. I was not able to find a value >> that would indicate it was safe to proceed. >> >>> Strictly speaking, how long should we need for this had been provided, >>> see this: >>> >>> https://patchwork.kernel.org/patch/11561977/ I can accept the same fix like kernel which is 100ms, but 2 Second is really too much for most cases. Thanks, - Kever >>> >>> If you need more delay, I highly suspect you should check if >>> the power supply is stable before enabling training. >> Hi Shawn, >> >> Thank you for pointing out the need for the 100ms delay before >> beginning link-training. I believe this is to correct link >> training failures, while the delay after link training is >> to work-around post link training reading of PCI config >> space on certain cards. > There are similar issues on the Linux driver with various cards > randomly throwing an abort. > If it is power, a 2 second wait to allow cards to stabilize after > power on might be wise for the Linux driver as well. > > I imagine some cards take longer to complete power on reset than > others, and attempting to read them immediately after powering them up > could be the issue here. > >> In my testing I discussed above, I suspected that power >> was a likely cause. The HighPoint Rocket Raid 640L is >> a good test card because it documents its 3.3v power at >> 0.7 watts: >> >> https://highpoint-tech.com/USA_new/series_r600-specifications.htm >> >> How can I check if the power supply is stable? >> >> Regards, >> -Kurt >> >>>> >>>> >>>> Thanks, >>>> >>>> - Kever >>>> >>>>> + >>>>> /* Initialize Root Complex registers. */ >>>>> writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + >>>>> PCIE_LM_VENDOR_ID); >>>>> writel(PCI_CLASS_BRIDGE_PCI << 16, >>>>
On Sat, 2020-06-27 at 20:57 +0800, Kever Yang wrote: > Hi Kurt, > > > On 2020/6/4 ??5:17, Peter Geis wrote: > > > > > On Tue, Jun 2, 2020 at 11:12 AM Kurt Miller <kurt at intricatesoftware.com> wrote: > > > > > > On Tue, 2020-06-02 at 10:23 +0800, Shawn Lin wrote: > > > > > > > > ? 2020/6/2 9:59, Kever Yang ??: > > > > > > > > > > Hi Kurt, > > > > > > > > > > On 2020/6/2 ??4:30, Kurt Miller wrote: > > > > > > > > > > > > On at least the RockPro64, many cards will trip a > > > > > > synchronous abort when first accessing PCIe config space > > > > > > during bus scanning. A delay after link training allows > > > > > > some of these cards to function. > > > > > > > > > > > > Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> > > > > > > --- > > > > > > On the RockPro64, some pci cards trip a synchronous abort when > > > > > > scanning the > > > > > > pci bus. For example with HighPoint Rocket Raid 640L which is based on > > > > > > Marvell 88SE9230 I see this: > > > > > > > > > > > > => pci > > > > > > "Synchronous Abort"?handler, esr 0x96000210 > > > > > > elr: 000000000022d034 lr : 000000000022cfd0 (reloc) > > > > > > elr: 00000000f4568034 lr : 00000000f4567fd0 > > > > > > x0 : 0000000000100000 x1 : 00000000f8000000 > > > > > > x2 : 0000000000000000 x3 : 0000000000100000 > > > > > > x4 : 00000000f2559290 x5 : 0000000000000000 > > > > > > x6 : 0000000000000001 x7 : 00000000f2559860 > > > > > > x8 : 0000000000000030 x9 : 0000000000000008 > > > > > > x10: 0000000000000010 x11: 00000000f251fd1c > > > > > > x12: 0000000000001421 x13: 0000000000001468 > > > > > > x14: 00000000f251fd4c x15: 00000000ffffffff > > > > > > x16: 0000000000060001 x17: 000000000000001f > > > > > > x18: 00000000f2532dc0 x19: 00000000f251fcd0 > > > > > > x20: 0000000000000001 x21: 0000000000000000 > > > > > > x22: 0000000000010000 x23: 00000000f45d4000 > > > > > > x24: 0000000000000000 x25: 00000000f45bc000 > > > > > > x26: 0000000000000000 x27: 0000000000000000 > > > > > > x28: 00000000f2541440 x29: 00000000f251fc20 > > > > > > > > > > > > Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) > > > > > > Resetting CPU ... > > > > > > > > > > > > Adding a delay after link training works-around the problem. I added this > > > > > > delay to the OpenBSD rkpcie driver as well: > > > > > > > > > > > > https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 > > > > > > > > > > > > > > > > > > HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield > > > > > > SAS9211-4i > > > > > > needs a 1 second delay, so I arbitrarily decided on 2 seconds. > > > > > > > > > > > > The delay work-around was originally discovered by nuumio: > > > > > > https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee > > > > > > > > > > > > > > > > > > ???drivers/pci/pcie_rockchip.c | 8 ++++++++ > > > > > > ???1 file changed, 8 insertions(+) > > > > > > > > > > > > diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c > > > > > > index 0edc2464a8..51cfbf6b18 100644 > > > > > > --- a/drivers/pci/pcie_rockchip.c > > > > > > +++ b/drivers/pci/pcie_rockchip.c > > > > > > @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice > > > > > > *dev) > > > > > > ???????????goto err_power_off_phy; > > > > > > ???????} > > > > > > +????/* > > > > > > +?????* XXX: On at least the RockPro64, many cards will trip a > > > > > > +?????* synchronous abort when first accessing PCIe config space > > > > > > +?????* during bus scanning. A delay after link training allows > > > > > > +?????* some of these cards to function. > > > > > > +?????*/ > > > > > > +????mdelay(2000); > > > > > I don't understand what kind of delay for init needs 2 Seconds, root > > > > > cause will preferred. > > > Hi Kever, > > > > > > While working on this issue for the OpenBSD PCIe driver I was not > > > able to determine the root cause. I tested the following adapters: > > > > > > ROCKPro64 2 Port SATA > > > StarTech PEXSAT32 2 Port SATA > > > Samsung 970 Evo NVMe w/m.2 adapter > > > IO CREST SI-PEX40148 2 Port SATA > > > IO CREST SI-PEX40057 4 port > > > HighPoint Rocket Raid 640L > > > Crossfield SAS9211-4i > > > Del PERC H200 > > > Dell PERC 6/i > > > Intel Gigabit VT Quad Port Server > > > > > > All of the above adapters successfully link trained, however > > > three of them would panic upon the first read of the PCI config > > > space. nuumio's work-around of placing a delay after link-training > > > allows these cards to work. > > > > > > HighPoint Rocket Raid 640L ~1.75 sec delay > > > Crossfield SAS9211-4i ~1 sec delay > > > Dell PERC H200 ~1 sec delay > > > > > > In attempt to determine if there was a way to detect how long > > > to wait, I compared every status and debug register documented > > > in the rk3399 TRM part 2 for the PCI controller. I compared the > > > values pre-delay and post-delay. I was not able to find a value > > > that would indicate it was safe to proceed. > > > > > > > > > > > Strictly speaking, how long should we need for this had been provided, > > > > see this: > > > > > > > > https://patchwork.kernel.org/patch/11561977/ > I can accept the same fix like kernel which is 100ms, but 2 Second is? > really too much for most cases. > Hi Kever, Thank you for following up on this. The 100ms delay before link-training appears to be solving a different issue where some cards fail to link train. From the comment in the referenced patch "Otherwise we do see some failures for link training." The delay I am proposing is after successful link-training to avoid the synchronous abort that happens when scanning the pci bus and attempting read the pci config space for some cards. I performed a test where I added the 100ms delay before link-training and it did not correct the synchronous abort on the?HighPoint Rocket Raid 640L. Shawn Lin suggested I check if the power supply is stable before enabling training. How would I go about doing this?? Regards, -Kurt > > Thanks, > > - Kever > > > > > > > > > > > > > > > > > > If you need more delay, I??highly suspect you should check if > > > > the power supply is stable before enabling training. > > > Hi Shawn, > > > > > > Thank you for pointing out the need for the 100ms delay before > > > beginning link-training. I believe this is to correct link > > > training failures, while the delay after link training is > > > to work-around post link training reading of PCI config > > > space on certain cards. > > There are similar issues on the Linux driver with various cards > > randomly throwing an abort. > > If it is power, a 2 second wait to allow cards to stabilize after > > power on might be wise for the Linux driver as well. > > > > I imagine some cards take longer to complete power on reset than > > others, and attempting to read them immediately after powering them up > > could be the issue here. > > > > > > > > In my testing I discussed above, I suspected that power > > > was a likely cause. The HighPoint Rocket Raid 640L is > > > a good test card because it documents its 3.3v power at > > > 0.7 watts: > > > > > > https://highpoint-tech.com/USA_new/series_r600-specifications.htm > > > > > > How can I check if the power supply is stable? > > > > > > Regards, > > > -Kurt > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > - Kever > > > > > > > > > > > > > > > > > + > > > > > > ???????/* Initialize Root Complex registers. */ > > > > > > ???????writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + > > > > > > PCIE_LM_VENDOR_ID); > > > > > > ???????writel(PCI_CLASS_BRIDGE_PCI << 16, >
diff --git a/drivers/pci/pcie_rockchip.c b/drivers/pci/pcie_rockchip.c index 0edc2464a8..51cfbf6b18 100644 --- a/drivers/pci/pcie_rockchip.c +++ b/drivers/pci/pcie_rockchip.c @@ -288,6 +288,14 @@ static int rockchip_pcie_init_port(struct udevice *dev) goto err_power_off_phy; } + /* + * XXX: On at least the RockPro64, many cards will trip a + * synchronous abort when first accessing PCIe config space + * during bus scanning. A delay after link training allows + * some of these cards to function. + */ + mdelay(2000); + /* Initialize Root Complex registers. */ writel(PCIE_LM_VENDOR_ROCKCHIP, priv->apb_base + PCIE_LM_VENDOR_ID); writel(PCI_CLASS_BRIDGE_PCI << 16,
On at least the RockPro64, many cards will trip a synchronous abort when first accessing PCIe config space during bus scanning. A delay after link training allows some of these cards to function. Signed-off-by: Kurt Miller <kurt at intricatesoftware.com> --- On the RockPro64, some pci cards trip a synchronous abort when scanning the pci bus. For example with HighPoint Rocket Raid 640L which is based on Marvell 88SE9230 I see this: => pci "Synchronous Abort" handler, esr 0x96000210 elr: 000000000022d034 lr : 000000000022cfd0 (reloc) elr: 00000000f4568034 lr : 00000000f4567fd0 x0 : 0000000000100000 x1 : 00000000f8000000 x2 : 0000000000000000 x3 : 0000000000100000 x4 : 00000000f2559290 x5 : 0000000000000000 x6 : 0000000000000001 x7 : 00000000f2559860 x8 : 0000000000000030 x9 : 0000000000000008 x10: 0000000000000010 x11: 00000000f251fd1c x12: 0000000000001421 x13: 0000000000001468 x14: 00000000f251fd4c x15: 00000000ffffffff x16: 0000000000060001 x17: 000000000000001f x18: 00000000f2532dc0 x19: 00000000f251fcd0 x20: 0000000000000001 x21: 0000000000000000 x22: 0000000000010000 x23: 00000000f45d4000 x24: 0000000000000000 x25: 00000000f45bc000 x26: 0000000000000000 x27: 0000000000000000 x28: 00000000f2541440 x29: 00000000f251fc20 Code: 540000c1 350000a5 93407c00 f9400081 (b8616800) Resetting CPU ... Adding a delay after link training works-around the problem. I added this delay to the OpenBSD rkpcie driver as well: https://github.com/openbsd/src/commit/9857dee3520d8ca5bec68538f4b0708d7e64fc87 HighPoint Rocket Raid 640L needs a 1.75 sec delay and Crossfield SAS9211-4i needs a 1 second delay, so I arbitrarily decided on 2 seconds. The delay work-around was originally discovered by nuumio: https://github.com/nuumio/linux-kernel/commit/5a65b17686002dc84d461bffa324a2cb68e67aee drivers/pci/pcie_rockchip.c | 8 ++++++++ 1 file changed, 8 insertions(+)