Message ID | 449292CA-CE60-4B90-90F7-295FBFEAB3F8@kohlschutter.com |
---|---|
State | New |
Series | [v3] arm64: dts: rockchip: Fix SD card init on rk3399-nanopi4 |
On 2022-07-15 18:16, Christian Kohlschütter wrote:
> OK, this took me a while to figure out.
>
> When no undervoltage limit is configured, I can reliably trigger the initialization bug upon boot.
> When the limit is set to 3.0V, it rarely occurs, but just after I sent the v3 patch, I was able to reproduce...

Well this has to be in the running for "weirdest placebo ever"... :/

All it actually seems to achieve is printing an error[1] (this is after all a tiny 5-pin fixed-voltage LDO regulator, not an intelligent PMIC), and if that makes an appreciable difference then there has to be some kind of weird timing condition at play. Maybe regulator_register() ends up turning it off and on again rapidly enough that the card sees a voltage brownout and glitches, and adding more delay by printing to the console somewhere in the middle gives it enough time to act as a proper power cycle with no ill effect?

If you just whack something like an mdelay(500) at around that point in set_machine_constraints(), without the DT property, does it have the same effect?

Robin.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/regulator/core.c#n1521

>> On 15.07.2022 at 19:12, Christian Kohlschütter <christian@kohlschutter.com> wrote:
>>
>> mmc/SD-card initialization may fail on NanoPi R4S with
>> "mmc1: problem reading SD Status register" /
>> "mmc1: error -110 whilst initialising SD card"
>> either on cold boot or after a reboot.
>>
>> Moreover, the system would also sometimes hang upon reboot.
>>
>> This is prevented by setting an explicit undervoltage protection limit
>> for the SD-card-specific vcc3v0_sd voltage regulator.
>>
>> Set the undervoltage protection limit to 2.7V, which is the minimum
>> permissible SD card operating voltage.
>>
>> Signed-off-by: Christian Kohlschütter <christian@kohlschutter.com>
>> ---
>>  arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi | 4 ++++
>>  1 file changed, 4 insertions(+)
>>  mode change 100644 => 100755 arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
>>
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
>> old mode 100644
>> new mode 100755
>> index 8c0ff6c96e03..669c74ce4d13
>> --- a/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
>> +++ b/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
>> @@ -73,6 +73,10 @@ vcc3v0_sd: vcc3v0-sd {
>>  		regulator-always-on;
>>  		regulator-min-microvolt = <3000000>;
>>  		regulator-max-microvolt = <3000000>;
>> +
>> +		// must be configured or SD card may fail to initialize occasionally
>> +		regulator-uv-protection-microvolt = <2700000>;
>> +
>>  		regulator-name = "vcc3v0_sd";
>>  		vin-supply = <&vcc3v3_sys>;
>>  	};
>> --
>> 2.36.1
On 2022-07-15 19:11, Robin Murphy wrote:
> On 2022-07-15 18:16, Christian Kohlschütter wrote:
>> OK, this took me a while to figure out.
>>
>> When no undervoltage limit is configured, I can reliably trigger the
>> initialization bug upon boot.
>> When the limit is set to 3.0V, it rarely occurs, but just after I send
>> the v3 patch, I was able to reproduce...
>
> Well this has to be in the running for "weirdest placebo ever"... :/
>
> All it actually seems to achieve is printing an error[1] (this is after
> all a tiny 5-pin fixed-voltage LDO regulator, not an intelligent PMIC),
> and if that makes an appreciable difference then there has to be some
> kind of weird timing condition at play. Maybe regulator_register() ends
> up turning it off and on again rapidly enough that the card sees a
> voltage brownout and glitches, and adding more delay by printing to the
> console somewhere in the middle gives it enough time to act as a proper
> power cycle with no ill effect?

...and apparently the answer is yes, it seems to be doing exactly that (see attached). But seemingly my SD cards don't mind, or maybe my T4 board happens to have more capacitance than Christian's R4S so my voltage dip isn't as bad, or both.

So it seems like the solution here might indeed simply be to remove the regulator-always-on which doesn't seem to have any reason to be here anyway. Without that, the enable stays low until the MMC driver probes and claims it, which is then massively longer than the time it takes for VCC3V0_SD to ramp down completely.

Robin.

> If you just whack something like an mdelay(500) at around that point in
> set_machine_constraints(), without the DT property, does it have the
> same effect?
>
> Robin.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/regulator/core.c#n1521
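For illustration, the suggestion to drop regulator-always-on could be tried as a board-level override instead of an edit to the shared .dtsi. This is only a sketch and was not posted in the thread: it assumes a board .dts that includes rk3399-nanopi4.dtsi, and it reuses the vcc3v0_sd label from the patch.

```dts
/* Hypothetical board-level override, not part of the posted patch. */
&vcc3v0_sd {
	/*
	 * Without this property the regulator core no longer forces the
	 * supply on at registration; the consumer (the MMC driver) enables
	 * it when it probes.
	 */
	/delete-property/ regulator-always-on;
};
```

An override of this kind keeps the experiment local to one board, which matters here because the T4 and R4S appear to behave differently.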
On 2022-07-15 20:04, Christian Kohlschütter wrote:
> Removing "regulator-always-on" has the effect that the system freezes upon reboot.

Ah, right (can we fast-forward to a world where everyone has a reliable bootloader in SPI flash or similar?). Is that more glitching, or a firmware bug not resetting the GPIOs to their default state on warm reset, I wonder.

> There may well be another bug slumbering in the codebase that is circumvented by 1. adding a delay in the code and 2. not turning the regulator off upon shutdown.

Yes, it seems suboptimal that the regulator core allows this glitch where an always-on regulator which is already on gets turned off at all, but I guess that's its own problem. In the meantime, off-on-delay-us sounds like the most likely property to bandage this locally. I'm seeing a fall time in the order of milliseconds (attached), so we'd probably want a fair chunk of that to be safe.

Robin.
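A minimal sketch of what such a local bandage might look like, assuming the vcc3v0_sd node uses the fixed-regulator binding (where off-on-delay-us is documented) and that it is overridden from a board .dts. The delay value below is a placeholder chosen for illustration, not a measured or tested figure.

```dts
/* Hypothetical override; the value is a placeholder, not a measurement. */
&vcc3v0_sd {
	/*
	 * Guard time in microseconds that must pass after the regulator
	 * is disabled before it may be enabled again.
	 */
	off-on-delay-us = <20000>;
};
```

The value would need to exceed the observed fall time of the rail by a comfortable margin, and it would have to be validated on the affected board.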
> On 15.07.2022 at 21:38, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2022-07-15 20:04, Christian Kohlschütter wrote:
>> Removing "regulator-always-on" has the effect that the system freezes upon reboot.
>
> Ah, right (can we fast-forward to a world where everyone has a reliable bootloader in SPI flash or similar?). Is that more glitching, or a firmware bug not resetting the GPIOs to their default state on warm reset, I wonder.
>
>> There may well be another bug slumbering in the codebase that is circumvented by 1. adding a delay in the code and 2. not turning the regulator off upon shutdown.
>
> Yes, it seems suboptimal that the regulator core allows this glitch where an always-on regulator which is already on gets turned off at all, but I guess that's its own problem. In the meantime, off-on-delay-us sounds like the most likely property to bandage this locally. I'm seeing a fall time in the order of milliseconds (attached), so we'd probably want a fair chunk of that to be safe.
>
> Robin. [attachment: SDS00003.png]

I think there is a way to avoid picking a delay value that may ultimately not work in all cases.
Following up with "[PATCH] regulator: core: Resolve supply name earlier to prevent double-init" [1]

Thank you so much for helping me get this far! It would be great if you'd keep following the thread.

Best,
Christian

[1] https://www.spinics.net/lists/kernel/msg4440365.html
>> In the meantime, off-on-delay-us sounds like the most likely property to bandage this locally. I'm seeing a fall time in the order of milliseconds (attached), so we'd probably want a fair chunk of that to be safe.
>>
>> Robin. [attachment: SDS00003.png]
>
> I think we have a way where there's no need to pick a delay value that may ultimately not work in all cases.
> Following up with "[PATCH] regulator: core: Resolve supply name earlier to prevent double-init" [1]
>
> [1] https://www.spinics.net/lists/kernel/msg4440365.html

@Robin, oddly enough, setting off-on-delay-us with values of up to a second (1000000 us) still results in failed inits. I hope we can find another bandage until the regulator core patch gets merged.
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
old mode 100644
new mode 100755
index 8c0ff6c96e03..669c74ce4d13
--- a/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
@@ -73,6 +73,10 @@ vcc3v0_sd: vcc3v0-sd {
 		regulator-always-on;
 		regulator-min-microvolt = <3000000>;
 		regulator-max-microvolt = <3000000>;
+
+		// must be initialized or SD card may fail to initialize / system may hang
+		regulator-uv-protection-microvolt = <2700000>;
+
 		regulator-name = "vcc3v0_sd";
 		vin-supply = <&vcc3v3_sys>;
 	};
mmc/SD-card initialization may fail on NanoPi R4S with
"mmc1: problem reading SD Status register" /
"mmc1: error -110 whilst initialising SD card"
either on cold boot or after a reboot.

Moreover, the system would also sometimes hang upon reboot.

This is prevented by setting an explicit undervoltage protection limit
for the SD-card-specific vcc3v0_sd voltage regulator.

While using a limit of 3V seems to work, an additional safety buffer should
prevent accidental tripping, preventing a system hang.

Set the undervoltage protection limit to 2.7V, which is the minimum
permissible SD card operating voltage.

Signed-off-by: Christian Kohlschütter <christian@kohlschutter.com>
---
 arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi | 4 ++++
 1 file changed, 4 insertions(+)
 mode change 100644 => 100755 arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi