Message ID | 20230323083312.199189-1-m.szyprowski@samsung.com |
---|---|
State | New |
Series | regulator: wm8994: Use PROBE_FORCE_SYNCHRONOUS |
On Thu, Mar 23, 2023 at 09:33:12AM +0100, Marek Szyprowski wrote:
> Restore synchronous probing for wm8994 regulators because otherwise the
> sound device is never initialized on Exynos5250-based Arndale board.
>
> Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>  drivers/regulator/wm8994-regulator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
> index 8921051a00e9..2946db448aec 100644
> --- a/drivers/regulator/wm8994-regulator.c
> +++ b/drivers/regulator/wm8994-regulator.c
> @@ -227,7 +227,7 @@ static struct platform_driver wm8994_ldo_driver = {
>  	.probe = wm8994_ldo_probe,
>  	.driver = {
>  		.name = "wm8994-ldo",
> -		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		.probe_type = PROBE_FORCE_SYNCHRONOUS,
>  	},
>  };

Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>

Yes, there seems to be a wider problem with these complex CODECs
that have an internal LDO. Changing to async probe means the
internal LDO driver doesn't probe before the code in the main MFD
carries on, which means the regulator framework finds no driver
and swaps in the dummy. Which means the CODEC never powers up.

I think these basically have to be forced sync, I will do a patch
to update the other devices working like this.

Thanks,
Charles
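The ordering problem described in this reply can be illustrated with a small userspace model. This is only a sketch and not kernel code: every `toy_*` name is hypothetical, the "registry" stands in for the regulator framework, and async probing is modelled simply as the child's registration happening after the parent's lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model, not kernel code: all names below are hypothetical. */
struct toy_regulator {
	const char *name;
	bool is_dummy;	/* a dummy accepts requests but switches no real supply */
};

static struct toy_regulator toy_dummy = { "dummy", true };

#define TOY_MAX_REGULATORS 4
static struct toy_regulator *toy_registered[TOY_MAX_REGULATORS];
static size_t toy_count;

/* Called from the (toy) LDO driver's probe once it actually runs. */
static void toy_regulator_register(struct toy_regulator *r)
{
	assert(toy_count < TOY_MAX_REGULATORS);
	toy_registered[toy_count++] = r;
}

/* Lookup by name; falls back to the dummy when no provider is registered. */
static struct toy_regulator *toy_regulator_get(const char *name)
{
	for (size_t i = 0; i < toy_count; i++)
		if (strcmp(toy_registered[i]->name, name) == 0)
			return toy_registered[i];
	return &toy_dummy;
}

/*
 * Model of the parent probe: add the LDO child, then request the supply.
 * With sync_probe the child's probe has run before the parent continues;
 * otherwise the child's probe is still pending when the lookup happens.
 */
static bool toy_parent_gets_real_supply(bool sync_probe)
{
	static struct toy_regulator ldo = { "LDO1", false };
	struct toy_regulator *supply;

	toy_count = 0;				/* start from an empty registry */
	if (sync_probe)
		toy_regulator_register(&ldo);	/* child probed first */

	supply = toy_regulator_get("LDO1");

	if (!sync_probe)
		toy_regulator_register(&ldo);	/* async probe lands too late */

	return !supply->is_dummy;
}
```

In this model `toy_parent_gets_real_supply(true)` hands back the real LDO, while `toy_parent_gets_real_supply(false)` hands back the dummy, which is the situation in which the CODEC never powers up.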
Hi,

On Thu, Mar 23, 2023 at 4:40 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> On Thu, Mar 23, 2023 at 09:33:12AM +0100, Marek Szyprowski wrote:
> > Restore synchronous probing for wm8994 regulators because otherwise the
> > sound device is never initialized on Exynos5250-based Arndale board.
> >
> > Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > ---
> >  drivers/regulator/wm8994-regulator.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
> > index 8921051a00e9..2946db448aec 100644
> > --- a/drivers/regulator/wm8994-regulator.c
> > +++ b/drivers/regulator/wm8994-regulator.c
> > @@ -227,7 +227,7 @@ static struct platform_driver wm8994_ldo_driver = {
> >  	.probe = wm8994_ldo_probe,
> >  	.driver = {
> >  		.name = "wm8994-ldo",
> > -		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > +		.probe_type = PROBE_FORCE_SYNCHRONOUS,
> >  	},
> >  };
>
> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
>
> Yes, there seems to be a wider problem with these complex CODECs
> that have an internal LDO. Changing to async probe means the
> internal LDO driver doesn't probe before the code in the main MFD
> carries on, which means the regulator framework finds no driver
> and swaps in the dummy. Which means the CODEC never powers up.
>
> I think these basically have to be forced sync, I will do a patch
> to update the other devices working like this.

Sorry for the breakage and thank you for the fix.

No question that a quick switch back to PROBE_FORCE_SYNCHRONOUS is the
right first step here, but I'm wondering if there are any further
steps we want to take.

If my analysis is correct, there's still potential to run into similar
problems even with PROBE_FORCE_SYNCHRONOUS. I don't think that
mfd_add_devices() is _guaranteed_ to finish probing all the
sub-devices by the time it returns. Specifically, imagine that
wm8994_ldo_probe() tries to get a GPIO that the system hasn't made
available yet. Potentially the system could return -EPROBE_DEFER there
and that would put the LDO on the deferred probe list. Doing so won't
cause mfd_add_devices() to fail. Now we'll end up with a dummy
regulator again. Admittedly in most cases GPIO providers probe really
early and so this argument is a bit of a stretch, but I guess the
point is that there's no official guarantee that mfd_add_devices()
will finish probing all sub-devices so it's not ideal to rely on.
Also, other drivers with a similar pattern might have other reasons to
-EPROBE_DEFER.

These types of issues are the same ones I faced with DP AUX bus and
the solutions were never super amazing...

One solution we ended up with for the DP AUX bus was to add a
"done_probing" callback argument to devm_of_dp_aux_populate_bus().
This allowed the parent to be called back when all the children were
done probing. This implementation is easier for DP AUX bus than it
would be in the generic MFD case, but conceivably it would be possible
there too?

Another possible solution is to somehow break the driver up into more
sub-drivers. Essentially, you have a top-level "wm8994" that's not
much more than a container. Then you create a new sub-device and
relegate anything that needs the regulators to the new sub-device. The
new sub-device can just -EPROBE_DEFER until it detects that the
regulators have finished probing. I ended up doing stuff like this for
"ti-sn65dsi86.c" using the Linux aux bus (not to be confused with the
DP Aux bus) and it was a bit odd but worked OK.

-Doug
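The "done_probing" idea mentioned above can be sketched as a toy userspace model. This does not reproduce devm_of_dp_aux_populate_bus() or any MFD API: all `toy_*` names are hypothetical, and the only thing it tries to show is that the parent's "power up" step runs from a callback once every child has bound, even when a child defers a couple of times first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
#define TOY_EPROBE_DEFER 517	/* same numeric value the kernel uses */

struct toy_child {
	int defers_left;	/* times probe defers before it succeeds */
	bool probed;
};

struct toy_parent {
	struct toy_child *children;
	size_t n_children;
	void (*done_probing)(struct toy_parent *parent);
	bool powered_up;	/* set by the callback in this sketch */
};

static int toy_child_probe(struct toy_child *c)
{
	if (c->defers_left > 0) {
		c->defers_left--;
		return -TOY_EPROBE_DEFER;
	}
	c->probed = true;
	return 0;
}

/* One pass over the unbound children; returns true once all are bound. */
static bool toy_probe_pass(struct toy_parent *p)
{
	bool all_done = true;

	assert(p != NULL && p->children != NULL);
	for (size_t i = 0; i < p->n_children; i++) {
		if (!p->children[i].probed &&
		    toy_child_probe(&p->children[i]) != 0)
			all_done = false;
	}
	if (all_done && p->done_probing)
		p->done_probing(p);	/* parent continues only now */
	return all_done;
}

/* The work that must wait for the regulators, e.g. enabling supplies. */
static void toy_power_up(struct toy_parent *p)
{
	p->powered_up = true;
}

/* Repeats passes; returns how many were needed, or -1 if the cap is hit. */
static int toy_run(struct toy_parent *p, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++)
		if (toy_probe_pass(p))
			return pass;
	return -1;
}
```

A parent set up with `toy_power_up` as its `done_probing` callback stays unpowered while any child is still deferring, and the callback fires on the first pass where all children are bound.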
On Thu, Mar 23, 2023 at 09:53:18AM -0700, Doug Anderson wrote:
> On Thu, Mar 23, 2023 at 4:40 AM Charles Keepax
> If my analysis is correct, there's still potential to run into similar
> problems even with PROBE_FORCE_SYNCHRONOUS. I don't think that
> mfd_add_devices() is _guaranteed_ to finish probing all the
> sub-devices by the time it returns. Specifically, imagine that
> wm8994_ldo_probe() tries to get a GPIO that the system hasn't made
> available yet. Potentially the system could return -EPROBE_DEFER there
> and that would put the LDO on the deferred probe list. Doing so won't
> cause mfd_add_devices() to fail. Now we'll end up with a dummy
> regulator again. Admittedly in most cases GPIO providers probe really
> early and so this argument is a bit of a stretch, but I guess the
> point is that there's no official guarantee that mfd_add_devices()
> will finish probing all sub-devices so it's not ideal to rely on.
> Also, other drivers with a similar pattern might have other reasons to
> -EPROBE_DEFER.
>
> These types of issues are the same ones I faced with DP AUX bus and
> the solutions were never super amazing...
>
> One solution we ended up with for the DP AUX bus was to add a
> "done_probing" callback argument to devm_of_dp_aux_populate_bus().
> This allowed the parent to be called back when all the children were
> done probing. This implementation is easier for DP AUX bus than it
> would be in the generic MFD case, but conceivably it would be possible
> there too?
>
> Another possible solution is to somehow break the driver up into more
> sub-drivers. Essentially, you have a top-level "wm8994" that's not
> much more than a container. Then you create a new sub-device and
> relegate anything that needs the regulators to the new sub-device. The
> new sub-device can just -EPROBE_DEFER until it detects that the
> regulators have finished probing. I ended up doing stuff like this for
> "ti-sn65dsi86.c" using the Linux aux bus (not to be confused with the
> DP Aux bus) and it was a bit odd but worked OK.

Yes I believe you are correct, there is still an issue here,
indeed a quick test suggests I can still cause this by forcing a
probe defer in the regulator driver.

I think really the best place to look at this would be at the
regulator level. It is fine if mfd_add_devices passes, the problem
really is that the regulator core doesn't realise the regulator is
going to be arriving, and thus returns a dummy regulator, rather
than returning EPROBE_DEFER. If it did the MFD driver would probe
defer at the point of requesting the regulator, which would all
make sense.

I will see if I can find some time to think about that further
but very unlikely to happen this week.

Thanks,
Charles
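The behaviour suggested here, deferring instead of substituting a dummy when a provider is known to be on its way, can be sketched as follows. This is a toy userspace model under a loud assumption: the "announced" state is invented for illustration, since the thread does not say how the regulator core would learn that a provider is coming. None of the `toy_*` names are regulator core APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; names are hypothetical, not the regulator core API. */
#define TOY_EPROBE_DEFER 517	/* same numeric value the kernel uses */

enum toy_supply_state {
	TOY_SUPPLY_UNKNOWN,	/* nobody has claimed to provide it */
	TOY_SUPPLY_ANNOUNCED,	/* assumed: something said a provider will arrive */
	TOY_SUPPLY_REGISTERED,	/* the provider driver has probed */
};

/*
 * Returns 0 with *is_dummy set when a supply (real or dummy) is handed out,
 * or -TOY_EPROBE_DEFER when a real provider is expected but not yet bound.
 */
static int toy_supply_lookup(enum toy_supply_state state, bool *is_dummy)
{
	assert(is_dummy != NULL);

	switch (state) {
	case TOY_SUPPLY_REGISTERED:
		*is_dummy = false;
		return 0;
	case TOY_SUPPLY_ANNOUNCED:
		return -TOY_EPROBE_DEFER;	/* wait, do not swap in a dummy */
	case TOY_SUPPLY_UNKNOWN:
	default:
		*is_dummy = true;		/* dummy fallback in this model */
		return 0;
	}
}
```

The consumer in this model would see a deferral only in the `TOY_SUPPLY_ANNOUNCED` case and would keep getting a dummy for supplies nobody has claimed.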
Hi,

On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> I think really the best place to look at this would be at the
> regulator level. It is fine if mfd_add_devices passes, the problem
> really is that the regulator core doesn't realise the regulator is
> going to be arriving, and thus returns a dummy regulator, rather
> than returning EPROBE_DEFER. If it did the MFD driver would probe
> defer at the point of requesting the regulator, which would all
> make sense.

I think something like your suggestion could be made to work for the
"microphone" supply in the arizona MFD, but not for the others looked
at here.

The problem is that if the MFD driver gets -EPROBE_DEFER then it will
go through its error handling path and call mfd_remove_devices().
That'll remove the sub-device providing the regulator. If you try
again, you'll just do the same. :-)

Specifically in wm8994 after we've populated the regulator sub-devices
then we turn them on and start talking to the device.

I think the two options I have could both work for wm8994's case:
either add some type of "my children have done probing" to MFD and
move the turning on of regulators / talking to devices there, or add
another stub-device and add it there. ;-)

-Doug
On Thu, Mar 23, 2023 at 05:49:28PM +0000, Mark Brown wrote:
> On Thu, Mar 23, 2023 at 05:45:31PM +0000, Charles Keepax wrote:
>
> > I think really the best place to look at this would be at the
> > regulator level. It is fine if mfd_add_devices passes, the problem
> > really is that the regulator core doesn't realise the regulator is
> > going to be arriving, and thus returns a dummy regulator, rather
> > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > defer at the point of requesting the regulator, which would all
> > make sense.
>
> You need the MFD to tell the regulator subsystem that there's a
> regulator bound there, or to force all the users to explicitly do the
> mapping of the regulator in their firmwares (which isn't really a
> viable approach).

Yeah changing the firmware situation is definitely not a goer. I
need to just clarify in my head exactly what is missing, with
respect to knowing that the regulator exists.

Thanks,
Charles
On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> Hi,
>
> On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> <ckeepax@opensource.cirrus.com> wrote:
> >
> > I think really the best place to look at this would be at the
> > regulator level. It is fine if mfd_add_devices passes, the problem
> > really is that the regulator core doesn't realise the regulator is
> > going to be arriving, and thus returns a dummy regulator, rather
> > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > defer at the point of requesting the regulator, which would all
> > make sense.
>
> I think something like your suggestion could be made to work for the
> "microphone" supply in the arizona MFD, but not for the others looked
> at here.
>
> The problem is that if the MFD driver gets -EPROBE_DEFER then it will
> go through its error handling path and call mfd_remove_devices().
> That'll remove the sub-device providing the regulator. If you try
> again, you'll just do the same. :-)
>
> Specifically in wm8994 after we've populated the regulator sub-devices
> then we turn them on and start talking to the device.
>
> I think the two options I have could both work for wm8994's case:
> either add some type of "my children have done probing" to MFD and
> move the turning on of regulators / talking to devices there, or add
> another stub-device and add it there. ;-)

Is this true if we keep the regulator as sync though? Yes it will
remove the children but when it re-adds them the reason that the
regulator probe deferred in the first place will hopefully be
removed. So it will now fully probe in path.

Thanks,
Charles
Hi,

On Fri, Mar 24, 2023 at 2:23 AM Charles Keepax
<ckeepax@opensource.cirrus.com> wrote:
>
> On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> > <ckeepax@opensource.cirrus.com> wrote:
> > >
> > > I think really the best place to look at this would be at the
> > > regulator level. It is fine if mfd_add_devices passes, the problem
> > > really is that the regulator core doesn't realise the regulator is
> > > going to be arriving, and thus returns a dummy regulator, rather
> > > than returning EPROBE_DEFER. If it did the MFD driver would probe
> > > defer at the point of requesting the regulator, which would all
> > > make sense.
> >
> > I think something like your suggestion could be made to work for the
> > "microphone" supply in the arizona MFD, but not for the others looked
> > at here.
> >
> > The problem is that if the MFD driver gets -EPROBE_DEFER then it will
> > go through its error handling path and call mfd_remove_devices().
> > That'll remove the sub-device providing the regulator. If you try
> > again, you'll just do the same. :-)
> >
> > Specifically in wm8994 after we've populated the regulator sub-devices
> > then we turn them on and start talking to the device.
> >
> > I think the two options I have could both work for wm8994's case:
> > either add some type of "my children have done probing" to MFD and
> > move the turning on of regulators / talking to devices there, or add
> > another stub-device and add it there. ;-)
>
> Is this true if we keep the regulator as sync though? Yes it will
> remove the children but when it re-adds them the reason that the
> regulator probe deferred in the first place will hopefully be
> removed. So it will now fully probe in path.

Ah, I see. So you keep it as synchronous _and_ make it so that it
won't get a dummy. Yeah, I was trying to brainstorm ways we could fix
it if we kept the regulator async. If we keep it as sync and fix the
dummy problem then, indeed, it should work as you say.

I've spent a whole lot of time dealing with similar issues, though,
and I think there is actually another related concern with that design
(where the regulator is synchronous). ;-) If the child device ends up
depending on a resource that _never_ shows up then you can get into an
infinite probe deferral loop at bootup. If it works the way it did
last time I analyzed similar code:

1. Your MFD starts to probe and kicks off probing of its children
(including the regulator).

2. Your regulator tries to probe and tries to get a resource that will
never exist, maybe because of a bug in dts or maybe because it won't
show up until userspace loads a module. It returns -EPROBE_DEFER.

3. The MFD realizes that the regulator didn't come up and it also
returns -EPROBE_DEFER after removing all its children.

4. The fact that we added/removed devices in the above means that the
kernel thinks it should retry probes of previously deferred devices
because, maybe, the device showed up that everyone was waiting for.
Thus, we go back to step #1.

...the system can actually loop forever in steps #1 - #4 and we ended
up in that situation several times during development with similarly
architected systems.

-Doug
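Steps #1 to #4 above can be sketched as a toy userspace model. It is a simplification and not the driver core's deferred-probe machinery: all `toy_*` names are hypothetical, "topology changed" stands in for devices being added or removed, and `max_attempts` exists only so that the sketch terminates where the modelled loop would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the four steps; names are hypothetical. */
#define TOY_EPROBE_DEFER 517	/* same numeric value the kernel uses */

/* Step 2: the child regulator needs a resource that may never appear. */
static int toy_regulator_probe(bool resource_exists)
{
	return resource_exists ? 0 : -TOY_EPROBE_DEFER;
}

/*
 * Steps 1 and 3: the parent adds its children, and if the regulator did
 * not bind it removes them again and defers. *topology_changed reports
 * that devices were added or removed during this attempt.
 */
static int toy_mfd_probe(bool resource_exists, bool *topology_changed)
{
	assert(topology_changed != NULL);

	*topology_changed = true;		/* children were added */
	if (toy_regulator_probe(resource_exists) != 0)
		return -TOY_EPROBE_DEFER;	/* children removed again */
	return 0;
}

/*
 * Step 4: any topology change re-triggers the deferred list. Returns the
 * number of parent probe attempts made before the loop stops or the cap
 * is reached.
 */
static int toy_deferred_probe_attempts(bool resource_exists, int max_attempts)
{
	int attempts = 0;
	bool retrigger = true;

	while (retrigger && attempts < max_attempts) {
		bool changed = false;
		int ret = toy_mfd_probe(resource_exists, &changed);

		attempts++;
		/* Retry only if the probe deferred and something changed. */
		retrigger = (ret == -TOY_EPROBE_DEFER) && changed;
	}
	return attempts;
}
```

When the resource exists the parent binds on the first attempt. When it never appears, every attempt both defers and changes the topology, so the model keeps retrying until the artificial cap stops it.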
On Fri, Mar 24, 2023 at 08:06:15AM -0700, Doug Anderson wrote:
> On Fri, Mar 24, 2023 at 2:23 AM Charles Keepax
> > On Thu, Mar 23, 2023 at 11:00:32AM -0700, Doug Anderson wrote:
> > > On Thu, Mar 23, 2023 at 10:45 AM Charles Keepax
> I've spent a whole lot of time dealing with similar issues, though,
> and I think there is actually another related concern with that design
> (where the regulator is synchronous). ;-) If the child device ends up
> depending on a resource that _never_ shows up then you can get into an
> infinite probe deferral loop at bootup. If it works the way it did
> last time I analyzed similar code:
>
> 1. Your MFD starts to probe and kicks off probing of its children
> (including the regulator).
>
> 2. Your regulator tries to probe and tries to get a resource that will
> never exist, maybe because of a bug in dts or maybe because it won't
> show up until userspace loads a module. It returns -EPROBE_DEFER.
>
> 3. The MFD realizes that the regulator didn't come up and it also
> returns -EPROBE_DEFER after removing all its children.
>
> 4. The fact that we added/removed devices in the above means that the
> kernel thinks it should retry probes of previously deferred devices
> because, maybe, the device showed up that everyone was waiting for.
> Thus, we go back to step #1.
>
> ...the system can actually loop forever in steps #1 - #4 and we ended
> up in that situation several times during development with similarly
> architected systems.

Hmm... shoot yes you are correct that would indeed happen.

Thanks,
Charles
diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c
index 8921051a00e9..2946db448aec 100644
--- a/drivers/regulator/wm8994-regulator.c
+++ b/drivers/regulator/wm8994-regulator.c
@@ -227,7 +227,7 @@ static struct platform_driver wm8994_ldo_driver = {
 	.probe = wm8994_ldo_probe,
 	.driver = {
 		.name = "wm8994-ldo",
-		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		.probe_type = PROBE_FORCE_SYNCHRONOUS,
 	},
 };
Restore synchronous probing for wm8994 regulators because otherwise the
sound device is never initialized on Exynos5250-based Arndale board.

Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/regulator/wm8994-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)