Message ID | 20200312114034.20016-1-rasmus.villemoes@prevas.dk |
---|---|
State | New |
Headers | show |
Series | watchdog: allow overriding the rate-limiting logic | expand |
Hi Rasmus, (added Christophe to Cc) On 12.03.20 12:40, Rasmus Villemoes wrote: > On the MPC8309-derived board I'm currently working on, the current > rate-limiting logic in the generic watchdog_reset function poses two > problems: > > First, the hard-coded limit of 1000ms is too large for the > external GPIO-triggered watchdog we have. I remember a discussion a few weeks ago with Christophe, where you pointed out, that there is already the official "hw_margin_ms" DT property for this delay here. Why don't you introduce it here so that all boards can use it as well? > Second, in the SPL, the decrementer interrupt is not enabled, so the > get_timer() call calls always returns 0, so wdt_reset() is never > actually called. Enabling that interrupt (i.e. calling > interrupt_init() somewhere in my early board code) is a bit > problematic, since U-Boot proper gets loaded to address 0, hence > overwriting exception vectors - and the reason I even need to care > about the watchdog in the SPL is that the signature verification takes > a rather long time, so just disabling interrupts before loading U-Boot > proper doesn't work. > > Both problems can be solved by allowing the board to override the > rate-limiting logic. For example, in my case I will implement the > function in terms of get_ticks() (i.e. the time base registers on > PPC) which do not depend on interrupts, and use a threshold of > gd->bus_clk / (4*16) ticks (~62ms). I would assume that you might run into multiple issues, when your timer infrastructure is not working correctly in SPL. Can't you instead "fix" this by using this get_ticks() option for the get_timer() functions in SPL so that you can use the common functions here and in all other places in SPL as well? > Signed-off-by: Rasmus Villemoes <rasmus.villemoes at prevas.dk> > --- > > This is on top of https://patchwork.ozlabs.org/patch/1242772/, but > it's obviously trivial to do on master instead. > > drivers/watchdog/wdt-uclass.c | 21 ++++++++++++++------- > 1 file changed, 14 insertions(+), 7 deletions(-) > > diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c > index 309a0e9c5b..ad53e86b80 100644 > --- a/drivers/watchdog/wdt-uclass.c > +++ b/drivers/watchdog/wdt-uclass.c > @@ -67,6 +67,19 @@ int wdt_expire_now(struct udevice *dev, ulong flags) > } > > #if defined(CONFIG_WATCHDOG) > +__weak int watchdog_reset_ratelimit(void) > +{ > + static ulong next_reset; > + ulong now; > + > + now = get_timer(0); > + if (time_after(now, next_reset)) { > + next_reset = now + 1000; /* reset every 1000ms */ > + return 1; > + } > + return 0; > +} > + > /* > * Called by macro WATCHDOG_RESET. This function be called *very* early, > * so we need to make sure, that the watchdog driver is ready before using > @@ -74,19 +87,13 @@ int wdt_expire_now(struct udevice *dev, ulong flags) > */ > void watchdog_reset(void) > { > - static ulong next_reset; > - ulong now; > - > /* Exit if GD is not ready or watchdog is not initialized yet */ > if (!gd || !(gd->flags & GD_FLG_WDT_READY)) > return; > > /* Do not reset the watchdog too often */ > - now = get_timer(0); > - if (time_after(now, next_reset)) { > - next_reset = now + 1000; /* reset every 1000ms */ > + if (watchdog_reset_ratelimit()) > wdt_reset(gd->watchdog_dev); > - } > } > #endif > > Thanks, Stefan
On 12/03/2020 12.58, Stefan Roese wrote: > Hi Rasmus, > > (added Christophe to Cc) > > On 12.03.20 12:40, Rasmus Villemoes wrote: >> On the MPC8309-derived board I'm currently working on, the current >> rate-limiting logic in the generic watchdog_reset function poses two >> problems: >> >> First, the hard-coded limit of 1000ms is too large for the >> external GPIO-triggered watchdog we have. > > I remember a discussion a few weeks ago with Christophe, where you > pointed out, that there is already the official "hw_margin_ms" DT > property for this delay here. Why don't you introduce it here so that > all boards can use it as well? Well, I considered that. Reading the value is probably just adding a dev_read_u32_default(..., 1000) next to the one reading the timeout-sec property in initr_watchdog(). But what is a good place to stash the result? It really belongs with the device, but as long as only one is supported I suppose one could move initr_watchdog to wdt-uclass.c and use a static variable in that file. I can certainly do that. That leaves the other problem. >> Second, in the SPL, the decrementer interrupt is not enabled, so the >> get_timer() call calls always returns 0, so wdt_reset() is never >> actually called. Enabling that interrupt (i.e. calling >> interrupt_init() somewhere in my early board code) is a bit >> problematic, since U-Boot proper gets loaded to address 0, hence >> overwriting exception vectors - and the reason I even need to care >> about the watchdog in the SPL is that the signature verification takes >> a rather long time, so just disabling interrupts before loading U-Boot >> proper doesn't work. >> >> Both problems can be solved by allowing the board to override the >> rate-limiting logic. For example, in my case I will implement the >> function in terms of get_ticks() (i.e. the time base registers on >> PPC) which do not depend on interrupts, and use a threshold of >> gd->bus_clk / (4*16) ticks (~62ms). > > I would assume that you might run into multiple issues, when your timer > infrastructure is not working correctly in SPL. None so far, except for the ratelimiting in terms of get_timer() not working. > Can't you instead "fix" > this by using this get_ticks() option for the get_timer() functions in > SPL so that you can use the common functions here and in all other > places in SPL as well? I don't understand what you mean. On PPC, get_timer(0) just returns a static variable which is incremented from the timer_interrupt(). When that interrupt is not enabled, that variable is always 0. I can enable the timer interrupt and get the time (in terms of get_timer) going, but as I wrote, I would have to disable that interrupt again before actually loading U-Boot [you can probably imagine the fun of debugging what happens when one overwrites the exception vectors], so time would stop passing as far as get_timer() is concerned, which means that again the watchdog would no longer be serviced. So if there is to be any ratelimiting at all, it really has to be based on a time that does tick without relying on interrupts. Rasmus
Some watchdogs must be reset more often than the once-per-second ratelimit used by the generic watchdog_reset function in wdt-uclass.c. There's precedent (from the gpio-wdt driver in linux) for using a property called hw_margin_ms to let the device tree tell the driver how often the device needs resetting. So use that generically. No change in default behaviour. On top of https://patchwork.ozlabs.org/patch/1242772/ . Stefan, something like this? That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way. Rasmus Villemoes (3): watchdog: remove stale ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS from wdt.h watchdog: move initr_watchdog() to wdt-uclass.c watchdog: honour hw_margin_ms DT property drivers/watchdog/wdt-uclass.c | 43 ++++++++++++++++++++++++++++++++++- include/wdt.h | 38 +------------------------------ 2 files changed, 43 insertions(+), 38 deletions(-)
On 13.03.20 17:04, Rasmus Villemoes wrote: > Some watchdogs must be reset more often than the once-per-second > ratelimit used by the generic watchdog_reset function in > wdt-uclass.c. There's precedent (from the gpio-wdt driver in linux) > for using a property called hw_margin_ms to let the device tree tell > the driver how often the device needs resetting. So use that > generically. No change in default behaviour. > > On top of https://patchwork.ozlabs.org/patch/1242772/ . > > Stefan, something like this? Yes, thanks for looking into this. > That at least solves half my problems and > might be useful to others as well. Then I'll have to figure out the > time-stands-still problem in some other way. If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated. Thanks, Stefan
On 14/03/2020 13.04, Stefan Roese wrote: > On 13.03.20 17:04, Rasmus Villemoes wrote: >> That at least solves half my problems and >> might be useful to others as well. Then I'll have to figure out the >> time-stands-still problem in some other way. > > If its too hard to enable interrupts in SPL for you or to provide some > other means of a working get_timer() API, then we needto find another > solution. You started with this weak function, which of course works. > What other options are there? Adding a callback mechanism to register > platform specific callback functions? Even though this might get a > little bit too complicated. Now that I dig a bit more into this, I seem to remember that we actually also had problems in U-Boot proper when loading a compressed kernel, so for now we're using an uncompressed kernel in our FIT images. I will have to re-investigate, but it now occurs to me that it might be due to the fact that interrupts get disabled during bootm (which makes sense, the same reason I stated previously of interrupt vectors about to be overwritten), so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored. So it may be necessary to have some wdt_ratelimit_disable() hook that can be called from bootm_disable_interrupts() and e.g. some board-specific SPL code. I'll do some experiments and figure out if I do indeed need such a hook. Rasmus
diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 309a0e9c5b..ad53e86b80 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -67,6 +67,19 @@ int wdt_expire_now(struct udevice *dev, ulong flags) } #if defined(CONFIG_WATCHDOG) +__weak int watchdog_reset_ratelimit(void) +{ + static ulong next_reset; + ulong now; + + now = get_timer(0); + if (time_after(now, next_reset)) { + next_reset = now + 1000; /* reset every 1000ms */ + return 1; + } + return 0; +} + /* * Called by macro WATCHDOG_RESET. This function be called *very* early, * so we need to make sure, that the watchdog driver is ready before using @@ -74,19 +87,13 @@ int wdt_expire_now(struct udevice *dev, ulong flags) */ void watchdog_reset(void) { - static ulong next_reset; - ulong now; - /* Exit if GD is not ready or watchdog is not initialized yet */ if (!gd || !(gd->flags & GD_FLG_WDT_READY)) return; /* Do not reset the watchdog too often */ - now = get_timer(0); - if (time_after(now, next_reset)) { - next_reset = now + 1000; /* reset every 1000ms */ + if (watchdog_reset_ratelimit()) wdt_reset(gd->watchdog_dev); - } } #endif
On the MPC8309-derived board I'm currently working on, the current rate-limiting logic in the generic watchdog_reset function poses two problems: First, the hard-coded limit of 1000ms is too large for the external GPIO-triggered watchdog we have. Second, in the SPL, the decrementer interrupt is not enabled, so the get_timer() call calls always returns 0, so wdt_reset() is never actually called. Enabling that interrupt (i.e. calling interrupt_init() somewhere in my early board code) is a bit problematic, since U-Boot proper gets loaded to address 0, hence overwriting exception vectors - and the reason I even need to care about the watchdog in the SPL is that the signature verification takes a rather long time, so just disabling interrupts before loading U-Boot proper doesn't work. Both problems can be solved by allowing the board to override the rate-limiting logic. For example, in my case I will implement the function in terms of get_ticks() (i.e. the time base registers on PPC) which do not depend on interrupts, and use a threshold of gd->bus_clk / (4*16) ticks (~62ms). Signed-off-by: Rasmus Villemoes <rasmus.villemoes at prevas.dk> --- This is on top of https://patchwork.ozlabs.org/patch/1242772/, but it's obviously trivial to do on master instead. drivers/watchdog/wdt-uclass.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)