Message ID | 1454519923-25230-6-git-send-email-fu.wei@linaro.org |
---|---|
State | New |
Headers | show |
Hi Timur, Thanks for your rapid feedback :-) On 4 February 2016 at 01:27, Timur Tabi <timur@codeaurora.org> wrote: > fu.wei@linaro.org wrote: >> >> +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC >> +static bool panic_enabled = true; > > > I think this should default to 'false', because IMHO, this seems like an odd yes, It make sense to make it default to 'false'. > feature. I'm not crazy about the fact that there's a Kconfig option for it > either, but I'm not going to NACK this patch. > > I personally would prefer to drop this patch, and just wait for full-blown > pre-timeout support. It feels like a debugging feature that doesn't really sorry, are you saying : using pre-timeout instead of this half timeout? But even we have pre-timeout support, pre-timeout == timeout / 2, it can not be configured without touch timeout. if you want pre-timeout != timeout / 2, we have to modify WCV in the interrupt routine. (because of the explicit watchdog refresh mechanism) Could you let me know why we need pre-timeout here ?? :-) > belong upstream. But like I said, it's just my opinion, and I won't > complain if I'm outvoted. I think this debugging feature is the purpose of the two-stage watchdog, if I understand correctly -- Best regards, Fu Wei Software Engineer Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch Ph: +86 21 61221326(direct) Ph: +86 186 2020 4684 (mobile) Room 1512, Regus One Corporate Avenue,Level 15, One Corporate Avenue,222 Hubin Road,Huangpu District, Shanghai,China 200021
Hi Timur On 4 February 2016 at 01:53, Timur Tabi <timur@codeaurora.org> wrote: > Fu Wei wrote: >> >> sorry, are you saying : using pre-timeout instead of this half timeout? >> >> But even we have pre-timeout support, pre-timeout == timeout / 2, it >> can not be configured without touch timeout. >> >> if you want pre-timeout != timeout / 2, we have to modify WCV in the >> interrupt routine. >> (because of the explicit watchdog refresh mechanism) >> >> Could you let me know why we need pre-timeout here ??:-) > > > What I meant was that if we had full-blown pre-timeout support in the > watchdog layer, then you could use that to implement the > panic-on-half-timeout feature. > > When pre-timeout is implemented, will you modify the interrupt handler to > use it? Sorry I am little confused. Actually I am taking your suggestion to avoid touching WCV in interrupt routine. So even we have pre-timeout support , it is useless for this panic-on-half-timeout feature, because pre-timeout == timeout / 2 (always). So maybe I misunderstand your suggestion, could you let me know : why we want pre-timeout here? > >>> >belong upstream. But like I said, it's just my opinion, and I won't >>> >complain if I'm outvoted. >> >> I think this debugging feature is the purpose of the two-stage >> watchdog, if I understand correctly > > > Hmmm... that make sense. I think maybe you should drop the Kconfig option, > and just have "static bool panic_enabled = false;" Also, then do this: > > if (panic_enabled) { > ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0, > pdev->name, gwdt); > if (ret) { > dev_err(dev, "unable to request IRQ %d\n", irq); > return ret; > } > } yes, agree > > That way, the interrupt handler is never registered if the command-line > parameter is not specified. > -- Best regards, Fu Wei Software Engineer Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch Ph: +86 21 61221326(direct) Ph: +86 186 2020 4684 (mobile) Room 1512, Regus One Corporate Avenue,Level 15, One Corporate Avenue,222 Hubin Road,Huangpu District, Shanghai,China 200021
On 5 February 2016 at 22:42, Guenter Roeck <linux@roeck-us.net> wrote: > On 02/05/2016 01:51 AM, Fu Wei wrote: >> >> Hi Guenter, >> >> On 4 February 2016 at 13:17, Guenter Roeck <linux@roeck-us.net> wrote: >>> >>> On 02/03/2016 03:00 PM, Fu Wei wrote: >>>> >>>> >>>> On 4 February 2016 at 02:45, Timur Tabi <timur@codeaurora.org> wrote: >>>>> >>>>> >>>>> Fu Wei wrote: >>>>>> >>>>>> >>>>>> >>>>>> As you know I have made the pre-timeout support patch, If people like >>>>>> it, i am happy to go on upstream it separately. >>>>>> >>>>>> If we want to use pre-timeout here, user only can use get_pretimeout >>>>>> and disable panic by setting pretimeout to 0 >>>>>> but user can not really set pretimeout, because "pre-timeout == >>>>>> timeout / 2 (always)". >>>>>> if user want to change pretimeout, he/she has to set_time instead. >>>>> >>>>> >>>>> >>>>> >>>>> Ok, I think patches 4 and 5 should be combined, and I think the Kconfig >>>>> entry should be removed and just use panic_enabled. >>> >>> >>> >>> Agreed. >> >> >> np, will do >> >>>> >>>> >>>> NP, will update this patchset like that , thanks :-) >>>> >>> >>> Also, if panic is enabled, the timeout needs to be adjusted accordingly >>> (to only panic after the entire timeout period has expired, not after >>> half of it). We can not panic the system after timeout / 2. >> >> >> OK, my thought is >> >> if panic is enabled : >> |--------WOR-------WS0--------WOR-------WS1 >> |------timeout------(panic)------timeout-----reset >> >> if panic is disabled . >> |--------WOR-------WS0--------WOR-------WS1 >> |---------------------timeout---------------------reset >> >> panic_enabled only can be configured when module is loaded by module >> parameter >> >> But user should know that max_timeout(panic_enable) = >> max_timeout(panic_disable) / 2 >> > > That means you'll have to update max_timeout accordingly. panic_enabled only can be configured when module is loaded, so we don't need to update it. max_timeout will only be set up in the init stage. Does it make sense ? :-) > >>> >>> I am not too happy with the parameter name (panic_enabled). How about >>> "action", to match machzwd ? >> >> >> yes, makes sense. Maybe we can do something like this: >> >> /* >> * action refers to action taken when watchdog gets WS0 >> * 0 = SKIP >> * 1 = PANIC >> * defaults to SKIP (0) >> */ >> static int action; >> module_param(action, int, 0); >> MODULE_PARM_DESC(action, "after watchdog gets WS0 interrupt, do: " >> "0 = SKIP(*) 1 = PANIC"); >> > Yes, though I would suggest to use lower case letters. yes, NP, will do , Thanks :-) > > Thanks, > Guenter > -- Best regards, Fu Wei Software Engineer Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch Ph: +86 21 61221326(direct) Ph: +86 186 2020 4684 (mobile) Room 1512, Regus One Corporate Avenue,Level 15, One Corporate Avenue,222 Hubin Road,Huangpu District, Shanghai,China 200021
diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt index 300eb4d..31641e2 100644 --- a/Documentation/watchdog/watchdog-parameters.txt +++ b/Documentation/watchdog/watchdog-parameters.txt @@ -286,6 +286,7 @@ nowayout: Watchdog cannot be stopped once started ------------------------------------------------- sbsa_gwdt: timeout: Watchdog timeout in seconds. (default 20s) +panic_enabled: Enable panic at half timeout. (default=true) nowayout: Watchdog cannot be stopped once started (default=kernel config parameter) ------------------------------------------------- diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 4ab1b05..42adfdf 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -218,6 +218,16 @@ config ARM_SBSA_WATCHDOG To compile this driver as module, choose M here: The module will be called sbsa_gwdt. +config ARM_SBSA_WATCHDOG_PANIC + bool "ARM SBSA Generic Watchdog triggers panic at the half timeout" + depends on ARM_SBSA_WATCHDOG + help + ARM SBSA Generic Watchdog will trigger panic in the first signal + (WS0) interrupt routine when the half timeout is reached. + This function can help administrator to backup the system context + info by panic console output or kdump (if supported). + But user can skip panic by setting moduleparam panic_enabled as 0. + config ASM9260_WATCHDOG tristate "Alphascale ASM9260 watchdog" depends on MACH_ASM9260 diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c index 5a2dba3..d18cf37 100644 --- a/drivers/watchdog/sbsa_gwdt.c +++ b/drivers/watchdog/sbsa_gwdt.c @@ -16,18 +16,22 @@ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * - * This SBSA Generic watchdog driver is a single stage timeout version. + * This SBSA Generic watchdog driver is a two stages version. * Since this watchdog timer has two stages, and each stage is determined * by WOR. So the timeout is (WOR * 2). - * When first timeout is reached, WS0 is triggered, the interrupt - * triggered by WS0 will be ignored, then the second watch period starts; - * when second timeout is reached, then WS1 is triggered, system reset. + * When the first stage(the half timeout) is reached, WS0 interrupt is + * triggered, at this moment the second watch period starts; + * In the WS0 interrupt routine, panic will be triggered for saving the + * system context. + * If the system is getting into trouble and cannot be reset by panic or + * restart properly by the kdump kernel(if supported), then the second + * stage (the timeout) will be reached, system will be reset by WS1. * * More details about the hardware specification of this device: * ARM DEN0029B - Server Base System Architecture (SBSA) * * SBSA GWDT: |--------WOR-------WS0--------WOR-------WS1 - * |----------------timeout----------------reset + * |--half_timeout--(panic)--half_timeout--reset * */ @@ -84,6 +88,13 @@ MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds. (>=0, default=" __MODULE_STRING(DEFAULT_TIMEOUT) ")"); +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC +static bool panic_enabled = true; +module_param(panic_enabled, bool, 0); +MODULE_PARM_DESC(panic_enabled, + "enable panic at half timeout. (default=true)"); +#endif + static bool nowayout = WATCHDOG_NOWAYOUT; module_param(nowayout, bool, S_IRUGO); MODULE_PARM_DESC(nowayout, @@ -159,6 +170,16 @@ static int sbsa_gwdt_stop(struct watchdog_device *wdd) return 0; } +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC +static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id) +{ + if (panic_enabled) + panic("SBSA Watchdog half timeout"); + + return IRQ_HANDLED; +} +#endif + static struct watchdog_info sbsa_gwdt_info = { .identity = "SBSA Generic Watchdog", .options = WDIOF_SETTIMEOUT | @@ -186,6 +207,9 @@ static int sbsa_gwdt_probe(struct platform_device *pdev) struct resource *res; u32 status; int ret; +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC + int irq; +#endif gwdt = devm_kzalloc(dev, sizeof(*gwdt), GFP_KERNEL); if (!gwdt) @@ -202,6 +226,14 @@ static int sbsa_gwdt_probe(struct platform_device *pdev) if (IS_ERR(rf_base)) return PTR_ERR(rf_base); +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC + irq = platform_get_irq(pdev, 0); + if (irq < 0) { + dev_err(dev, "unable to get ws0 interrupt.\n"); + return irq; + } +#endif + /* * Get the frequency of system counter from the cp15 interface of ARM * Generic timer. We don't need to check it, because if it returns "0", @@ -228,6 +260,14 @@ static int sbsa_gwdt_probe(struct platform_device *pdev) dev_warn(dev, "System reset by WDT.\n"); wdd->bootstatus |= WDIOF_CARDRESET; } +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC + ret = devm_request_irq(dev, irq, sbsa_gwdt_interrupt, 0, + pdev->name, gwdt); + if (ret) { + dev_err(dev, "unable to request IRQ %d\n", irq); + return ret; + } +#endif ret = watchdog_register_device(wdd); if (ret) @@ -242,6 +282,10 @@ static int sbsa_gwdt_probe(struct platform_device *pdev) dev_info(dev, "Initialized with %ds timeout @ %u Hz%s\n", wdd->timeout, gwdt->clk, status & SBSA_GWDT_WCS_EN ? " [enabled]" : ""); +#ifdef CONFIG_ARM_SBSA_WATCHDOG_PANIC + dev_info(dev, "Half timeout panic %s.\n", + panic_enabled ? "enabled" : "disabled"); +#endif return 0; }