Message ID | 1406196811-5384-3-git-send-email-drjones@redhat.com |
---|---|
State | New |
Headers | show |
Il 24/07/2014 12:13, Andrew Jones ha scritto: > > The running kernel still has the ability to enable/disable at any > time with /proc/sys/kernel/nmi_watchdog us usual. However even > when the default has been overridden /proc/sys/kernel/nmi_watchdog > will initially show '1'. To truly turn it on one must disable/enable > it, i.e. > echo 0 > /proc/sys/kernel/nmi_watchdog > echo 1 > /proc/sys/kernel/nmi_watchdog Why is it hard to make this show the right value? :) Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
> ----- Original Message ----- > From: "Paolo Bonzini" <pbonzini@redhat.com> > To: "Andrew Jones" <drjones@redhat.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org > Cc: uobergfe@redhat.com, dzickus@redhat.com, akpm@linux-foundation.org, mingo@redhat.com > Sent: Thursday, July 24, 2014 12:46:11 PM > Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default > >Il 24/07/2014 12:13, Andrew Jones ha scritto: >> >> The running kernel still has the ability to enable/disable at any >> time with /proc/sys/kernel/nmi_watchdog us usual. However even >> when the default has been overridden /proc/sys/kernel/nmi_watchdog >> will initially show '1'. To truly turn it on one must disable/enable >> it, i.e. >> echo 0 > /proc/sys/kernel/nmi_watchdog >> echo 1 > /proc/sys/kernel/nmi_watchdog > > Why is it hard to make this show the right value? :) > > Paolo 'echo 1 > /proc/sys/kernel/nmi_watchdog' enables both - hard lockup and soft lockup detection. watchdog_enable_all_cpus() starts a 'watchdog/N' thread for each CPU. If the kernel runs on a bare metal system where the processor does not have a PMU, or when perf_event_create_kernel_counter() returns failure to watchdog_nmi_enable(), or when the kernel runs as a guest on a hypervisor that does not emulate a PMU, then the 'watchdog/N' threads are still active for soft lockup detection. Patch 2/3 essentially makes watchdog_nmi_enable() behave in the same way as if -ENOENT would have been returned by perf_event_create_kernel_counter(). This is then reported via a console message. NMI watchdog: disabled (cpu0): hardware events not enabled It's hard say what _is_ 'the right value' (because lockup detection is then enabled 'partially'), regardless of whether patch 2/3 is applied or not. Regards, Uli -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Il 24/07/2014 13:18, Ulrich Obergfell ha scritto: >>> >> The running kernel still has the ability to enable/disable at any >>> >> time with /proc/sys/kernel/nmi_watchdog us usual. However even >>> >> when the default has been overridden /proc/sys/kernel/nmi_watchdog >>> >> will initially show '1'. To truly turn it on one must disable/enable >>> >> it, i.e. >>> >> echo 0 > /proc/sys/kernel/nmi_watchdog >>> >> echo 1 > /proc/sys/kernel/nmi_watchdog >> > >> > Why is it hard to make this show the right value? :) >> > >> > Paolo > 'echo 1 > /proc/sys/kernel/nmi_watchdog' enables both - hard lockup and > soft lockup detection. watchdog_enable_all_cpus() starts a 'watchdog/N' > thread for each CPU. If the kernel runs on a bare metal system where the > processor does not have a PMU, or when perf_event_create_kernel_counter() > returns failure to watchdog_nmi_enable(), or when the kernel runs as a > guest on a hypervisor that does not emulate a PMU, then the 'watchdog/N' > threads are still active for soft lockup detection. Patch 2/3 essentially > makes watchdog_nmi_enable() behave in the same way as if -ENOENT would > have been returned by perf_event_create_kernel_counter(). This is then > reported via a console message. > > NMI watchdog: disabled (cpu0): hardware events not enabled > > It's hard say what _is_ 'the right value' (because lockup detection is > then enabled 'partially'), regardless of whether patch 2/3 is applied > or not. But this means that it is not possible to re-enable softlockup detection only. I think that should be the effect of echo 0 + echo 1, if hardlockup detection was disabled by either the command line or patch 3. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
>----- Original Message ----- >From: "Paolo Bonzini" <pbonzini@redhat.com> >To: "Ulrich Obergfell" <uobergfe@redhat.com> >Cc: "Andrew Jones" <drjones@redhat.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, dzickus@redhat.com, akpm@linux-foundation.org, >mingo@redhat.com >Sent: Thursday, July 24, 2014 1:26:40 PM >Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default > >Il 24/07/2014 13:18, Ulrich Obergfell ha scritto: >>>> >> The running kernel still has the ability to enable/disable at any >>>> >> time with /proc/sys/kernel/nmi_watchdog us usual. However even >>>> >> when the default has been overridden /proc/sys/kernel/nmi_watchdog >>>> >> will initially show '1'. To truly turn it on one must disable/enable >>>> >> it, i.e. >>>> >> echo 0 > /proc/sys/kernel/nmi_watchdog >>>> >> echo 1 > /proc/sys/kernel/nmi_watchdog >>> > >>> > Why is it hard to make this show the right value? :) >>> > >>> > Paolo >> 'echo 1 > /proc/sys/kernel/nmi_watchdog' enables both - hard lockup and >> soft lockup detection. watchdog_enable_all_cpus() starts a 'watchdog/N' >> thread for each CPU. If the kernel runs on a bare metal system where the >> processor does not have a PMU, or when perf_event_create_kernel_counter() >> returns failure to watchdog_nmi_enable(), or when the kernel runs as a >> guest on a hypervisor that does not emulate a PMU, then the 'watchdog/N' >> threads are still active for soft lockup detection. Patch 2/3 essentially >> makes watchdog_nmi_enable() behave in the same way as if -ENOENT would >> have been returned by perf_event_create_kernel_counter(). This is then >> reported via a console message. >> >> NMI watchdog: disabled (cpu0): hardware events not enabled >> >> It's hard say what _is_ 'the right value' (because lockup detection is >> then enabled 'partially'), regardless of whether patch 2/3 is applied >> or not. > > But this means that it is not possible to re-enable softlockup detection > only. I think that should be the effect of echo 0 + echo 1, if > hardlockup detection was disabled by either the command line or patch 3. > > Paolo The idea was to give the user two options to override the effect of patch 3/3. Either via the kernel command line ('nmi_watchdog=') at boot time or via /proc ('echo 0' + 'echo 1') when the system is up and running. Regards, Uli -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Il 24/07/2014 13:44, Ulrich Obergfell ha scritto: > > But this means that it is not possible to re-enable softlockup detection > > only. I think that should be the effect of echo 0 + echo 1, if > > hardlockup detection was disabled by either the command line or patch 3. > > The idea was to give the user two options to override the effect of patch 3/3. > Either via the kernel command line ('nmi_watchdog=') at boot time or via /proc > ('echo 0' + 'echo 1') when the system is up and running. I think the kernel command line is enough; another alternative is to split the nmi_watchdog /proc entry in two. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
>----- Original Message ----- >From: "Paolo Bonzini" <pbonzini@redhat.com> >To: "Ulrich Obergfell" <uobergfe@redhat.com> >Cc: "Andrew Jones" <drjones@redhat.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, dzickus@redhat.com, akpm@linux-foundation.org, >mingo@redhat.com >Sent: Thursday, July 24, 2014 1:45:47 PM >Subject: Re: [PATCH 2/3] watchdog: control hard lockup detection default > >Il 24/07/2014 13:44, Ulrich Obergfell ha scritto: >> > But this means that it is not possible to re-enable softlockup detection >> > only. I think that should be the effect of echo 0 + echo 1, if >> > hardlockup detection was disabled by either the command line or patch 3. >> >> The idea was to give the user two options to override the effect of patch 3/3. >> Either via the kernel command line ('nmi_watchdog=') at boot time or via /proc >> ('echo 0' + 'echo 1') when the system is up and running. > > I think the kernel command line is enough; another alternative is to > split the nmi_watchdog /proc entry in two. > > Paolo The current behaviour (without the patch) already allows a user to disable NMI watchdog at boot time ('nmi_watchdog=0') and enable it explicitly when the system is up and running ('echo 0' + 'echo 1'). I think it would be more consistent with this behaviour and more intuitive if we would give the user the option to override the effect of patch 3/3 via /proc. By 'intuitive' I mean that the user says: 'I _want_ this to be enabled'. Regards, Uli -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
> ----- Original Message ----- > From: "Andrew Jones" <drjones@redhat.com> > To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org > Cc: uobergfe@redhat.com, dzickus@redhat.com, pbonzini@redhat.com, akpm@linux-foundation.org, mingo@redhat.com > Sent: Thursday, July 24, 2014 12:13:30 PM > Subject: [PATCH 2/3] watchdog: control hard lockup detection default [...] > The running kernel still has the ability to enable/disable at any > time with /proc/sys/kernel/nmi_watchdog us usual. However even > when the default has been overridden /proc/sys/kernel/nmi_watchdog > will initially show '1'. To truly turn it on one must disable/enable > it, i.e. > echo 0 > /proc/sys/kernel/nmi_watchdog > echo 1 > /proc/sys/kernel/nmi_watchdog [...] > @@ -626,15 +665,17 @@ int proc_dowatchdog(struct ctl_table *table, int write, > * disabled. The 'watchdog_running' variable check in > * watchdog_*_all_cpus() function takes care of this. > */ > - if (watchdog_user_enabled && watchdog_thresh) > + if (watchdog_user_enabled && watchdog_thresh) { > + watchdog_enable_hardlockup_detector(true); > err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh); > - else > + } else [...] I just realized a possible issue in the above part of the patch: If we would want to give the user the option to override the effect of patch 3/3 via /proc, I think proc_dowatchdog() should enable hard lockup detection _only_ in case of a state transition from 'NOT watchdog_running' to 'watchdog_running'. | if (watchdog_user_enabled && watchdog_thresh) { | need to add this if (!watchdog_running) <---------------------------' watchdog_enable_hardlockup_detector(true); err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh); } else ... The additional 'if (!watchdog_running)' would _require_ the user to perform the sequence of commands echo 0 > /proc/sys/kernel/nmi_watchdog echo 1 > /proc/sys/kernel/nmi_watchdog to enable hard lockup detection explicitly. I think changing the 'watchdog_thresh' while 'watchdog_running' is true should _not_ enable hard lockup detection as a side-effect, because a user may have a 'sysctl.conf' entry such as kernel.watchdog_thresh = ... or may only want to change the 'watchdog_thresh' on the fly. I think the following flow of execution could cause such undesired side-effect. proc_dowatchdog if (watchdog_user_enabled && watchdog_thresh) { watchdog_enable_hardlockup_detector hardlockup_detector_enabled = true watchdog_enable_all_cpus if (!watchdog_running) { ... } else if (sample_period_changed) update_timers_all_cpus for_each_online_cpu update_timers watchdog_nmi_disable ... watchdog_nmi_enable watchdog_hardlockup_detector_is_enabled return true enable perf counter for hard lockup detection Regards, Uli -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Fri, Jul 25, 2014 at 04:32:55AM -0400, Ulrich Obergfell wrote: > > ----- Original Message ----- > > From: "Andrew Jones" <drjones@redhat.com> > > To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org > > Cc: uobergfe@redhat.com, dzickus@redhat.com, pbonzini@redhat.com, akpm@linux-foundation.org, mingo@redhat.com > > Sent: Thursday, July 24, 2014 12:13:30 PM > > Subject: [PATCH 2/3] watchdog: control hard lockup detection default > > [...] > > > The running kernel still has the ability to enable/disable at any > > time with /proc/sys/kernel/nmi_watchdog us usual. However even > > when the default has been overridden /proc/sys/kernel/nmi_watchdog > > will initially show '1'. To truly turn it on one must disable/enable > > it, i.e. > > echo 0 > /proc/sys/kernel/nmi_watchdog > > echo 1 > /proc/sys/kernel/nmi_watchdog > > [...] > > > @@ -626,15 +665,17 @@ int proc_dowatchdog(struct ctl_table *table, int write, > > * disabled. The 'watchdog_running' variable check in > > * watchdog_*_all_cpus() function takes care of this. > > */ > > - if (watchdog_user_enabled && watchdog_thresh) > > + if (watchdog_user_enabled && watchdog_thresh) { > > + watchdog_enable_hardlockup_detector(true); > > err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh); > > - else > > + } else > > [...] > > > I just realized a possible issue in the above part of the patch: > > If we would want to give the user the option to override the effect of patch 3/3 > via /proc, I think proc_dowatchdog() should enable hard lockup detection _only_ > in case of a state transition from 'NOT watchdog_running' to 'watchdog_running'. > | > if (watchdog_user_enabled && watchdog_thresh) { | need to add this > if (!watchdog_running) <---------------------------' > watchdog_enable_hardlockup_detector(true); > err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh); > } else > ... > > The additional 'if (!watchdog_running)' would _require_ the user to perform the > sequence of commands > > echo 0 > /proc/sys/kernel/nmi_watchdog > echo 1 > /proc/sys/kernel/nmi_watchdog > > to enable hard lockup detection explicitly. > > I think changing the 'watchdog_thresh' while 'watchdog_running' is true should > _not_ enable hard lockup detection as a side-effect, because a user may have a > 'sysctl.conf' entry such as > > kernel.watchdog_thresh = ... > > or may only want to change the 'watchdog_thresh' on the fly. > > I think the following flow of execution could cause such undesired side-effect. > > proc_dowatchdog > if (watchdog_user_enabled && watchdog_thresh) { > > watchdog_enable_hardlockup_detector > hardlockup_detector_enabled = true > > watchdog_enable_all_cpus > if (!watchdog_running) { > ... > } else if (sample_period_changed) > update_timers_all_cpus > for_each_online_cpu > update_timers > watchdog_nmi_disable > ... > watchdog_nmi_enable > > watchdog_hardlockup_detector_is_enabled > return true > > enable perf counter for hard lockup detection > > Regards, > > Uli Nice catch. Looks like this will need a v2. Paolo, do we have a consensus on the proc echoing? Or should that be revisited in the v2 as well? drew -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Il 30/07/2014 15:43, Don Zickus ha scritto: >> > Nice catch. Looks like this will need a v2. Paolo, do we have a >> > consensus on the proc echoing? Or should that be revisited in the v2 as >> > well? > As discussed privately, how about something like this to handle that case: > (applied on top of these patches) Don, what do you think about proc? My opinion is still what I mentioned earlier in the thread, i.e. that if the file says "1", writing "0" and then "1" should not constitute a change WRT to the initial state. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, Jul 30, 2014 at 04:16:38PM +0200, Paolo Bonzini wrote: > Il 30/07/2014 15:43, Don Zickus ha scritto: > >> > Nice catch. Looks like this will need a v2. Paolo, do we have a > >> > consensus on the proc echoing? Or should that be revisited in the v2 as > >> > well? > > As discussed privately, how about something like this to handle that case: > > (applied on top of these patches) > > Don, what do you think about proc? > > My opinion is still what I mentioned earlier in the thread, i.e. that if > the file says "1", writing "0" and then "1" should not constitute a > change WRT to the initial state. > I can agree. The problem is there are two things this proc value controls, softlockup and hardlockup. I have always tried to keep the both disabled or enabled together. This patchset tries to separate them for an edge case. Hence the proc value becomes slightly confusing. I don't know the right way to solve this without introducing more proc values. We have /proc/sys/kernel/nmi_watchdog and /proc/sys/kernel/watchdog which point to the same internal variable. Do I separate them and have 'nmi_watchdog' just mean hardlockup and 'watchdog' mean softlockup? Then we can be clear on what the output is. Or does 'watchdog' represent a superset of 'nmi_watchdog' && softlockup? That is where the confusion lies. Cheers, Don -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 447775ee2c4b0..72aacf4e3d539 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -17,11 +17,20 @@ #if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR) #include <asm/nmi.h> extern void touch_nmi_watchdog(void); +extern void watchdog_enable_hardlockup_detector(bool val); +extern bool watchdog_hardlockup_detector_is_enabled(void); #else static inline void touch_nmi_watchdog(void) { touch_softlockup_watchdog(); } +static inline void watchdog_enable_hardlockup_detector(bool) +{ +} +static inline bool watchdog_hardlockup_detector_is_enabled(void) +{ + return true; +} #endif /* diff --git a/kernel/watchdog.c b/kernel/watchdog.c index c985a21926545..34eca29e28a4c 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -63,6 +63,25 @@ static unsigned long soft_lockup_nmi_warn; static int hardlockup_panic = CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE; +static bool hardlockup_detector_enabled = true; +/* + * We may not want to enable hard lockup detection by default in all cases, + * for example when running the kernel as a guest on a hypervisor. In these + * cases this function can be called to disable hard lockup detection. This + * function should only be executed once by the boot processor before the + * kernel command line parameters are parsed, because otherwise it is not + * possible to override this in hardlockup_panic_setup(). + */ +void watchdog_enable_hardlockup_detector(bool val) +{ + hardlockup_detector_enabled = val; +} + +bool watchdog_hardlockup_detector_is_enabled(void) +{ + return hardlockup_detector_enabled; +} + static int __init hardlockup_panic_setup(char *str) { if (!strncmp(str, "panic", 5)) @@ -71,6 +90,14 @@ static int __init hardlockup_panic_setup(char *str) hardlockup_panic = 0; else if (!strncmp(str, "0", 1)) watchdog_user_enabled = 0; + else if (!strncmp(str, "1", 1) || !strncmp(str, "2", 1)) { + /* + * Setting 'nmi_watchdog=1' or 'nmi_watchdog=2' (legacy option) + * has the same effect. + */ + watchdog_user_enabled = 1; + watchdog_enable_hardlockup_detector(true); + } return 1; } __setup("nmi_watchdog=", hardlockup_panic_setup); @@ -451,6 +478,15 @@ static int watchdog_nmi_enable(unsigned int cpu) struct perf_event_attr *wd_attr; struct perf_event *event = per_cpu(watchdog_ev, cpu); + /* + * Some kernels need to default hard lockup detection to + * 'disabled', for example a guest on a hypervisor. + */ + if (!watchdog_hardlockup_detector_is_enabled()) { + event = ERR_PTR(-ENOENT); + goto handle_err; + } + /* is it already setup and enabled? */ if (event && event->state > PERF_EVENT_STATE_OFF) goto out; @@ -465,6 +501,7 @@ static int watchdog_nmi_enable(unsigned int cpu) /* Try to register using hardware perf events */ event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL); +handle_err: /* save cpu0 error for future comparision */ if (cpu == 0 && IS_ERR(event)) cpu0_err = PTR_ERR(event); @@ -610,11 +647,13 @@ int proc_dowatchdog(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos) { int err, old_thresh, old_enabled; + bool old_hardlockup; static DEFINE_MUTEX(watchdog_proc_mutex); mutex_lock(&watchdog_proc_mutex); old_thresh = ACCESS_ONCE(watchdog_thresh); old_enabled = ACCESS_ONCE(watchdog_user_enabled); + old_hardlockup = watchdog_hardlockup_detector_is_enabled(); err = proc_dointvec_minmax(table, write, buffer, lenp, ppos); if (err || !write) @@ -626,15 +665,17 @@ int proc_dowatchdog(struct ctl_table *table, int write, * disabled. The 'watchdog_running' variable check in * watchdog_*_all_cpus() function takes care of this. */ - if (watchdog_user_enabled && watchdog_thresh) + if (watchdog_user_enabled && watchdog_thresh) { + watchdog_enable_hardlockup_detector(true); err = watchdog_enable_all_cpus(old_thresh != watchdog_thresh); - else + } else watchdog_disable_all_cpus(); /* Restore old values on failure */ if (err) { watchdog_thresh = old_thresh; watchdog_user_enabled = old_enabled; + watchdog_enable_hardlockup_detector(old_hardlockup); } out: mutex_unlock(&watchdog_proc_mutex);